protein - protein search, using sw model



January 22, 2001, 11:50:43 ; Search time 233.01 Seconds 
(without alignments) 
223.791 Million cell updates/sec 
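The reported rate can be cross-checked against the other figures in this header. The following is a minimal sketch, assuming that one "cell update" is one query residue compared with one database residue; the query length (1525) and the number of searched residues (34193795) are taken from this report, and the function name is only illustrative.

```python
def cell_updates_per_second(query_length, db_residues, seconds):
    """Dynamic-programming cells filled per second for one database search."""
    # Assumption: every query residue is compared once with every database residue.
    total_cells = query_length * db_residues
    return total_cells / seconds


# Figures from this report: 1525-residue query, 34193795 residues, 233.01 s.
rate_millions = cell_updates_per_second(1525, 34_193_795, 233.01) / 1e6
print(f"{rate_millions:.3f} Million cell updates/sec")
```

Under that assumption the three header figures agree with one another.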

US-09-540-245A-2
8316

1 MRGVGWQMLSLSLGLVLAIL SSFVDEVEKVVKCGCTRCVS 1525



Scoring table: BLOSUM62
Gapop 10, Gapext 0.5

Searched: 268485 seqs, 34193795 residues

Total number of hits satisfying chosen parameters: 268485



Minimum DB seq length: 0 
Maximum DB seq length: 2000000000 

Post-processing: Minimum Match 0% 
Maximum Match 100% 
Listing first 45 summaries 
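The "sw model" and the gap parameters listed above describe a Smith-Waterman local alignment with affine gap penalties. The sketch below is a generic illustration of that technique, not the search engine's own code. It uses a toy match/mismatch score in place of BLOSUM62, and it assumes that a gap of k positions costs Gapop + k x Gapext; both are assumptions made for this example.

```python
def smith_waterman_affine(a, b, score, gap_open=10.0, gap_extend=0.5):
    """Best local alignment score of a and b with affine gaps (Gotoh recurrences)."""
    neg_inf = float("-inf")
    cols = len(b) + 1
    h_prev = [0.0] * cols        # best score of an alignment ending at (i-1, j)
    f_prev = [neg_inf] * cols    # same, but ending in a gap that skips residues of a
    best = 0.0
    for i in range(1, len(a) + 1):
        h_cur = [0.0] * cols
        f_cur = [neg_inf] * cols
        e = neg_inf              # alignment ending in a gap that skips residues of b
        for j in range(1, cols):
            # Assumed convention: the first gap position costs open + extend.
            e = max(h_cur[j - 1] - gap_open - gap_extend, e - gap_extend)
            f_cur[j] = max(h_prev[j] - gap_open - gap_extend,
                           f_prev[j] - gap_extend)
            diag = h_prev[j - 1] + score(a[i - 1], b[j - 1])
            h_cur[j] = max(0.0, diag, e, f_cur[j])   # 0.0 makes the alignment local
            best = max(best, h_cur[j])
        h_prev, f_prev = h_cur, f_cur
    return best


def toy_score(x, y):
    """Hypothetical stand-in for a substitution table such as BLOSUM62."""
    return 5 if x == y else -4


print(smith_waterman_affine("A" * 10 + "C" * 10, "A" * 10 + "GG" + "C" * 10, toy_score))
```

In the usage line, twenty matches (100) minus one gap of two positions (10 + 2 x 0.5) gives 89. A real search would pass a lookup into the substitution table as `score`.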

Database : A_Geneseq_36:*
1: /SIDS1/gcgdata/geneseq/geneseqp/AA1980.DAT
2: /SIDS1/gcgdata/geneseq/geneseqp/AA1981.DAT
3: /SIDS1/gcgdata/geneseq/geneseqp/AA1982.DAT
4: /SIDS1/gcgdata/geneseq/geneseqp/AA1983.DAT
5: /SIDS1/gcgdata/geneseq/geneseqp/AA1984.DAT
6: /SIDS1/gcgdata/geneseq/geneseqp/AA1985.DAT
7: /SIDS1/gcgdata/geneseq/geneseqp/AA1986.DAT
8: /SIDS1/gcgdata/geneseq/geneseqp/AA1987.DAT
9:
10: /SIDS1/gcgdata/geneseq/geneseqp/AA1989.DAT:*
11: /SIDS1/gcgdata/geneseq/geneseqp/AA1990.DAT:*
12: /SIDS1/gcgdata/geneseq/geneseqp/AA1991.DAT:*
13: /SIDS1/gcgdata/geneseq/geneseqp/AA1992.DAT:*
14: /SIDS1/gcgdata/geneseq/geneseqp/AA1993.DAT:*
15: /SIDS1/gcgdata/geneseq/geneseqp/AA1994.DAT:*
16: /SIDS1/gcgdata/geneseq/geneseqp/AA1995.DAT:*
17: /SIDS1/gcgdata/geneseq/geneseqp/AA1996.DAT:*
18: /SIDS1/gcgdata/geneseq/geneseqp/AA1997.DAT:*
19: /SIDS1/gcgdata/geneseq/geneseqp/AA1998.DAT:*
20: /SIDS1/gcgdata/geneseq/geneseqp/AA1999.DAT:*
21: /SIDS1/gcgdata/geneseq/geneseqp/AA2000.DAT:*



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
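Pred. No. depends on the full score distribution and cannot be recomputed from this listing. The % Query Match column of the summaries, by contrast, appears to be each hit's score expressed as a percentage of the query's self-score (8316, the score of the 100% hit). A small sketch under that assumption, with an illustrative function name:

```python
PERFECT_SCORE = 8316  # score of the query aligned against itself, from this report


def query_match_percent(score, perfect_score=PERFECT_SCORE):
    """Hit score as a percentage of the self-score, to one decimal place."""
    return round(100.0 * score / perfect_score, 1)


# Scores taken from the summaries in this report.
for hit_score in (8316, 8265, 8137, 5717.5, 3443, 759):
    print(hit_score, query_match_percent(hit_score))
```

The design is deliberately trivial: if a listed percentage disagrees with this ratio, the score or the percentage was probably misread.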



Result             % Query
   No.    Score    Match  Length  DB  ID       Description
     1     8316    100.0    1525  20  Y17499   Human Slit-1 prote
     2     8265     99.4    1529  20  Y27145   Human slit-2 prote
     3     8265     99.4    1529  20  W96702   Full length slit-l
     4     8137     97.8    1503  20  Y27142   Human slit-2 matur
     5     8137     97.8    1503  20  W96701   Slit-like protein
     6     8065     97.0    1529  21  Y76117   Rat Slit homologue
     7   5717.5     68.8    1523  21  Y99395   Human PRO1336 (UNQ
     8   5714.5     68.7    1523  20  Y27146   Human slit-3 prote
     9   5714.5     68.7    1523  20  Y04137   Human slit 3 prote
    10   5702.5     68.6    1496  20  Y27143   Human slit-3 matur
    11   5702.5     68.6    1496  20  Y04136   Human slit 3 matur
    12   5611.5     67.5    1523  20  Y14142   Human Slit protein
    13     5597     67.3    1534  19  W46966   Amino acid sequenc
    14     5589     67.2    1534  20  Y27144   Human slit-1 prote
    15     5589     67.2    1534  20  Y04139   Human slit 1 prote
    16     5589     67.2    1534  20  W96707   Protein sequence o
    17     5588     67.2    1508  20  Y27141   Human slit-1 matur
    18     5588     67.2    1508  20  Y04138   Human slit 1 matur
    19     5588     67.2    1508  20  W96706   Protein sequence o
    20   3543.5     42.6     716  21  Y76005   Rat Slit homologue
    21     3443     41.4    1480  13  R25079   Drosophila SLIT pr
    22     1668     20.1     299  20  Y27149   Protien encoded by
    23     1668     20.1     299  20  W96703   EST clone protein
    24   1429.5     17.2     450  20  Y27156   Peptide Seq ID No:
    25      766      9.2    1010  20  W87896   Human JAGGED1 solu
    26      766      9.2    1187  18  W18352   Proliferation and
    27      766      9.2    1218  17  W05833   Human Serrate-1 (H
    28      766      9.2    1218  19  W44301   Human serrate 1,
    29      766      9.2    1218  20  W87894   Human JAGGED1 prot
    30      766      9.2    1218  21  Y59597   Human Serrate prot
    31      761      9.2    1036  18  W18351   Proliferation and
    32      761      9.2    1218  18  W18354   Proliferation and
    33      759      9.1    1208  19  W40827   Human Jagged prote
    34      755      9.1     228  19  W46967   Amino acid sequenc
    35      755      9.1     228  20  Y27147   Mouse slit protein
    36    737.5      8.9    2471  20  Y06816   Human Notch2 (humN
    37    736.5      8.9    1193  17  W05835   Chick Serrate, Ga
    38    736.5      8.9    1193  21  Y59599   Chick Serrate prot
    39    719.5      8.7    2321  19  W49698   Human Notch3 prote
    40      697      8.4    1872  19  W68510   Partial human Notc
    41    691.5      8.3    1964  20  W95557   Mus musculus notch
    42      685      8.2    1148  20  W87895   Human JAGGED2 prot
    43    680.5      8.2    1055  19  W44298   Human serrate 2 pr
    44    680.5      8.2    1212  19  W44299   Human serrate 2.
    45    680.5      8.2    1257  17  W05834   Human Serrate-2 (H



RESULT 1 
Y17499 

ID Y17499 standard; Protein; 1525 AA. 
XX 

AC Y17499; 
XX 

DT 04-AUG-1999 (first entry)
XX 

DE Human Slit-1 protein. 
XX 

KW Human; Slit-1; Robo; modulation; identification; interaction.
XX 

OS Homo sapiens. 

XX 

PN WO9925831-A2.
XX
PD 27-MAY-1999.
XX
PF 13-NOV-1998; 98WO-US24245.
XX
PR 07-APR-1998; 98US-0081057.
PR 14-NOV-1997; 97US-0065544.
XX
PI Brose K, Goodman C, Kid T, Tessier-Lavigne M;
XX
DR WPI; 1999-347475/29.
DR N-PSDB; X76161.
XX
PT Human Slit polypeptide and related nucleic acids
XX
PS Disclosure; Page 19-21; 34pp; English.
XX



CC The present sequence is the human slit-1 protein. The present invention
CC also describes a method for identifying agents which modulate the
CC interaction of Robo and a Robo ligand comprising: combining a Robo
CC polypeptide, a Slit polypeptide and a candidate agent under conditions
CC where the Robo and Slit polypeptides normally (but for the presence of
CC the agent) engage in a first interaction, where the Slit polypeptide
CC specifically binds, activates or inhibits the activation of the Robo
CC polypeptide and determining a second interaction of the Robo and Slit
CC polypeptides in the presence of the agent, where a difference between
CC the first and second interactions indicates that the agent modulates the
CC interaction of the Robo and Slit polypeptides; and a method to modulate
CC the interaction of Robo and a Robo ligand. The method is useful for
CC screening for Robo (roundabout) modulators and Robo:Slit complexes are
CC useful for regulating various cell functions, especially of neuronal
CC cells.
XX
SQ Sequence 1525 AA;

Query Match 100.0%; Score 8316; DB 20; Length 1525;
Best Local Similarity 100.0%; Pred. No. 0;
Matches 1525; Conservative 0; Mismatches 0; Indels 0; Gaps 0;


Qy 1 MRGVGWQMLSLSLGLVLAILNKVAPQACPAQCSCSGSTVDCHGLALRSVPRNIPRNTERL 60
Db 1 mrgvgwqmlslslglvlailnkvapqacpaqcscsgstvdchglalrsvprniprnterl 60

Qy 61 DLNGNNITRITKTDFAGLRHLRVLQLMENKISTIERGAFQDLKELERLRLNRNHLQLFPE 120
Db 61 dlngnnitritktdfaglrhlrvlqlmenkistiergafqdlkelerlrlnrnhlqlfpe 120

Qy 121 LLFLGTAKLYRLDLSENQIQAIPRKAFRGAVDIKNLQLDYNQISCIEDGAFRALRDLEVL 180
Db 121 llflgtaklyrldlsenqiqaiprkafrgavdiknlqldynqisciedgafralrdlevl 180

Qy 181 TLNNNNITRLSVASFNHMPKLRTFRLHSNNLYCDCHLAWLSDWLRKRPRVGLYTQCMGPS 240
Db 181 tlnnnnitrlsvasfnhmpklrtfrlhsnnlycdchlawlsdwlrkrprvglytqcmgps 240

Qy 241 HLRGHNVAEVQKREFVCSDEEEGHQSFMAPSCSVLHCPAACTCSNNIVDCRGKGLTEIPT 300
Db 241 hlrghnvaevqkrefvcsdeeeghqsfmapscsvlhcpaactcsnnivdcrgkglteipt 300

Qy 301 NLPETITEIRLEQNTIKVIPPGAFSPYKKLRRIDLSNNQISELAPDAFQGLRSLNSLVLY 360
Db 301 nlpetiteirleqntikvippgafspykklrridlsnnqiselapdafqglrslnslvly 360

Qy 361 GNKITELPKSLFEGLFSLQLLLLNANKINCLRVDAFQDLHNLNLLSLYDNKLQTIAKGTF 420
Db 361 gnkitelpkslfeglfslqllllnankinclrvdafqdlhnlnllslydnklqtiakgtf 420

Qy 421 SPLRAIQTMHLAQNPFICDCHLKWLADYLHTNPIETSGARCTSPRRLANKRIGQIKSKKF 480
Db 421 splraiqtmhlaqnpficdchlkwladylhtnpietsgarctsprrlankrigqikskkf 480

Qy 481 RCSGTEDYRSKLSGDCFADLACPEKCRCEGTTVDCSNQKLNKIPEHIPQYTAELRLNNNE 540
Db 481 rcsgtedyrsklsgdcfadlacpekcrcegttvdcsnqklnkipehipqytaelrlnnne 540

Qy 541 FTVLEATGIFKKLPQLRKINFSNNKITDIEEGAFEGASGVNEILLTSNRLENVQHKMFKG 600
Db 541 ftvleatgifkklpqlrkinfsnnkitdieegafegasgvneilltsnrlenvqhkmfkg 600

Qy 601 LESLKTLMLRSNRITCVGNDSFIGLSSVRLLSLYDNQITTVAPGAFDTLHSLSTLNLLAN 660
Db 601 leslktlmlrsnritcvgndsfiglssvrllslydnqittvapgafdtlhslstlnllan 660

Qy 661 PFNCNCYLAWLGEWLRKKRIVTGNPRCQKPYFLKEIPIQDVAIQDFTCDDGNDDNSCSPL 720
Db 661 pfncncylawlgewlrkkrivtgnprcqkpyflkeipiqdvaiqdftcddgnddnscspl 720

Qy 721 SRCPTECTCLDTVVRCSNKGLKVLPKGIPRDVTELYLDGNQFTLVPKELSNYKHLTLIDL 780
Db 721 srcptectcldtvvrcsnkglkvlpkgiprdvtelyldgnqftlvpkelsnykhltlidl 780

Qy 781 SNNRISTLSNQSFSNMTQLLTLILSYNRLRCIPPRTFDGLKSLRLLSLHGNDISVVPEGA 840
Db 781 snnristlsnqsfsnmtqlltlilsynrlrcipprtfdglkslrllslhgndisvvpega 840

Qy 841 FNDLSALSHLAIGANPLYCDCNMQWLSDWVKSEYKEPGIARCAGPGEMADKLLLTTPSKK 900
Db 841 fndlsalshlaiganplycdcnmqwlsdwvkseykepgiarcagpgemadklllttpskk 900

Qy 901 FTCQGPVDVNILAKCNPCLSNPCKNDGTCNSDPVDFYRCTCPYGFKGQDCDVPIHACISN 960
Db 901 ftcqgpvdvnilakcnpclsnpckndgtcnsdpvdfyrctcpygfkgqdcdvpihacisn 960

Qy 961 PCKHGGTCHLKEGEEDGFWCICADGFEGENCEVNVDDCEDNDCENNSTCVDGINNYTCLC 1020
Db 961 pckhggtchlkegeedgfwcicadgfegencevnvddcedndcennstcvdginnytclc 1020

Qy 1021 PPEYTGELCEEKLDFCAQDLNPCQHDSKCILTPKGFKCDCTPGYVGEHCDIDFDDCQDNK 1080
Db 1021 ppeytgelceekldfcaqdlnpcqhdskciltpkgfkcdctpgyvgehcdidfddcqdnk 1080

Qy 1081 CKNGAHCTDAVNGYTCICPEGYSGLFCEFSPPMVLPRTSPCDNFDCQNGAQCIVRINEPI 1140
Db 1081 ckngahctdavngytcicpegysglfcefsppmvlprtspcdnfdcqngaqcivrinepi 1140

Qy 1141 CQCLPGYQGEKCEKLVSVNFINKESYLQIPSAKVRPQTNITLQIATDEDSGILLYKGDKD 1200
Db 1141 cqclpgyqgekceklvsvnfinkesylqipsakvrpqtnitlqiatdedsgillykgdkd 1200

Qy 1201 HIAVELYRGRVRASYDTGSHPASAIYSVETINDGNFHIVELLALDQSLSLSVDGGNPKII 1260
Db 1201 hiavelyrgrvrasydtgshpasaiysvetindgnfhivellaldqslslsvdggnpkii 1260

Qy 1261 TNLSKQSTLNFDSPLYVGGMPGKSNVASLRQAPGQNGTSFHGCIRNLYINSELQDFQKVP 1320
Db 1261 tnlskqstlnfdsplyvggmpgksnvaslrqapgqngtsfhgcirnlyinselqdfqkvp 1320

Qy 1321 MQTGILPGCEPCHKKVCAHGTCQPSSQAGFTCECQEGWMGPLCDQRTNDPCLGNKCVHGT 1380
Db 1321 mqtgilpgcepchkkvcahgtcqpssqagftcecqegwmgplcdqrtndpclgnkcvhgt 1380

Qy 1381 CLPINAFSYSCKCLEGHGGVLCDEEEDLFNPCQAIKCKHGKCRLSGLGQPYCECSSGYTG 1440
Db 1381 clpinafsysckcleghggvlcdeeedlfnpcqaikckhgkcrlsglgqpycecssgytg 1440

Qy 1441 DSCDREISCRGERIRDYYQKQQGYAACQTTKKVSRLECRGGCAGGQCCGPLRSKRRKYSF 1500
Db 1441 dscdreiscrgerirdyyqkqqgyaacqttkkvsrlecrggcaggqccgplrskrrkysf 1500

Qy 1501 ECTDGSSFVDEVEKVVKCGCTRCVS 1525
Db 1501 ectdgssfvdevekvvkcgctrcvs 1525





RESULT 2 
Y27145 

ID Y27145 standard; protein; 1529 AA. 
XX 

AC Y27145; 
XX 

DT 15-SEP-1999 (first entry) 
XX 

DE Human slit-2 protein (Seq ID No: 12 of JP11164690).
XX
KW Vertebrate-derived protein; slit protein; diagnosis; cancer; nerve;
KW muscle; endocrine system.
XX
OS Homo sapiens.
XX
PN JP11164690-A.
XX
PD 22-JUN-1999.
XX
PF 05-DEC-1997; 97JP-0335435.
XX
PR 05-DEC-1997; 97JP-0335435.
XX
PA (ASAH ) ASAHI KASEI KOGYO KK.
XX 

DR WPI; 1999-411830/35. 

DR N-PSDB; X89162. 

XX
PT New vertebrate slit protein - useful for diagnosis and treatment of
PT cancers in nerves, muscle and endocrine system
XX
PS Disclosure; Page 58-63; 102pp; Japanese.
XX
CC The invention relates to a vertebrate-derived protein containing an amino
CC acid sequence shown in Y27137 and Y27139. The vertebrate-derived protein
CC has at least 55 % homology to one of sequences shown in Y27141-Y27143,
CC and has slit protein-like activity. The vertebrate slit proteins encoding
CC nucleic acid sequences have at least 60% homology to nucleic acid
CC sequences X89161-163. The vertebrate-derived proteins can be produced
CC recombinantly by transforming host cells with expression vectors
CC comprising the encoding nucleic acids. The proteins of the invention are
CC for diagnosing and treating cancer of the nerves, muscle and/or endocrine
CC system.
XX
SQ Sequence 1529 AA;

Query Match 99.4%; Score 8265; DB 20; Length 1529;

Best Local Similarity 99.2%; Pred. No. 0; 

Matches 1520; Conservative 1; Mismatches 0; Indels 12; Gaps 2; 
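The Best Local Similarity figure appears to be the number of identical positions divided by the total number of alignment columns (matches, conservative substitutions, mismatches and indel positions). That reading is an inference from the statistics printed in this report rather than a documented definition; a minimal sketch with an illustrative function name:

```python
def best_local_similarity(matches, conservative, mismatches, indels):
    """Identical columns as a percentage of all alignment columns (assumed definition)."""
    columns = matches + conservative + mismatches + indels
    return round(100.0 * matches / columns, 1)


# Statistics as printed for two hits in this report.
print(best_local_similarity(1520, 1, 0, 12))  # hit with Length 1529
print(best_local_similarity(1494, 1, 0, 12))  # hit with Length 1503
```

The same counts also account for the sequence lengths: 1520 + 1 aligned positions plus 4 unmatched query residues give the 1525-residue query, and plus 8 unmatched database residues give the 1529-residue entry.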

Qy 1 MRGVGWQMLSLSLGLVLAILNKVAPQACPAQCSCSGSTVDCHGLALRSVPRNIPRNTERL 60

Db 1 mrgvgwqmlslslglvlailnkvapqacpaqcscsgstvdchglalrsvprniprnterl 60

Qy 61 DLNGNNITRITKTDFAGLRHLRVLQLMENKISTIERGAFQDLKELERLRLNRNHLQLFPE 120

Db 61 dlngnnitritktdfaglrhlrvlqlmenkistiergafqdlkelerlrlnrnhlqlfpe 120

Qy 121 LLFLGTAKLYRLDLSENQIQAIPRKAFRGAVDIKNLQLDYNQISCIEDGAFRALRDLEVL 180

Db 121 llflgtaklyrldlsenqiqaiprkafrgavdiknlqldynqisciedgafralrdlevl 180

Qy 181 TLNNNNITRLSVASFNHMPKLRTFRLHSNNLYCDCHLAWLSDWLRKRPRVGLYTQCMGPS 240

Db 181 tlnnnnitrlsvasfnhmpklrtfrlhsnnlycdchlawlsdwlrqrprvglytqcmgps 240

Qy 241 HLRGHNVAEVQKREFVCSDEEEGHQSFMAPSCSVLHCPAACTCSNNIVDCRGKGLTEIPT 300

Db 241 hlrghnvaevqkrefvcs----ghqsfmapscsvlhcpaactcsnnivdcrgkglteipt 296

Qy 301 NLPETITEIRLEQNTIKVIPPGAFSPYKKLRRIDLSNNQISELAPDAFQGLRSLNSLVLY 360

Db 297 nlpetiteirleqntikvippgafspykklrridlsnnqiselapdafqglrslnslvly 356

Qy 361 GNKITELPKSLFEGLFSLQLLLLNANKINCLRVDAFQDLHNLNLLSLYDNKLQTIAKGTF 420

Db 357 gnkitelpkslfeglfslqllllnankinclrvdafqdlhnlnllslydnklqtiakgtf 416

Qy 421 SPLRAIQTMHLAQNPFICDCHLKWLADYLHTNPIETSGARCTSPRRLANKRIGQIKSKKF 480

Db 417 splraiqtmhlaqnpficdchlkwladylhtnpietsgarctsprrlankrigqikskkf 476

Qy 481 RCS--------GTEDYRSKLSGDCFADLACPEKCRCEGTTVDCSNQKLNKIPEHIPQYTA 532

Db 477 rcsakeqyfipgtedyrsklsgdcfadlacpekcrcegttvdcsnqklnkipehipqyta 536

Qy 533 ELRLNNNEFTVLEATGIFKKLPQLRKINFSNNKITDIEEGAFEGASGVNEILLTSNRLEN 592

Db 537 elrlnnneftvleatgifkklpqlrkinfsnnkitdieegafegasgvneilltsnrlen 596

Qy 593 VQHKMFKGLESLKTLMLRSNRITCVGNDSFIGLSSVRLLSLYDNQITTVAPGAFDTLHSL 652

Db 597 vqhkmfkgleslktlmlrsnritcvgndsfiglssvrllslydnqittvapgafdtlhsl 656

Qy 653 STLNLLANPFNCNCYLAWLGEWLRKKRIVTGNPRCQKPYFLKEIPIQDVAIQDFTCDDGN 712

Db 657 stlnllanpfncncylawlgewlrkkrivtgnprcqkpyflkeipiqdvaiqdftcddgn 716

Qy 713 DDNSCSPLSRCPTECTCLDTVVRCSNKGLKVLPKGIPRDVTELYLDGNQFTLVPKELSNY 772

Db 717 ddnscsplsrcptectcldtvvrcsnkglkvlpkgiprdvtelyldgnqftlvpkelsny 776

Qy 773 KHLTLIDLSNNRISTLSNQSFSNMTQLLTLILSYNRLRCIPPRTFDGLKSLRLLSLHGND 832

Db 777 khltlidlsnnristlsnqsfsnmtqlltlilsynrlrcipprtfdglkslrllslhgnd 836

Qy 833 ISVVPEGAFNDLSALSHLAIGANPLYCDCNMQWLSDWVKSEYKEPGIARCAGPGEMADKL 892

Db 837 isvvpegafndlsalshlaiganplycdcnmqwlsdwvkseykepgiarcagpgemadkl 896

Qy 893 LLTTPSKKFTCQGPVDVNILAKCNPCLSNPCKNDGTCNSDPVDFYRCTCPYGFKGQDCDV 952

Db 897 llttpskkftcqgpvdvnilakcnpclsnpckndgtcnsdpvdfyrctcpygfkgqdcdv 956

Qy 953 PIHACISNPCKHGGTCHLKEGEEDGFWCICADGFEGENCEVNVDDCEDNDCENNSTCVDG 1012

Db 957 pihacisnpckhggtchlkegeedgfwcicadgfegencevnvddcedndcennstcvdg 1016

Qy 1013 INNYTCLCPPEYTGELCEEKLDFCAQDLNPCQHDSKCILTPKGFKCDCTPGYVGEHCDID 1072

Db 1017 innytclcppeytgelceekldfcaqdlnpcqhdskciltpkgfkcdctpgyvgehcdid 1076

Qy 1073 FDDCQDNKCKNGAHCTDAVNGYTCICPEGYSGLFCEFSPPMVLPRTSPCDNFDCQNGAQC 1132

Db 1077 fddcqdnkckngahctdavngytcicpegysglfcefsppmvlprtspcdnfdcqngaqc 1136

Qy 1133 IVRINEPICQCLPGYQGEKCEKLVSVNFINKESYLQIPSAKVRPQTNITLQIATDEDSGI 1192

Db 1137 ivrinepicqclpgyqgekceklvsvnfinkesylqipsakvrpqtnitlqiatdedsgi 1196

Qy 1193 LLYKGDKDHIAVELYRGRVRASYDTGSHPASAIYSVETINDGNFHIVELLALDQSLSLSV 1252

Db 1197 llykgdkdhiavelyrgrvrasydtgshpasaiysvetindgnfhivellaldqslslsv 1256

Qy 1253 DGGNPKIITNLSKQSTLNFDSPLYVGGMPGKSNVASLRQAPGQNGTSFHGCIRNLYINSE 1312

Db 1257 dggnpkiitnlskqstlnfdsplyvggmpgksnvaslrqapgqngtsfhgcirnlyinse 1316

Qy 1313 LQDFQKVPMQTGILPGCEPCHKKVCAHGTCQPSSQAGFTCECQEGWMGPLCDQRTNDPCL 1372

Db 1317 lqdfqkvpmqtgilpgcepchkkvcahgtcqpssqagftcecqegwmgplcdqrtndpcl 1376

Qy 1373 GNKCVHGTCLPINAFSYSCKCLEGHGGVLCDEEEDLFNPCQAIKCKHGKCRLSGLGQPYC 1432

Db 1377 gnkcvhgtclpinafsysckcleghggvlcdeeedlfnpcqaikckhgkcrlsglgqpyc 1436

Qy 1433 ECSSGYTGDSCDREISCRGERIRDYYQKQQGYAACQTTKKVSRLECRGGCAGGQCCGPLR 1492

Db 1437 ecssgytgdscdreiscrgerirdyyqkqqgyaacqttkkvsrlecrggcaggqccgplr 1496

Qy 1493 SKRRKYSFECTDGSSFVDEVEKVVKCGCTRCVS 1525

Db 1497 skrrkysfectdgssfvdevekvvkcgctrcvs 1529 
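The score of this hit can be reconciled with the perfect score of 8316. According to the statistics for this result (one conservative substitution, twelve indel positions in two gaps) and the alignment, the database sequence differs from the query by a single K to Q substitution, four unmatched query residues (DEEE) and an eight-residue insertion. The sketch below assumes standard BLOSUM62 entries for the residues involved and a gap cost of Gapop + k x Gapext for a gap of k positions; that convention is an assumption, chosen because it reproduces the reported score.

```python
# Standard BLOSUM62 entries needed for this reconciliation.
BLOSUM62 = {("D", "D"): 6, ("E", "E"): 5, ("K", "K"): 5, ("K", "Q"): 1}


def gap_cost(length, gap_open=10.0, gap_extend=0.5):
    """Assumed convention: a run of `length` gap positions costs open + length * extend."""
    return gap_open + length * gap_extend


perfect_score = 8316                                           # query against itself
unmatched_query = sum(BLOSUM62[(r, r)] for r in "DEEE")        # self-score never earned
substitution_loss = BLOSUM62[("K", "K")] - BLOSUM62[("K", "Q")]
gap_penalties = gap_cost(4) + gap_cost(8)                      # the two gaps

predicted = perfect_score - unmatched_query - substitution_loss - gap_penalties
print(predicted)
```

The terms are 21, 4 and 26, so the predicted score is 8316 - 51 = 8265, the value printed for this hit.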



RESULT 3 
W96702 

ID W96702 standard; protein; 1529 AA. 
XX
AC W96702;
XX
DT 15-APR-1999 (first entry)
XX
DE Full length slit-like protein sequence.
XX
KW Slit-like polypeptide; diagnosis; treatment; nervous disease;
KW thyroid disease; adrenal disease; muscular disease.
XX
OS Homo sapiens.
XX
FH Key Location/Qualifiers
FT Peptide 1..26
FT /note= "signal peptide"
FT 27..1529
FT /note= "mature protein, claimed in Claim 1"
XX
PN JP11018777-A.
XX
PD 26-JAN-1999.
XX
PF 09-JUL-1997; 97JP-0183683.
XX
PR 09-JUL-1997; 97JP-0183683.
XX
PA (ASAH ) ASAHI KASEI KOGYO KK.
XX
DR WPI; 1999-161084/14.
DR N-PSDB; X14979.
XX
PT New slit-like polypeptide - useful for diagnosis and treatment of
PT nervous, thyroid, adrenal and muscular diseases
XX
PS Example 5; Page 27-31; 50pp; Japanese.
XX
CC The present sequence represents the full length sequence of a
CC slit-like polypeptide. The polypeptide is useful for the
CC diagnosis and the treatment of all nervous diseases, thyroid
CC diseases, adrenal diseases and muscular diseases.
XX
SQ Sequence 1529 AA;

Query Match 99.4%; Score 8265; DB 20; Length 1529;
Best Local Similarity 99.2%; Pred. No.
Matches


Qy 1
Db 1

Qy 61
Db 61

Qy 121
Db 121

Qy 181
Db 181

Qy 241
Db 241

Qy 301
Db 297

Qy 361
Db 357

Qy 421
Db 417

Qy 481 RCS--------GTEDYRSKLSGDCFADLACPEKCRCEGTTVDCSNQKLNKIPEHIPQYTA 532
Db 477 rcsakeqyfipgtedyrsklsgdcfadlacpekcrcegttvdcsnqklnkipehipqyta 536

Qy 533 ELRLNNNEFTVLEATGIFKKLPQLRKINFSNNKITDIEEGAFEGASGVNEILLTSNRLEN 592
Db 537 elrlnnneftvleatgifkklpqlrkinfsnnkitdieegafegasgvneilltsnrlen 596

Qy 593 VQHKMFKGLESLKTLMLRSNRITCVGNDSFIGLSSVRLLSLYDNQITTVAPGAFDTLHSL 652
Db 597 vqhkmfkgleslktlmlrsnritcvgndsfiglssvrllslydnqittvapgafdtlhsl 656

Qy 653 STLNLLANPFNCNCYLAWLGEWLRKKRIVTGNPRCQKPYFLKEIPIQDVAIQDFTCDDGN 712
Db 657 stlnllanpfncncylawlgewlrkkrivtgnprcqkpyflkeipiqdvaiqdftcddgn 716

Qy 713 DDNSCSPLSRCPTECTCLDTVVRCSNKGLKVLPKGIPRDVTELYLDGNQFTLVPKELSNY 772
Db 717 ddnscsplsrcptectcldtvvrcsnkglkvlpkgiprdvtelyldgnqftlvpkelsny 776

Qy 773 KHLTLIDLSNNRISTLSNQSFSNMTQLLTLILSYNRLRCIPPRTFDGLKSLRLLSLHGND 832
Db 777 khltlidlsnnristlsnqsfsnmtqlltlilsynrlrcipprtfdglkslrllslhgnd 836

Qy 833 ISVVPEGAFNDLSALSHLAIGANPLYCDCNMQWLSDWVKSEYKEPGIARCAGPGEMADKL 892
Db 837 isvvpegafndlsalshlaiganplycdcnmqwlsdwvkseykepgiarcagpgemadkl 896

Qy 893 LLTTPSKKFTCQGPVDVNILAKCNPCLSNPCKNDGTCNSDPVDFYRCTCPYGFKGQDCDV 952
Db 897 llttpskkftcqgpvdvnilakcnpclsnpckndgtcnsdpvdfyrctcpygfkgqdcdv 956

Qy 953 PIHACISNPCKHGGTCHLKEGEEDGFWCICADGFEGENCEVNVDDCEDNDCENNSTCVDG 1012
Db 957 pihacisnpckhggtchlkegeedgfwcicadgfegencevnvddcedndcennstcvdg 1016

Qy 1013 INNYTCLCPPEYTGELCEEKLDFCAQDLNPCQHDSKCILTPKGFKCDCTPGYVGEHCDID 1072
Db 1017 innytclcppeytgelceekldfcaqdlnpcqhdskciltpkgfkcdctpgyvgehcdid 1076

Qy 1073 FDDCQDNKCKNGAHCTDAVNGYTCICPEGYSGLFCEFSPPMVLPRTSPCDNFDCQNGAQC 1132
Db 1077 fddcqdnkckngahctdavngytcicpegysglfcefsppmvlprtspcdnfdcqngaqc 1136

Qy 1133 IVRINEPICQCLPGYQGEKCEKLVSVNFINKESYLQIPSAKVRPQTNITLQIATDEDSGI 1192
Db 1137 ivrinepicqclpgyqgekceklvsvnfinkesylqipsakvrpqtnitlqiatdedsgi 1196

Qy 1193 LLYKGDKDHIAVELYRGRVRASYDTGSHPASAIYSVETINDGNFHIVELLALDQSLSLSV 1252
Db 1197 llykgdkdhiavelyrgrvrasydtgshpasaiysvetindgnfhivellaldqslslsv 1256

Qy 1253 DGGNPKIITNLSKQSTLNFDSPLYVGGMPGKSNVASLRQAPGQNGTSFHGCIRNLYINSE 1312
Db 1257 dggnpkiitnlskqstlnfdsplyvggmpgksnvaslrqapgqngtsfhgcirnlyinse 1316

Qy 1313 LQDFQKVPMQTGILPGCEPCHKKVCAHGTCQPSSQAGFTCECQEGWMGPLCDQRTNDPCL 1372
Db 1317 lqdfqkvpmqtgilpgcepchkkvcahgtcqpssqagftcecqegwmgplcdqrtndpcl 1376

Qy 1373 GNKCVHGTCLPINAFSYSCKCLEGHGGVLCDEEEDLFNPCQAIKCKHGKCRLSGLGQPYC 1432
Db 1377 gnkcvhgtclpinafsysckcleghggvlcdeeedlfnpcqaikckhgkcrlsglgqpyc 1436

Qy 1433 ECSSGYTGDSCDREISCRGERIRDYYQKQQGYAACQTTKKVSRLECRGGCAGGQCCGPLR 1492
Db 1437 ecssgytgdscdreiscrgerirdyyqkqqgyaacqttkkvsrlecrggcaggqccgplr 1496

Qy 1493 SKRRKYSFECTDGSSFVDEVEKVVKCGCTRCVS 1525
Db 1497 skrrkysfectdgssfvdevekvvkcgctrcvs 1529

RESULT 4
Y27142 

ID Y27142 standard; protein; 1503 AA. 

AC Y27142; 
XX 

DT 15-SEP-1999 (first entry)
XX
DE Human slit-2 mature protein (Seq ID No: 6 of JP11164690).
XX
KW Vertebrate-derived protein; slit protein; diagnosis; cancer; nerve;
KW muscle; endocrine system.
XX
OS Homo sapiens.
XX
PN JP11164690-A.
XX
PD 22-JUN-1999.
XX
PF 05-DEC-1997; 97JP-0335435.
XX
PR 05-DEC-1997; 97JP-0335435.
XX
PA (ASAH ) ASAHI KASEI KOGYO KK.
XX
DR WPI; 1999-411830/35.
DR N-PSDB; X89162.
XX
PT New vertebrate slit protein - useful for diagnosis and treatment of
PT cancers in nerves, muscle and endocrine system
XX
PS Claim 5; Page 43-47; 102pp; Japanese.
XX
CC The invention relates to a vertebrate-derived protein containing an amino
CC acid sequence shown in Y27137 and Y27139. The vertebrate-derived protein
CC has at least 55 % homology to one of sequences shown in Y27141-Y27143,
CC and has slit protein-like activity. The vertebrate slit proteins encoding
CC nucleic acid sequences have at least 60% homology to nucleic acid
CC sequences X89161-163. The vertebrate-derived proteins can be produced
CC recombinantly by transforming host cells with expression vectors
CC comprising the encoding nucleic acids. The proteins of the invention are
CC for diagnosing and treating cancer of the nerves, muscle and/or endocrine
CC system.
XX
SQ Sequence 1503 AA;

Query Match 97.8%; Score 8137; DB 20; Length 1503;
Best Local Similarity 99.1%; Pred. No. 0;
Matches 1494; Conservative 1; Mismatches 0; Indels 12; Gaps 2;



Qy 27 ACPAQCSCSGSTVDCHGLALRSVPRNIPRNTERLDLNGNNITRITKTDFAGLRHLRVLQL 86
Db 1 acpaqcscsgstvdchglalrsvprniprnterldlngnnitritktdfaglrhlrvlql 60

Qy 87 MENKISTIERGAFQDLKELERLRLNRNHLQLFPELLFLGTAKLYRLDLSENQIQAIPRKA 146
Db 61 menkistiergafqdlkelerlrlnrnhlqlfpellflgtaklyrldlsenqiqaiprka 120

Qy 147 FRGAVDIKNLQLDYNQISCIEDGAFRALRDLEVLTLNNNNITRLSVASFNHMPKLRTFRL 206
Db 121 frgavdiknlqldynqisciedgafralrdlevltlnnnnitrlsvasfnhmpklrtfrl 180

Qy 207 HSNNLYCDCHLAWLSDWLRKRPRVGLYTQCMGPSHLRGHNVAEVQKREFVCSDEEEGHQS 266
Db 181 hsnnlycdchlawlsdwlrqrprvglytqcmgpshlrghnvaevqkrefvcs----ghqs 236

Qy 267 FMAPSCSVLHCPAACTCSNNIVDCRGKGLTEIPTNLPETITEIRLEQNTIKVIPPGAFSP 326
Db 237 fmapscsvlhcpaactcsnnivdcrgkglteiptnlpetiteirleqntikvippgafsp 296

Qy 327 YKKLRRIDLSNNQISELAPDAFQGLRSLNSLVLYGNKITELPKSLFEGLFSLQLLLLNAN 386
Db 297 ykklrridlsnnqiselapdafqglrslnslvlygnkitelpkslfeglfslqllllnan 356

Qy 387 KINCLRVDAFQDLHNLNLLSLYDNKLQTIAKGTFSPLRAIQTMHLAQNPFICDCHLKWLA 446
Db 357 kinclrvdafqdlhnlnllslydnklqtiakgtfsplraiqtmhlaqnpficdchlkwla 416

Qy 447 DYLHTNPIETSGARCTSPRRLANKRIGQIKSKKFRCS--------GTEDYRSKLSGDCFA 498
Db 417 dylhtnpietsgarctsprrlankrigqikskkfrcsakeqyfipgtedyrsklsgdcfa 476

Qy 499 DLACPEKCRCEGTTVDCSNQKLNKIPEHIPQYTAELRLNNNEFTVLEATGIFKKLPQLRK 558
Db 477 dlacpekcrcegttvdcsnqklnkipehipqytaelrlnnneftvleatgifkklpqlrk 536

Qy 559 INFSNNKITDIEEGAFEGASGVNEILLTSNRLENVQHKMFKGLESLKTLMLRSNRITCVG 618
Db 537 infsnnkitdieegafegasgvneilltsnrlenvqhkmfkgleslktlmlrsnritcvg 596

Qy 619 NDSFIGLSSVRLLSLYDNQITTVAPGAFDTLHSLSTLNLLANPFNCNCYLAWLGEWLRKK 678
Db 597 ndsfiglssvrllslydnqittvapgafdtlhslstlnllanpfncncylawlgewlrkk 656

Qy 679 RIVTGNPRCQKPYFLKEIPIQDVAIQDFTCDDGNDDNSCSPLSRCPTECTCLDTVVRCSN 738
Db 657 rivtgnprcqkpyflkeipiqdvaiqdftcddgnddnscsplsrcptectcldtvvrcsn 716

Qy 739 KGLKVLPKGIPRDVTELYLDGNQFTLVPKELSNYKHLTLIDLSNNRISTLSNQSFSNMTQ 798
Db 717 kglkvlpkgiprdvtelyldgnqftlvpkelsnykhltlidlsnnristlsnqsfsnmtq 776

Qy 799 LLTLILSYNRLRCIPPRTFDGLKSLRLLSLHGNDISVVPEGAFNDLSALSHLAIGANPLY 858
Db 777 lltlilsynrlrcipprtfdglkslrllslhgndisvvpegafndlsalshlaiganply 836

Qy 859 CDCNMQWLSDWVKSEYKEPGIARCAGPGEMADKLLLTTPSKKFTCQGPVDVNILAKCNPC 918
Db 837 cdcnmqwlsdwvkseykepgiarcagpgemadklllttpskkftcqgpvdvnilakcnpc 896

Qy 919 LSNPCKNDGTCNSDPVDFYRCTCPYGFKGQDCDVPIHACISNPCKHGGTCHLKEGEEDGF 978
Db 897 lsnpckndgtcnsdpvdfyrctcpygfkgqdcdvpihacisnpckhggtchlkegeedgf 956

Qy 979 WCICADGFEGENCEVNVDDCEDNDCENNSTCVDGINNYTCLCPPEYTGELCEEKLDFCAQ 1038
Db 957 wcicadgfegencevnvddcedndcennstcvdginnytclcppeytgelceekldfcaq 1016

Qy 1039 DLNPCQHDSKCILTPKGFKCDCTPGYVGEHCDIDFDDCQDNKCKNGAHCTDAVNGYTCIC 1098
Db 1017 dlnpcqhdskciltpkgfkcdctpgyvgehcdidfddcqdnkckngahctdavngytcic 1076

Qy 1099 PEGYSGLFCEFSPPMVLPRTSPCDNFDCQNGAQCIVRINEPICQCLPGYQGEKCEKLVSV 1158
Db 1077 pegysglfcefsppmvlprtspcdnfdcqngaqcivrinepicqclpgyqgekceklvsv 1136

Qy 1159 NFINKESYLQIPSAKVRPQTNITLQIATDEDSGILLYKGDKDHIAVELYRGRVRASYDTG 1218
Db 1137 nfinkesylqipsakvrpqtnitlqiatdedsgillykgdkdhiavelyrgrvrasydtg 1196

Qy 1219 SHPASAIYSVETINDGNFHIVELLALDQSLSLSVDGGNPKIITNLSKQSTLNFDSPLYVG 1278
Db 1197 shpasaiysvetindgnfhivellaldqslslsvdggnpkiitnlskqstlnfdsplyvg 1256

Qy 1279 GMPGKSNVASLRQAPGQNGTSFHGCIRNLYINSELQDFQKVPMQTGILPGCEPCHKKVCA 1338
Db 1257 gmpgksnvaslrqapgqngtsfhgcirnlyinselqdfqkvpmqtgilpgcepchkkvca 1316

Qy 1339 HGTCQPSSQAGFTCECQEGWMGPLCDQRTNDPCLGNKCVHGTCLPINAFSYSCKCLEGHG 1398
Db 1317 hgtcqpssqagftcecqegwmgplcdqrtndpclgnkcvhgtclpinafsysckcleghg 1376

Qy 1399 GVLCDEEEDLFNPCQAIKCKHGKCRLSGLGQPYCECSSGYTGDSCDREISCRGERIRDYY 1458
Db 1377 gvlcdeeedlfnpcqaikckhgkcrlsglgqpycecssgytgdscdreiscrgerirdyy 1436

Qy 1459 QKQQGYAACQTTKKVSRLECRGGCAGGQCCGPLRSKRRKYSFECTDGSSFVDEVEKVVKC 1518
Db 1437 qkqqgyaacqttkkvsrlecrggcaggqccgplrskrrkysfectdgssfvdevekvvkc 1496

Qy 1519 GCTRCVS 1525
Db 1497 gctrcvs 1503



RESULT 5 
W96701 

ID W96701 standard; protein; 1503 AA. 

XX 

AC W96701; 
XX 

DT 15-APR-1999 (first entry)
XX
DE Slit-like protein amino acid sequence.
XX
KW Slit-like polypeptide; diagnosis; treatment; nervous disease;
KW thyroid disease; adrenal disease; muscular disease.
XX
OS Homo sapiens.
XX
PN JP11018777-A.
XX
PD 26-JAN-1999.
XX
PF 09-JUL-1997; 97JP-0183683.
XX
PR 09-JUL-1997; 97JP-0183683.
XX
PA (ASAH ) ASAHI KASEI KOGYO KK.
XX
DR WPI; 1999-161084/14.
XX
PT New slit-like polypeptide - useful for diagnosis and treatment of
PT nervous, thyroid, adrenal and muscular diseases
XX
PS Claim 1; Page 17-21; 50pp; Japanese.
XX
CC The present sequence represents the mature protein sequence of
CC a slit-like polypeptide. The polypeptide is useful for the
CC diagnosis and the treatment of all nervous diseases, thyroid
CC diseases, adrenal diseases and muscular diseases.
XX
SQ Sequence 1503 AA;

Query Match 97.8%; Score 8137; DB 20; Length 1503;

Best Local Similarity 99.1%; Pred. No. 0;

Matches 1494; Conservative 1; Mismatches 0; Indels 12; Gaps 2;

Qy 27 ACPAQCSCSGSTVDCHGLALRSVPRNIPRNTERLDLNGNNITRITKTDFAGLRHLRVLQL 86 

Db 1 acpaqcscsgstvdchglalrsvprniprnterldlngnnitritktdfaglrhlrvlql 60 

Qy 87 MENKISTIERGAFQDLKELERLRLNRNHLQLFPELLFLGTAKLYRLDLSENQIQAIPRKA 146 

Db 61 menkistiergafqdlkelerlrlnrnhlqlfpellflgtaklyrldlsenqiqaiprka 120 

Qy 147 FRGAVDIKNLQLDYNQISCIEDGAFRALRDLEVLTLNNNNITRLSVASFNHMPKLRTFRL 206

Db 121 frgavdiknlqldynqisciedgafralrdlevltlnnnnitrlsvasfnhmpklrtfrl 180 

Qy 207 HSNNLYCDCHLAWLSDWLRKRPRVGLYTQCMGPSHLRGHNVAEVQKREFVCSDEEEGHQS 266 

Db 181 hsnnlycdchlawlsdwlrqrprvglytqcmgpshlrghnvaevqkrefvcs----ghqs 236

Qy 267 FMAPSCSVLHCPAACTCSNNIVDCRGKGLTEIPTNLPETITEIRLEQNTIKVIPPGAFSP 326 



Db 237 fmapscsvlhcpaactcsnnivdcrgkglteiptnlpetiteirleqntikvippgafsp 296 

Qy 327 YKKLRRIDLSNNQISELAPDAFQGLRSLNSLVLYGNKITELPKSLFEGLFSLQLLLLNAN 386

Db 297 ykklrridlsnnqiselapdafqglrslnslvlygnkitelpkslfeglfslqllllnan 356 

Qy 387 KINCLRVDAFQDLHNLNLLSLYDNKLQTIAKGTFSPLRAIQTMHLAQNPFICDCHLKWLA 446 

Db 357 kinclrvdafqdlhnlnllslydnklqtiakgtfsplraiqtmhlaqnpficdchlkwla 416 

Qy 447 DYLHTNPIETSGARCTSPRRLANKRIGQIKSKKFRCS--------GTEDYRSKLSGDCFA 498

Db 417 dylhtnpietsgarctsprrlankrigqikskkfrcsakeqyfipgtedyrsklsgdcfa 476 

Qy 499 DLACPEKCRCEGTTVDCSNQKLNKIPEHIPQYTAELRLNNNEFTVLEATGIFKKLPQLRK 558 

Db 477 dlacpekcrcegttvdcsnqklnkipehipqytaelrlnnneftvleatgifkklpqlrk 536 

Qy 559 INFSNNKITDIEEGAFEGASGVNEILLTSNRLENVQHKMFKGLESLKTLMLRSNRITCVG 618 

Db 537 infsnnkitdieegafegasgvneilltsnrlenvqhkmfkgleslktlmlrsnritcvg 596 

Qy 619 NDSFIGLSSVRLLSLYDNQITTVAPGAFDTLHSLSTLNLLANPFNCNCYLAWLGEWLRKK 678 

Db 597 ndsfiglssvrllslydnqittvapgafdtlhslstlnllanpfncncylawlgewlrkk 656 

Qy 679 RIVTGNPRCQKPYFLKEIPIQDVAIQDFTCDDGNDDNSCSPLSRCPTECTCLDTVVRCSN 738

Db 657 rivtgnprcqkpyflkeipiqdvaiqdftcddgnddnscsplsrcptectcldtvvrcsn 716

Qy 739 KGLKVLPKGIPRDVTELYLDGNQFTLVPKELSNYKHLTLIDLSNNRISTLSNQSFSNMTQ 798

Db 717 kglkvlpkgiprdvtelyldgnqftlvpkelsnykhltlidlsnnristlsnqsfsnmtq 776

Qy 799 LLTLILSYNRLRCIPPRTFDGLKSLRLLSLHGNDISVVPEGAFNDLSALSHLAIGANPLY 858

Db 777 lltlilsynrlrcipprtfdglkslrllslhgndisvvpegafndlsalshlaiganply 836 

Qy 859 CDCNMQWLSDWVKSEYKEPGIARCAGPGEMADKLLLTTPSKKFTCQGPVDVNILAKCNPC 918 

Db 837 cdcnmqwlsdwvkseykepgiarcagpgemadklllttpskkftcqgpvdvnilakcnpc 896 

Qy 919 LSNPCKNDGTCNSDPVDFYRCTCPYGFKGQDCDVPIHACISNPCKHGGTCHLKEGEEDGF 978 

Db 897 lsnpckndgtcnsdpvdfyrctcpygfkgqdcdvpihacisnpckhggtchlkegeedgf 956 

Qy 979 WCICADGFEGENCEVNVDDCEDNDCENNSTCVDGINNYTCLCPPEYTGELCEEKLDFCAQ 1038 

Db 957 wcicadgfegencevnvddcedndcennstcvdginnytclcppeytgelceekldfcaq 1016 

Qy 1039 DLNPCQHDSKCILTPKGFKCDCTPGYVGEHCDIDFDDCQDNKCKNGAHCTDAVNGYTCIC 1098

Db 1017 dlnpcqhdskciltpkgfkcdctpgyvgehcdidfddcqdnkckngahctdavngytcic 1076 

Qy 1099 PEGYSGLFCEFSPPMVLPRTSPCDNFDCQNGAQCIVRINEPICQCLPGYQGEKCEKLVSV 1158 

Db 1077 pegysglfcefsppmvlprtspcdnfdcqngaqcivrinepicqclpgyqgekceklvsv 1136 

Qy 1159 NFINKESYLQIPSAKVRPQTNITLQIATDEDSGILLYKGDKDHIAVELYRGRVRASYDTG 1218 

Db 1137 nfinkesylqipsakvrpqtnitlqiatdedsgillykgdkdhiavelyrgrvrasydtg 1196 

Qy 1219 SHPASAIYSVETINDGNFHIVELLALDQSLSLSVDGGNPKIITNLSKQSTLNFDSPLYVG 1278 

Db 1197 shpasaiysvetindgnfhivellaldqslslsvdggnpkiitnlskqstlnfdsplyvg 1256 

Qy 1279 GMPGKSNVASLRQAPGQNGTSFHGCIRNLYINSELQDFQKVPMQTGILPGCEPCHKKVCA 1338 

Db 1257 gmpgksnvaslrqapgqngtsfhgcirnlyinselqdfqkvpmqtgilpgcepchkkvca 1316

Qy 1339 HGTCQPSSQAGFTCECQEGWMGPLCDQRTNDPCLGNKCVHGTCLPINAFSYSCKCLEGHG 1398

Db 1317 hgtcqpssqagftcecqegwmgplcdqrtndpclgnkcvhgtclpinafsysckcleghg 1376



Qy 1399 GVLCDEEEDLFNPCQAIKCKHGKCRLSGLGQPYCECSSGYTGDSCDREISCRGERIRDYY 1458

Db 1377 gvlcdeeedlfnpcqaikckhgkcrlsglgqpycecssgytgdscdreiscrgerirdyy 1436

Qy 1459 QKQQGYAACQTTKKVSRLECRGGCAGGQCCGPLRSKRRKYSFECTDGSSFVDEVEKVVKC 1518

Db 1437 qkqqgyaacqttkkvsrlecrggcaggqccgplrskrrkysfectdgssfvdevekvvkc 1496

Qy 1519 GCTRCVS 1525

Db 1497 gctrcvs 1503




RESULT 6 
Y76117 

ID Y76117 standard; Protein; 1529 AA.
XX
AC Y76117;
XX
DT 27-MAR-2000 (first entry)
XX
DE Rat Slit homologue, SEQ ID NO:396.
XX
KW Skin; dermal papilla; keratinocyte; neonatal foreskin fibroblast;
KW embryonic skin cell; keratinocyte stem cell; transit amplifying cell;
KW secreted; transmembrane; inflammation; cancer; neurological disease;
KW angiogenesis; tumour vascularisation; growth disorder;
KW developmental disorder; skin wound; hair follicle disorder;
KW anti-inflammatory; cytostatic; neuroprotective; vulnerary.
XX
OS Rattus sp.
XX
PN WO9955865-A1.
XX
PD 04-NOV-1999.
XX
PF 29-APR-1999; 99WO-NZ000.
XX
PR 29-APR-1998; 98US-00697; i.
PR 09-NOV-1998; 98US-01889 I.
XX
PA (GENE-) GENESIS RES & DEV
XX



Strachan L, Sleeman M, ffitson JD, Onrust R, Rumble A, Murison JG; 



WPI; 2000-072177/06. 
N-PSDB; Z61825. 

Novel polynucleotides useful for the treatment of various conditions
including wounds and cancer -

Claim 4; Page 223-226; 235pp; English. 

The invention relates to novel nucleic acid sequences derived from rat 
dermal papilla, human keratinocytes and neonatal foreskin fibroblasts, 
and mouse embryonic skin, keratinocyte stem cells and transit amplifying 
cells. Polypeptides of the invention may be used to treat inflammation, 
cancer and neurological diseases. The proteins may be used to stimulate 
the growth and motility of keratinocytes, to inhibit the growth of 
cancer cells, to modulate angiogenesis and tumour vascularisation, to 
modulate skin inflammation, to modulate epithelial cell growth and to 
inhibit binding of HIV-1 to leukocytes. The invention may also be used to
treat growth and developmental defects, skin wounds and hair follicle
disorders. Sequences Y75942-Y76123 represent polypeptides encoded
by cDNA sequences derived from several mouse, rat or human skin cell
types. Sequences Y75942-Y75947, Y76020-Y76021, Y76094-Y76104 and
Y76119 are proteins with an N-terminal signal sequence, indicating that
they are secreted. Sequences Y75986-Y75989, Y7606H76071, Y76106-Y76109
and Y76121-Y76122 are proteins with one or more putative transmembrane 
domains. 

Sequence 1529 AA;

Query Match 97.0%; Score 8065; DB 21; Length 1529;

Best Local Similarity 96.0%; Pred. No. 0;

Matches [illegible]; Conservative 29; Mismatches 21; Indels 12; Gaps [illegible];

[Alignment residues not legible in the source. Row start positions, Qy/Db:
1/1, 61/61, 121/121, 181/181, 241/241, 301/297, 361/357, 421/417, 481/477,
533/537, 593/597, 653/657, 713/717, 773/777, 833/837, 893/897, 953/957.
Legible Db fragment: ghqsfmapscsvlhcpiactcsnnivdcrgkglteipt]

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Baker R, Goddard A, Gumey AL, Smith V, Watanabe CK, Wood WI; 

WPI; 2000-237871/20. 
N-PSDB; A37077. 

New mammalian DNA sequences encoding transmembrane, receptor or 
secreted PRO polypeptides, useful for screening of potential peptide or 
small molecule inhibitors of the relevant receptor/ligand interactions 

Claim 12; Fig 112; 773pp; English. 

A37022 to A37144 encode the new isolated human transmembrane, receptor 
or secreted PRO polypeptides given in Y99340 to Y99462. The 
transmembrane and receptor PRO proteins can be used for screening of 
potential peptide or small molecule inhibitors of the relevant 
receptor/ligand interactions. The polypeptides and nucleotide sequences 
encoding them have various industrial applications, including uses as 
pharmaceutical and diagnostic agents. A37145 to A37330 represent 
PCR primers and hybridisation probes used in the isolation of the PRO 
polypeptides from the present invention. 
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Query Match 68.8%; Score 5717.5; DB 21; Length 1523; 

Best Local Similarity 66.6%; Pred. No. 1e-301; 

Matches 1017; Conservative 217; Mismatches 277; Indels 15; Gaps 9; 



Qy 3 GVGWQMLS-LSLGLVLA-ILNKVAPQACPAQCSCSGSTVDCHGLALRSVPRNIPRNTERL 60 

III - 1:1 I II :h III :hll ::llllll l|:||| Mil III 
Db 7 gvgaavrarlalalalasvlsgppavacptkctcsaasvdchglglravprgiprnaerl 66 

Qy 61 DLNGNNITRITKTDFAGLRHLRVLQLMENKISTIERGAFQDLKELERLRLNRNHLQLFPE 120 

II: llllllll lllll::IHI I :|::| 1 1 1 1 II 1 1 1 1 : 1 1 1 1 1 1 1 : 1 ||: 
Db 67 dldrnnitritkmdfaglknlrvlhlednqvsviergafqdlkqlerlrlnknklqvlpe 126 



Qy 121 LLFLGTARLYRLDLSENQIQAIPRKAFRGAVDIKNLQLDYNQISCIEDGAFRALRDLEVL 180 

. Ill I M JIIMIIIII MINIM 1:111111 I f 1 1 1 1 M 1 1 1 1 1 1 1 1 1 : ! 
Db 127 llfqstpkltrldlsenqiqgiprkafrgitdvknlqldnnhisciedgafralrdleil 186 

Qy 181 TLNNNNITRLSVASFNHMPKLRTFRLHSNNLYCDCHLAWLSDWLRKRPRVGLYTQCMGPS 240 

I I I I I I I : I : .1 HIIIIIHI IIIM:|IMIMIIIMIII:I II II I 
Db 187 tlnnnnisrilvtsfnhmpkirtlrlhsnhlycdchlawlsdwlrqrrtvgqftlcmapv 246 

Qy 241 HLRGHNVAEVQKREFVCSDEEEGHQSFMAPSCSV--LHCPAACTCSNNIVDCRGKGLTEI 298 

Mil f I ! : I.I 1 : 1 : 1 1 III: : II: IMIMIMIIMM II 

Db 247 hlrgfnvadvqkkeyvcpaphs eppscnansiscpspctcsnnivdcrgkglmei 301 

Qy 299 PTNLPETITEIRLEQNTIKVIPPGAFSPYKKLRRIDLSNNQISELAPDAFQGLRSLNSLV 358 

I MM I llllllhll II III: i M 1 : 1 1 1 : t | M I : : I II I II 1 1 : 1 1 III 

Db 302 panlpegiveirleqnsikaipagaftqykklkridisknqisdiapdafqglksltslv 361 

Qy 359 LYGNKITELPRSLFEGLFSLQLLLLNANKINCLRVDAFQDLHNLNLLSLYDNKLQTIAKG 418 

llllllll: I 11:1 IIMIMIIMMIMI: J 1 1 1 II 1 1 1 1 M 1 1 II I II : 1 1 
Db 362 lygnkiteiakglfdglvslqllllnankinclrvntfqdlqnlnllslydnklqtiskg 421 

Qy 419 TFSPLRAIQTMHLAQNPFICDCHLKWLADYLHTNPIETSGARCTSPRRLANKRIGQIKSK 478 

l:||::|||::lllllll:|MIIIMIIII 1 1 1 II II I II : I II II M II I Mill 
Db 422 lfaplqsiqtlhlaqnpfvcdchlkwladylqdnpietsgarcssprrlankrisqiksk 481 

Qy 479 RFRCSGTEDYRSKLSGDCFADLACPERCRCEGTTVDCSNQKLNKIPEHIPQYTAELRLNN 538 

lllllhlllll: I Ml M IMUm llllllll :| |:|:| HHh 
Db 482 kfrcsgsedyrsrfssecfmdlvcpekcrcegtivdcsnqklvripshlpeyvtdlrlnd 541 

Qy 539 NEFTVLEATGIFKKLPQLRKINFSNNKITDIEEGAFEGASGVNEILLTSNRLENVQHKMF 598 

II MIIIHMHM Mill Mill :: Mlhll: I ::|l MM I ::l 

Db 542 nevsvleatgifkklpnlrkinlsnnkikevregafdgaasvqelmltgnqletvhgrvf 601 

Qy 599 RGLESLKTLMLRSNRITCVGNDSFIGLSSVRLLSLYDNQITTVAPGAFDTLHSLSTLNLL 658 

HI IIMIIII hi MM 1 1 1 1 1 1 1 1 1 1 1 ! 1 : 1 1 1 : MM II Mlhlll 
Db 602 rglsglktlmlrsnliscvsndtfaglssvrllslydnrittitpgafttlvslstinll 661 

Qy 659 ANPFNCNCYLAWLGEWLRKKRIVTGNPRCQKPYFLKEIPIQDVAIQDFTCDDGNDDNSCS 718 

: 1 1 1 1 1 1 1 : M I |i : | M ! : I , | : ! 1 1 ! i i 1 1 : : ! I M 1 1 1 1 M I M 1 1 1 Ml:::; 
Db 662 snpfncnchlawlgkwlrkrrivsgnprcqkpfflkeipiqdvaiqdftc-dgneesscq 720 

Qy 719 PLSRCPTECTCLDTWRCSNRGLRVLPKGIPRDVTELYLDGNQFTLVPRELSNYRHLTLI 778 

Ml : 1 1 ! : : 1 1 1 i I H f 1 1 : IhhhllllllhM I hi' MUM 
Db 721 lsprcpeqctcmetvvrcsnkglralprgmpkdvtelylegnhltavprelsalrhltli 780 

Qy 779 DLSNNRISTLSNQSFSNMTQLLTLILSYNRLRCIPPRTFDGLKSLRLLSLHGNDISWPE 838 

Hill II I'M Mllh I MMMIMIMI i : 1 1 : 1 1 1 : 1 : 1 1 1 1 i 1 1 Ml 
Db 781 dlsnnsismltnytfsnmshlstlilsynrlrcipvhafnglrslrvltlhgndissvpe 840 

Qy 839 GAFNDLSALSHLAIGANPLYCDCNMQWLSDWVKSEYREPGIARCAGPGEMADKLLLTTPS 898 

hlllhMIMhl 1 1 1 : 1 U : : : 1 1 1 : 1 1 1 : Mllllllh I MhlllMh 
Db 841 gsfndltslshlalgtnplhcdcslrwlsewvkagykepgiarcsspepmadrlllttpt 900 

Qy 899 RKFTCQGPVDVNILARCNPCLSNPCKNDGTCNSDPVDFYRCTCPYGFKGQDCDVPIHACI 958 

M hllllHhllll MIMIIhlll llh Ml Ml : 1 1 : 1 1 Mh II 
Db 901 hrfqckgpvdlnivakcnaclsspcknngtctqdpvelyrcacpysykgkdctvpintci 960 

Qy 959 SNPCRHGGTCHLKEGEEDGFWCICADGFEGENCEVNVDDCEDNDCENNSTCVDGINNYTC 1018 

Mhlllllll : Mil I I Mlh MM 1 1 1 1 1 1 1 1 1 1 h I M 1 1 1 1 1 I 
Db 961 qnpcqhggtchlsdshkdgfscscplgfegqrceinpddcedndcennatcvdginnyvc 1020 

Qy 1019 LCPPEYTGELCEERLDFCAQDLNPCQHDSRCILTPRGFRCDCTPGYVGEHCDIDFDDCQD 1078 

hi HUM: M I Ml 1 1 1 :: 1 1 1 III hi III h h I Ml 
Db 1021 icppnytgelcdevidhcvpelnlcqheakcipldkgfscecvpgysgklcetdnddcva 1080 

Qy 1079 NKCKNGAHCTDAVNGYTCICPEGYSGLFCEFSPPMVLPRTSPCDNFDCQNGAQCIVRINE 1138 
v '• ; "":lhH! I I MUM 1 1 : 1 : 1 1 III Mill MIMI ::MIIIIIII I 
Db 1081 hkcrhgaqcvdtingytctcpqgfsgpfcehpppmvllqtspcdqyecqngaqciwqqe 1140 

Qy 1139 PICQCLPGYQGERCERLVSVNFINKESYLQIPSAKVRPQTNITLQIATDEDSGILLYRGD 1198 

I IM lh I : 1 1 1 1 : : 1 1 1 : IMh:: lllllll 1 1 : 1 1 : 1 1 1 : 1 : 1 1 1 1 1 1 1 1 
Db 1141 ptcrcppgfag'prceklitvnfvgkdsyvelasakvrpqanislqvatdkdngillykgd 1200 
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Qy 1199 KDHIAVELYRGRVRASYDTGSHPASAIYSVETINDGNFHIVELLALDQSLSLSVDGGNPK 1258 

I !i II: I I : illllhlll II III: |:|:|:| II I II 

Db 1201 ndplalelyqghvrlvydslssppttvysvetvndgqfhsvelvtlnqtlnlwdlcgtpk 1260 

Qy 1259 IITNLSKQSTLNFDSPLYVGGMPGKSNVASLRQAPGQNGTSFHGCIRNLYINSELQDFQK 1318 

: I II : :|lll:lhl : :::|ll : Mill : MINIM 

Db 1261 slgklqkqpavginsplylggiptstglsalrqgtdrplggfhgcihevrinnelqdfka 1320 

Qy 1319 VPMQT-GILPGCEPCHKKVCAHGTCQPSSQAGFTCECQEGWMGPLCDQRTNDPCLGNKCV 1377 

:| h I: III: I II II |: : III: II llllll lllll::| 

Db 1321 lppqslgvspgcksc--tvckhglcrsvekdswcecrpgvtgplcdqeardpclghrch 1378 

Qy 1378 HGTCLPINAFSYSCKCLEGHGGVLCDEEEDLFNPCQAIKCKHGKCRLSGLGQPYCECSSG 1437 

II h II III 11:11 III : I I I I II II:! :| hill I I 

Db 1379 hgkcvatgt-symckcaegyggdlcdnkndsanacsafkchhgqchisdqgepyclcqpg 1437 

Qy 1438 YTGDSCDREISCRGERIRDYYQKQQGYAACQTTKKVSRLECRGGCAGGQCCGPLRSKRRK 1497 

::|: I :l I H :h : : I : I M : I I II :|lllll I III I llllll 

Db 1438 fsgehcqqenpclgqwrevirrqkgyascataskvpimecrggc-gpqccqptrskrrk 1496 



Qy 1498 YSFECTDGSSFVDEVEKWKCGCTRC 1523 

I hllllllll: :|ll I 
Db 1497 yyfqctdgssfvee erhlecgclac 1522 



RESULT 8 
Y27146 


ID Y27146 standard; protein; 1523 AA. 
Y27146; 

15-SEP-1999 (first entry) 

Human slit-3 protein (Seq ID No: 13 of JP11164690). 

Vertebrate-derived protein; slit protein; diagnosis; cancer; nerve; 
muscle; endocrine system. 



XX 



JP11164690-A. 
22-JUN-1999. 

05-DEC-1997; 97JP-0335435. 

05-DEC-1997; 97JP-0335435. 

(ASAH ) ASAHI KASEI KOGYO KK. 

WPI; 1999-411830/35. 
N-PSDB; X89163. 

New vertebrate slit protein - useful for diagnosis and treatment of 
cancers in nerves, muscle and endocrine system 

Disclosure; Page 64-70; 102pp; Japanese. 

The invention relates to a vertebrate-derived protein containing an amino 
acid sequence shown in Y27137 and Y27139. The vertebrate-derived protein 
has at least 55 % homology to one of sequences shown in Y27141-Y27143, 
and has slit protein-like activity. The vertebrate slit proteins encoding 
nucleic acid sequences have at least 60% homology to nucleic acid 
sequences X89161-163. The vertebrate-derived proteins can be produced 
recombinantly by transforming host cells with expression vectors 
comprising the encoding nucleic acids. The proteins of the invention are 
for diagnosing and treating cancer of the nerves, muscle and/or endocrine 
system. 
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Best Local Similarity 66.6%; Pred. No. 1.5e-301; 

Matches 1017; Conservative 216; Mismatches 278; Indels 15; Gaps 

Qy 3 GVGWQMLS-LSLGLVLA-ILNKVAPQACPAQCSCSGSTVDCHGLALRSVPRNIPRNTERL 60 

III : : III I II :h III :hll ::|lllll IIHII llll III 
Db 7 gvgaavrarlalalalasvlsgppavacptkctcsaasvdchglglravprgiprnaerl 66 

Qy 61 DLNGNNITRITKTDFAGLRHLRVLQLMENKISTIERGAFQDLKELERLRLNRNHLQLFPE 120 

II: llllll Ill||::|||| I :|::| 1 1 1 1 1 1 1 1 1 1 : 1 1 1 1 1 1 1 : 1 II: II 
Db 67 dldrnnitritkmdfaglknlrvlhlednqvsviergafqdlkqlerlrlnknklqvlpe 126 

Qy 121 LLFLGTAKLYRLDLSENQIQAIPRKAFRGAVDIKNLQLDYNQISCIEDGAFRALRDLEVL 180 

iii i ii iiiiiiini mini! mini i iiiiiiiiiiiiiiiih 

Db 127 llfqstpkltrldlsenqiqgiprkafrgitdvknlqldnnhisciedgafralrdleil 186 

Qy 181 TLNNNNITRLSVASFNHMPKLRTFRLHSNNLYCDCHLAWLSDWLRKRPRVGLYTQCMGPS 240 

llllll:!: J |||||||:|| 1 1 1 1 1 : 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 : 1 II :| II I 
Db 187 tlnnnnisrilvtsfnhmpkirtlrlhsnhlycdchlawlsdwlrqrrtvgqftlcmapv 246 

Qy 241 HLRGHNVAEVQKREFVCSDEEEGHQSFMAPSCSV--LHCPAACTCSNNIVDCRGKGLTEI 298 

llll 1 1 1 : 1 ' J : 1 : 1 1 III: : II: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 

Db 247 hlrgfnvadvqkkeyvcpaphs eppscnansiscpspctcsnnivdcrgkglmei 301 

Qy 299 PTNLPETITEIRLEQNTIKVIPPGAFSPYKKLRRIDLSNNQISELAPDAFQGLRSLNSLV 358 

I llll I ll'llllhll II III: lllhllhl llll::IHIIIII:|l III 

Db 302 panlpegiveirleqnsikaipagaftqykklkridisknqisdiapdafqglksltslv 361 

Qy 359 LYGNKITELPKSLFEGLFSLQLLLLNANKINCLRVDAFQDLHNLNLLSLYDNRLQTIAKG 418 

lllllli: I ||:|| lllllllllllllllll: llll 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 : 1 1 
Db 362 lygnkiteiakglfdglvslqllllnankinclrvntfqdlqnlnllslydnklqtiskg 421 

Qy 419 TFSPLRAIQTMHLAQNPFICDCHLKWLADYLHTNPIETSGARCTSPRRLANKRIGQIKSK 478 

|:||::|||:|||||||:|||IHIIIIII 1 1 1 1 1 1 1 1 1 1 : 1 1 1 1 1 1 1 1 1 1 Mill 
Db 422 lfaplqsiqtlhlaqnpfvcdchlkwladylqdnpietsgarcssprrlankrisqiksk 481 

Qy 479 KFRCSGTEDYRSKLSGDCFADLACPEKCRCEGTTVDCSNQKLNKIPEHIPQYTAELRLNN 538 

IIMNIIMN I :ll II Hllllllll MINIM HI MM MUM 
Db 482 kfrcsgsedyrsrfssecfmdivcpekcrcegtivdcsnqklvripshlpeyvtdlrind 541 

Qy 539 NEFTVLEATGIFKKLPQLRKINFSNNKITDIEEGAFEGASGVNEILLTSNRLENVQHKMF 598 

II MIMIMMIM IHII Hill :: MMIM I MMI MM I ::| 

Db 542 nevsvleatgifkklpnlrkinlsnnkikevregafdgaasvqelmltgnqletvhgrvf 601 

Qy 599 KGLESLKTLMLRSNRITCVGNDSFIGLSSVRLLSLYDNQITTVAPGAFDTLHSLSTLNLL 658 

HI IIIIIHII I II IIH MMIIIIIIIMMM llll II IIIIHII 
Db 602 rglsglktlmlrsnligcvsndtfaglssvrllslydnrittitpgafttlvslstinll 661 

Qy 659 ANPFNCNCYLAWLGEWLRKKRIVTGNPRCQKPYFLKEIPIQDVAIQDFTCDDGNDDNSCS 718 

. : 1 1 1 M 1 1 : 1 1 1 1 1 : M 1 1 : 1 1 1 : 1 i 1 1 1 1 M : 1 1 1 1 1 1 1 1 M 1 1 ! 1 1 1 1 |||:::|| 
Db 662 snpfncnchlawlgkwlrkrrivsgnprcqkpfflkeipiqdvaiqdftc-dgneesscq 720 

Qy 719 PLSRCPTECTCLDTWRCSNRGLKVLPKGIPRDVTELYLDGNQFTLVPKELSNYKHLTLI 778 

HI : 1 1 1 : : 1 1 1 1 i M 1 1 1 : 1 1 : 1 : 1 : 1 1 I'll 1 1 H I I MM HUH 
Db 721 lsprcpeqctcmetwrcsnkglralprgmpkdvtelylegnhltavprelsalrhltli 780 

Qy 779 DLSNNRISTLSNQSFSNMTQLLTLILSYNRLRCIPPRTFDGLKSLRLLSLHGNDISWPE 838 
lllll II Ml Hill: | IIIIIIIIIIIH 1 : 1 1 : M ! : 1 : 1 1 1 1 1 1 M 

Db 781 dlsnnsismltnytfsnmshlstlilsynrlrcipvhafnglrslrvltlhgndissvpe 840 

Qy 839 GAFNDLSALSHLAIGANPLYCDCNMQWLSDWVKSEYKEPGIARCAGPGEMADKLLLTTPS 898 

1 : 1 1 1 1 : : 1 1 1 1 1 : 1 1 1 1 H 1 1 : : H 1 1 H 1 1 : IIIIIHII: I IIMMIMM 
Db 841 gsfndltslshlalgtnplhcdcslrwlsewvkagykepgiarcsspepmadrlllttpt 900 

Qy 899 KKFTCQGPVDVNILAKCNPCLSNPCKNDGTCNSDPVDFYRCTCPYGFKGQDCDVPIHACI 958 

H MMIMIMIMI IIMIIIMIII III: III III : 1 1 : 1 1 III: II 
Db 901 hrfqckgpvdinivakcnaclsspcknngtctqdpvelyrcacpysykgkdctvpintci 960 

Qy 959 SNPCKHGGTCHLKEGEEDGFWCICADGFEGENCEVNVDDCEDNDCENNSTCVDGINNYTC 1018 

IMIMIIM : MM I I llll: ||:| 1 1 1| 1 1 1 1 1 1 1 :| 1 1 1 1 1 1 1 1 I 
Db 961 qnpcqhggtchlsdshkdgfscscplgfegqrceinpddcedndcennatcvdginnyvc 1020 



Qy 1019 LCPPEYTGELCEEKLDFCAQDLNPCQHDSKCILTPKGFKCDCTPGYVGEHCDIDFDDCQD 1078 
MM 1 1 ! I ! : 1 H I HI 1 1 1 1 1 1 III IH III M |: I III 
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Db 1021 icppnytgelcdevidhcvpelnlcqheakcipldkgfscecvpgysgklcetdnddcva 1080 




Qy 1079 NKCKNGAHCTDAVNGYTCICPEGYSGLFCEFSPPMVLPRTSPCDNFDCQNGAQCIVRINE 1138 

:M::| I I :|illl ||:|:|| ||| III :||||| ::||||||||| 
Db 1081 hkcrhgaqcvdtingytctcpqgfsgpfcehpppmvllqtspcdqyecqngaqcivvqqe 1140 

Qy 1139 PICQCLPGYQGEKCEKLVSVNFINKESYLQIPSAKVRPQTNITLQIATDEDSGILLYKGD 1198 

I M II: | :||||::|||: |;||::: Mill I |: I |: 1 1 1 : 1 : 1 1 1 1 1 1 1 1 
Db 1141 ptcrcppgfagprceklitvnfvgkdsyvelasakvrpqanislqvatdkdngillykgd 1200 

Qy 1199 KDHIAVELYRGRVRASYDTGSHPASAIYSVETINDGNFHIVELLALDQSLSLSVDGGNPK 1258 

I :|:IH:| II lh I I : MIIIMI II III: |:|:|:| II I 
Db 1201 ndplalelyqghvrlvydslssppttvysvetvndgqfhsvelvtlnqtlnlvvdkgtpk 1260 

Qy 1259 IITNLSKQSTLNFDSPLYVGGMPGKSNVASLRQAPGQNGTSFHGCIRNLYINSELQDFQK 1318 

: Ml : :llll:||:| : :::|ll : Mill : MINI: 
Db 1261 slgklqkqpavginsplylggiptstglsalrqgtdrplggfhgcihevrinnelqdfka 1320 

Qy 1319 VPMQT-GILPGCEPCHKKVCAHGTCQPSSQAGFTCECQEGWMGPLCDQRTNDPCLGNKCV 1377 

I I: I: III: I II II I: : III: II MINI ||l||::| 
Db 1321 lppqslgvspgcksc--tvckhglcrsvekdsvvcecrpgwtgplcdqeardpclghrch 1378 

Qy 1378 HGTCLPINAFSYSCKCLEGHGGVLCDEEEDLFNPCQAIKCKHGKCRLSGLGQPYCECSSG 1437 

II I: II III 11:11 III : I II Ml Ml :| Ml I I 

Db 1379 hgkcvatgt-symckcaegyggdlcdnkndsanacsafkchhgqchisdqgepyclcqpg 1437 

Qy 1438 YTGDSCDREISCRGERIRDYYQKQQGYAACQTTKKVSRLECRGGCAGGQCCGPLRSKRRK 1497 

M: Ml II: M ::|:|||:| I I :|||||| I III | Mill 
Db 1438 fsgehcqqenpclgqvvrevirrqkgyascataskvpimecrggc-gpqccqptrskrrk 1496 

Qy 1498 YSFECTDGSSFVDEVEKWKCGCTRC 1523 

I I : I M I i 1 1 1 : 1 1 1 : ::||| I 
Db 1497 yvfqctdgssfveeverhlecgclac 1522 



RESULT 9 
Y04137 

ID Y04137 standard; Protein; 1523 AA. 
XX 
AC Y04137; 
XX 
DT 15-JUN-1999 (first entry) 
XX 
DE Human slit 3 protein. 
XX 
KW Human; slit-like protein; slit 3; slit 1; prevention; treatment; 
KW disease; spinal cord; thyroid gland; ovary; prostate; renal gland; 
KW small intestine; heart; trachea; thymus; lymph node; 
KW muscular system; colon. 
XX 
OS Homo sapiens. 
XX 
Key Location/Qualifiers 
Peptide 1..17 
/label= signal 
18..1523 
/label= slit.3 
XX 
PN JP11075846-A. 
XX 
PD 23-MAR-1999. 
XX 
PF 02-SEP-1997; 97JP-0236994. 
XX 
PR 02-SEP-1997; 97JP-0236994. 
XX 
PA (ASAH ) ASAHI KASEI KOGYO KK. 
XX 
WPI; 1999-257695/22. 
N-PSDB; X19946. 



New slit-like polypeptide - useful for prevention and treatment of 
diseases in spinal cord, thyroid gland, ovary, prostate, renal 
gland, small intestine, heart, trachea, thymus, lymph node, muscular 
system and colon 

Example 2; Page 27-31; 48pp; Japanese. 



The present sequence represents a human slit-like protein designated 
slit 3. Slit-like proteins can be used for the prevention and the 
treatment of diseases in spinal cord, thyroid gland, ovary, prostate, 
renal gland, small intestine, heart, trachea, thymus, lymph node, 
muscular system and colon. 

Sequence 1523 AA; 



Query Match 68.7%; Score 5714.5; DB 20; Length 1523; 

Best Local Similarity 66.6%; Pred. No. 1.5e-301; 

Matches 1017; Conservative 216; Mismatches 278; Indels 15; Gaps 

Qy 3 GVGWQMLS-LSLGLVLA-ILNKVAPQACPAQCSCSGSTVDCHGLALRSVPRNIPRNTERL 60 

III ■ ■ M I II :|: Ml :|:ll ::|||||| Ihlll MM III 
Db 7 gvgaavrarlalalalasvlsgppavacptkctcsaasvdchglglravprgiprnaerl 66 

Qy 61 DLNGNNITRITKTDFAGLRHLRVLQLMENKISTIERGAFQDLKELERLRLNRNHLQLFPE 120 

II: I M I M 1 1 |||||::|||| I :|::| 1 1 1 1 1 1 1 1 1 : 1 M I M I : I ||: II 
Db 67 dldrnnitritkmdfaglknlrvlhlednqvsviergafqdlkqlerlrlnknklqvlpe 126 

Qy 121 LLFLGTAKLYRLDLSENQIQAIPRKAFRGAVDIKNLQLDYNQISCIEDGAFRALRDLEVL 180 

III I IMIIIIIIIII llllllll 1:111111 I IIHIIMMMIII: 
Db 127 llfqstpkltrldlsenqiqgiprkafrgitdvknlqldnnhisciedgafralrdleil 186 

Qy 181 TLNNNNITRLSVASFNHMPKLRTFRLHSNNLYCDCHLAWLSDWLRKRPRVGLYTQCMGPS 240 

1 1:1: I IIIIIIMI llll|:||imillllllll:l II :l II I 

Db 187 tlnnnnisrilvtsfnhmpkirtlrlhsnhlycdchlawlsdwlrqrrtvgqftlcmapv 246 

Qy 241 HLRGHNVAEVQKREFVCSDEEEGHQSFMAPSCSV--LHCPAACTCSNNIVDCRGKGLTEI 298 

Mil llhllhhll III: : l|: 111111111111111 II 

Db 247 hlrgfnvadvqkkeyvcpaphs eppscnansiscpspctcsnnivdcrgkglmei 301 

Qy 299 PTNLPETITEIRLEQNTIKVIPPGAFSPYKKLRRIDLSNNQISELAPDAFQGLRSLNSLV 358 

I INI I MM!:!! II Ml: MIMIIM IIIMIIMMIMI III 

Db 302 panlpegiveirleqnsikaipagaftqykklkridisknqisdiapdafqglksltslv 361 

Qy 359 LYGNKITELPKSLFEGLFSLQLLLLNANKINCLRVDAFQDLHNLNLLSLYDNKLQTIAKG 418 

MIMIIlM 11:11 lllllllllllllllll: Ml IIMIMMMIMMI 
Db 362 lygnkiteiakglfdglvslqllllnankinclrvntfqdlqnlnllslydnklqtiskg 421 

Qy 419 TFSPLRAIQTMHLAQNPFICDCHLKWLADYLHTNPIETSGARCTSPRRLANKRIGQIKSK 478 

|:M:MI:|IMM:IIIIMIMMI 1 1 II I II III : 1 1 1 1 1 1 II 1 1 Mil 
Db 422 lfaplqsiqtlhlaqnpfvcdchlkwladylqdnpietsgarcssprrlankrisqiksk 481 

Qy 479 KFRCSGTEDYRSKLSGDCFADLACPEKCRCEGTTVDCSNQKLNKIPEHIPQYTAELRLNN 538 

Mllhllll!: I :|| || MMMIII HUM Ml |:|:| :||||: 
Db 482 kfrcsgsedyrsrfssecfmdlvcpekcrcegtivdcsnqklvripshlpeyvtdlrlnd 541 

Qy 539 NEFTVLEATGIFKKLPQLRKINFSNNKITDIEEGAFEGASGVNEILLTSNRLENVQHKMF 598 

II MllllMllll MM MM :: M 1 1 : 1 1 : I l:M Ml I ::l 

Db 542 nevsvleatgifkklpnlrkinlsnnkikevregafdgaasvqelmltgnqletvhgrvf 601 

Qy 599 KGLESLKTLMLRSNRITCVGNDSFIGLSSVRLLSLYDNQITTVAPGAFDTLHSLSTLNLL 658 

Ml Mltllll | M II: 1 1 1 1 r 1 1 1 1 1 1 r 1 : 1 1 1 : MM II 1 1 1 1 : 1 1 1 
Db 602 rglsglktlmlrsnligcvsndtfaglssvrllslydnrittitpgafttlvslstinll 661 

Qy 659 ANPFNCNCYLAWLGEWLRKKRIVTGNPRCQKPYFLKEIPIQDVAIQDFTCDDGNDDNSCS 718 

:IMIM:MII:|III:|I|:|IIIIIII:MMIIIIIMMI IM::|I 
Db 662 snpfncnchlawlgkwlrkrrivsgnprcqkpfflkeipiqdvaiqdftc-dgneesscq 720 

Qy 719 PLSRCPTECTCLDTWRCSNKGLKVLPKGIPRDVTELYLDGNQFTLVPKELSNYKHLTLI 778 

III M'MlllilHIII: 1 1 : f : f : M ! I j 1 1 : 1 1 I Ihlll :IMI 
Db 721 lsprcpeqctcmetvvrcsnkglralprgmpkdvtelylegnhltavprelsalrhltli 780 

Qy 779 DLSNNRISTLSNQSFSNMTQLLTLILSYNRLRCIPPRTFDGLKSLRLLSLHGNDISWPE 838 

MM II I'M MUM I MIMIMMI |:||:|||:|:MMMI III 
Db 781 dlsnnsismltnytfsnmshlstlilsynrlrcipvhafnglrslrvltlhgndissvpe 840 
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Qy 839 GAFNDLSALSHLAIGANPLYCDCNMQWLSDWVKSEYKEPGIARCAGPGEMADKLLLTTPS 898 

I:l!lh:illll:| 1 1 1 : 1 1 1 : :: 1 1 1 : 1 1 1 : lllllllll: I llhllllll: 
Db 841 gsfndltslshlalgtnplhcdcslrwlsewvkagykepgiarcsspepmadrlllttpt 900 

Qy 899 KKFTCQGPVDVNILAKCNPCLSNPCKNDGTCNSDPVDFYRCTCPYGFKGQDCDVPIHACI 958 

: |:||||:||:|||| 1 1 1 : 1 1 1 1 : i I! Ill: III III :||:|| III: II 
Db 901 hrfqckgpvdinivakcnaclsspcknngtctqdpvelyrcacpysykgkdctvpintci 960 



Qy 959 SNPCKHGGTCHLKEGEEDGFWCICADGFEGENCEVNVDDCEDNDCENNSTCVDGINNYTC 1018 

llhlllllM : :IM I I Nil: Ihl 1 1 1 1 1 II 1 1 1 1 : 1 1 1 1 1 1 1 1 1 I 
Db 961 qnpcqhggtchlsdshkdgfscscplgfegqrceinpddcedndcennatcvdginnyvc 1020 

Qy 1019 LCPPEYTGELCEEKLDFCAQDLNPCQHDSKCILTPKGFKCDCTPGYVGEHCDIDFDDCQD 1078 

:IM 111111:1 :l I :|l ll|::||| III IM III I: |: I III 
Db 1021 icppnytgelcdevidhcvpelnlcqheakcipldkgfscecvpgysgklcetdnddcva 1080 

Qy 1079 NKCKNGAHCTDAVNGYTCICPEGYSGLFCEFSPPMVLPRTSPCDNFDCQNGAQCIVRINE 1138 

:||::|| I I MUM ||:|:|| III Mil :||||| ::||||||||| I 
Db 1081 hkcrhgaqcvdtingytctcpqgfsgpfcehpppmvllqtspcdqyecqngaqcivvqqe 1140 

Qy 1139 PICQCLPGYQGEKCEKLVSVNFINKESYLQIPSAKVRPQTNITLQIATDEDSGILLYKGD 1198 

I 1:1 II: I :llll::lll: 1:11::: lllllll ll:ll:lll:|:llllllll 
Db 1141 ptcrcppgfagprceklitvnfvgkdsyvelasakvrpqanislqvatdkdngillykgd 1200 

Qy 1199 KDHIAVELYRGRVRASYDTGSHPASAIYSVETINDGNFHIVELLALDQSLSLSVDGGNPK 1258 

I :|:lll:l If II: I' I : :|||||:|[| II III: |:|:|:| II I II 
Db 1201 ndplalelyqghvrlvydslssppttvysvetvndgqfhsvelvtlnqtlnlvvdkgtpk 1260 

Qy 1259 IITNLSKQSTLNFDSPLYVGGMPGKSNVASLRQAPGQNGTSFHGCIRNLYINSELQDFQK 1318 

: I II : :{lll:||:| : :::IM : Mill : IIMHIM 
Db 1261 slgklqkqpavginsplylggiptstglsalrqgtdrplggfhgcihevrinnelqdfka 1320 

Qy 1319 VPMQT-GILPGCEPCHKKVCAHGTCQPSSQAGFTCECQEGWMGPLCDQRTNDPCLGNKCV 1377 

:| I: I: llh I' II II h : MM II Mill Mill:: 
Db 1321 lppqslgvspgcksc--tvckhglcrsvekdsvvcecrpgwtgplcdqeardpclghrch 1378 

Qy 1378 HGTCLPINAFSYSCKCLEGHGGVLCDEEEDLFNPCQAIKCKHGRCRLSGLGQPYCECSSG 1437 

II I: II Mil Ihll III : I I I I II Ihl M Ml I I 

Db 1379 hgkcvatgt-symckcaegyggdlcdnkndsanacsafkchhgqchisdqgepyclcqpg 1437 

Qy 1438 YTGDSCDREISCRGERIRDYYQKQQGYAACQTTKKVSRLECRGGCAGGQCCGPLRSKRRK 1497 

::|: I M | M. :|: ::|:|||:| I I MilM! I III I HUM 
Db 1438 fsgehcqqenpclgqvvrevirrqkgyascataskvpimecrggc-gpqccqptrskrrk 1496 

Qy 1498 YSFECTDGSSFVDEVEKWKCGCTRC 1523 

I hlllllllhlll: -III I 
Db 1497 yvfqctdgssfveeverhlecgclac 1522 



RESULT 10 

Y27143 

ID Y27143 standard; protein; 1496 AA. 
XX 

AC Y27143; 
XX 

DT 15-SEP-1999 (first entry) 
XX 

DE Human slit-3 mature protein (Seq ID No: 7 of JP11164690). 
XX 

KW Vertebrate-derived protein; slit protein; diagnosis; cancer; nerve; 

KW muscle; endocrine system. 

XX 

OS Homo sapiens. 
XX 

PN JP11164690-A. 
XX 

PD 22-JUN-1999. 
XX 

PF 05-DEC-1997; 97JP-0335435. 

PR 05-DEC-1997; 97JP-0335435. 
XX 

PA (ASAH ) ASAHI KASEI KOGYO KK. 



WPI; 1999-411830/35. 
N-PSDB; X89163. 

New vertebrate slit protein - useful for diagnosis and treatment of 
cancers in nerves, muscle and endocrine system 

Claim 6; Page 47-51; 102pp; Japanese. 

The invention relates to a vertebrate-derived protein containing an amino 
acid sequence shown in Y27137 and Y27139. The vertebrate-derived protein 
has at least 55 % homology to one of sequences shown in Y27141-Y27143, 
and has slit protein-like activity. The vertebrate slit proteins encoding 
nucleic acid sequences have at least 60% homology to nucleic acid 
sequences X89161-163. The vertebrate-derived proteins can be produced 
recombinantly by transforming host cells with expression vectors 
comprising the encoding nucleic acids. The proteins of the invention are 
for diagnosing and treating cancer of the nerves, muscle and/or endocrine 
system. 

Sequence 1496 AA; 



Query Match 68.6%; Score 5702.5; DB 20; Length 1496; 

Best Local Similarity 67.2%; Pred. No. 6.6e-301; 

Matches 1008; Conservative 211; Mismatches 268; Indels 13; Gaps 7; 

Qy 27 ACPAQCSCSGSTVDCHGLALRSVPRNIPRNTERLDLNGNNITRITKTDFAGLRHLRVLQL 86 

III :!:!!■:: MM! MM Mill: llllllll 1 1 1 1 1 :: 1 1 1 1 I 

Db 6 acptkctcsaasvdchglglravprgiprnaerldldrnnitritkmdfaglknlrvlhl 65 

Qy 87 MENKISTIERGAFQDLKELERLRLNRNHLQLFPELLFLGTAKLYRLDLSENQIQAIPRKA 146 

M:M 1 1 1 i ! 1 1 1 1 1 : 1 ; . M I ! I : I II: HIM I II IMIIMMI Mill 
Db 66 ednqvsviergafqdlkqlerlrlnknklqvlpellfqstpkltrldlsenqiqgiprka 125 

Qy 147 FRGAVDIKNLQLDYNQISCIEDGAFRALRDLEVLTLNNNNITRLSVASFNHMPKLRTFRL 206 

III 1 : 1 1 1 ! 1 1 I I i 1 1 1 M ! H I i 1 1 1! : 1 1 ! I ! 1 1 ! : 1 : I IMIIMMI II 
Db 126 frgitdvknlqldnnhisciedgafralrdleiltlnnnnisrilvtsfnhmpkirtlrl 185 

Qy 207 HSNNLYCDCHLAWLSDWLRKRPRVGLYTQCMGPSHLRGHNVAEVQKREFVCSDEEEGHQS 266 

1 1 1 : 1 ! 1 1 1 1 ': 1 1 1 1 1 1 M : i II M || I HI 1 1 1 : 1 1 1 : 1 : 1 1 
Db 186 hsnhlycdchlawlsdwlrqrrtvgqftlcmapvhlrgfnvadvqkkeyvcpaphs---- 241 

Qy 267 FMAPSCSV--LHCPAACTCSNNIVDCRGKGLTEIPTNLPETITEIRLEQNTIKVIPPGAF 324 

III: : II: I M 1 1 1 1 1 1 1 1 1 1 1 Ml MM I IMIIMMI II III 
Db 242 -eppscnansiscpspctcsnnivdcrgkglmeipanlpegiveirleqnsikaipagaf 300 

Qy 325 SPYKKLRRIDLSNNQISELAPDAFQGLRSLNSLVLYGNKITELPKSLFEGLFSLQLLLLN 384 

: MIMIIIM 1 1 1 1 : : ! 1 1 1 1 M I : M M 1 1 1 i 1 1 1 1 1 : I MMI llllllll 
Db 301 tqykklkridisknqisdiapdafqglksltslvlygnkiteiakglfdglvslqlllln 360 

Qy 385 ANKINCLRVDAFQDLHNLNLLSLYDNKLQTIAKGTFSPLRAIQTMHLAQNPFICDCHLKW 444 

MIIIMM III 1 1 M IS It I M 1 1 1 1 : 1 1 1 : 1 1 :: 1 1 1 : II 1 1 1 1 1 : 1 1 M 1 1 1 
Db 361 ankinclrvhtfqdlqnlnllslydnklqtiskglfaplqsiqtlhlaqnpfvcdchlkw 420 

Qy 445 LADYLHTNPIETSGARCTSPRRLANKRIGQIKSKKFRCSGTEDYRSKLSGDCFADLACPE 504 

Mill ! M' 1 1 1 1 1 1 : ! 1 1 1 1 1 1 1 1 1 IIIMIMMMIMM I Ml II III 
Db 421 ladylqdnpietsgarcssprrlankrisqikskkfrcsgsedyrsrfssecfmdlvcpe 480 

Qy 505 KCRCEGTTVDCSNQKLNKIPEHIPQYTAELRLNNNEFTVLEATGIFKKLPQLRKINFSNN 564 
MMI MIIIMI Ml IMM : M 1 1 : II MMMMIMM Mill III 

Db 481 kcrcegtivdcsnqklvripshlpeyvtdlrlndnevsvleatgifkklpnlrkinlsnn 540 

Qy 565 KITDIEEGAFEGASGVNEILLTSNRLENVQHKMFKGLESLKTLMLRSNRITCVGNDSFIG 624 

II :: 1 1 1 1 : 1 1 : I |::|| IMI I :MMI MIIIMM I II MM I 
Db 541 kikevregafdgaasvqelmltgnqletvhgrvfrglsglktlmlrsnligcvsndtfag 600 

Qy 625 LSSVRLLSLYDNQITTVAPGAFDTLHSLSTLNLLANPFNCNCYLAWLGEWLRKKRIVTGN 684 

MMMMIMMM: MM II i 1 1 1 : 1 1 1 : 1 i M 1 1 : 1 1 1 1 1 : 1 i 1 1 : 1 1 1 : i I 
Db 601 lssvrllslydnrittitpgafttlvslstinllsnpfncnchlawlgkwlrkrrivsgn 660 

Qy 685 PRCQKPYFLKEIPIQDVAIQDFTCDDGNDDNSCSPLSRCPTECTCLDTWRCSNKGLKVL 744 
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IIIIUIMMIIIIIIIMI |||:::|| Ill : 1 1 1 : : 1 1 1 1 1 1 1 1 1 1 : | 
Db 661 prcqkpfflkeipiqdvaiqdftc-dgneesscqlsprcpeqctcmetvvrcsnkglral 719 

Qy 745 PKGIPRDVTELYLDGNQFTLVPKELSNYKHLTLIDLSNNRISTLSNQSFSNMTQLLTLIL 804 

|:|:|:||IIIII:M I Mil :|||||||||| II |:| :||||: I III 
Db 720 prgmpkdvtelylegnhltavprelsalrhltlidlsnnsismltnytfsnmshlstlil 779 

Qy 805 SYNRLRCIPPRTFDGLKSLRLLSLHGNDISWPEGAFNDLSALSHLAIGANPLYCDCNMQ 864 

lllllllll :M:III:|:IIIIII 1 1 1 1 : 1 1 1 1 :: 1 1 1 1 1 : 1 |||:||h:: 
Db 780 synrlrcipvhafnglrslrvltlhgndissvpegsfndltslshlalgtnplhcdcslr 839 

Qy 865 WLSDWVKSEYKEPGIARCAGPGEMADRLLLTTPSKKFTCQGPVDVNILAKCNPCLSNPCK 924 

111:111: lllllllll: I lll:||||||: :| 1 : 1 1 1 1 : 1 1 : 1 1 1 1 11:11! 
Db 840 wlsewvkagykepgiarcsspepmadrlllttpthrfqckgpvdinivakcnaclsspck 899 

Qy 925 NDGTCNSDPVDFYRCTCPYGFKGQDCDVPIHACISNPCKHGGTCHLKEGEEDGFWCICAD 984 

1:111 III: III III :||:N III: II llhlllllll : :||| I t 
Db 900 nngtctqdpvelyrcacpysykgkdctvpintciqnpcqhggtchlsdshkdgfscscpl 959 

Qy 985 GFEGENCEVNVDDCEDNDCENNSTCVDGINNYTCLCPPEYTGELCEEKLDFCAQDLNPCQ 1044 
llll: II: IIIIIIIMMIIIIIII 1:11 Mill: :| I :|| II 

Db 960 gfegqrceinpddcedndcennatcvdginnyvcicppnytgelcdevldhcvpelnlcq 1019 

Qy 1045 HDSKCILTPKGFKCDCTPGYVGEHCDIDFDDCQDNKCKNGAHCTDAVNGYTCICPEGYSG 1104 

|::IH III hi III h I: I III :lh:ll I I :|llll 1 1 : 1 : 1 1 
Db 1020 heakcipldkgfscecvpgysgklcetdnddcvahkcrhgaqcvdtingytctcpqgfsg 1079 

Qy 1105 LFCEFSPPMVLPRTSPCDNFDCQNGAQCIVRINEPICQCLPGYQGEKCEKLVSVNFINKE 1164 

III INI :|llll -lllllllll II hi II: I :lllh:lll: |: 
Db 1080 pfcehpppmvllqtspcdqyecqngaqcivvqqeptcrcppgfagprceklitvnfvgkd 1139 

Qy 1165 SYLQIPSAKVRPQTNITLQIATDEDSGILLYKGDKDHIAVELYRGRVRASYDTGSHPASA 1224 

II::: lllllll J 1 : 1 1 : 1 1 1 : 1 : 1 1 1 1 1 1 1 1 I :|:|||:| II l|: I I : 
Db 1140 syvelasakvrpqanislqvatdkdngillykgdndplalelyqghvrlvydslsspptt 1199 

Qy 1225 IYSVETINDGNFHIVELLALDQSLSLSVDGGNPKIITNLSKQSTLNFDSPLYVGGMPGKS 1284 

Mllhlll II III: |:|:|:| Mil: Ml | :|lll:lhl : 
Db 1200 vysvetvndgqfhsvelvtlnqtlnlvvdkgtpkslgklqkqpavginsplylggiptst 1259 

Qy 1285 NVASLRQAPGQNGTSFHGCIRNLYINSELQDFQKVPMQT-GILPGCEPCHKKVCAHGTCQ 1343 

:::IH > Mill : ll:lllll: :l I: I: III: I II II I: 
Db 1260 glsalrqgtdrplggfhgcihevrinnelqdfkalppqslgvspgcksc--tvckhglcr 1317 

Qy 1344 PSSQAGFTCECQEGWMGPLCDQRTNDPCLGNKCVHGTCLPINAFSYSCKCLEGHGGVLCD 1403 
: III: II llllll INI::! II h II III IMI III 

Db 1318 svekdsvvcecrpgwtgplcdqeardpclghrchhgkcvatgt-symckcaegyggdlcd 1376 

Qy 1404 EEEDLFNPCQAIKCKHGKCRLSGLGQPYCECSSGYTGDSCDREISCRGERIRDYYQKQQG 1463 
: I I I I II 11:1 :| hill I |::|: I :| I I: :|: ::|:| 
Db 1377 nkndsanacsafkchhgqchisdqgepyclcqpgfsgehcqqenpclgqvvrevirrqkg 1436 

Qy 1464 YAACQTTKRVSRLECRGGCAGGQCCGPLRSKRRKYSFECTDGSSFVDEVEKWKCGCTRC 1523 

M: I || :IHII! I III | IMIIII hlllllllhlll: ::||| I 
Db 1437 yascataskvpimecrggc-gpqccqptrskrrkyvfqctdgssfveeverhlecgclac 1495 



RESULT 11 
Y04136 

ID Y04136 standard; Protein; 1496 AA. 
XX 

AC Y04136; 
XX 

DT 15-JUN-1999 (first entry) 
XX 

Human slit 3 mature protein. 



Human; slit-like protein; slit 3; slit 1; prevention; treatment; 
disease; spinal cord; thyroid gland; ovary; prostate; renal gland; 
small intestine; heart; trachea; thymus; lymph node; 
muscular system; colon. 

Homo sapiens. 



JP11075846-A. 
23-MAR-1999. 

02-SEP-1997; 97JP-0236994. 

02-SEP-1997; 97JP-0236994. 

(ASAH ) ASAHI KASEI KOGYO KK. 

WPI; 1999-257695/22. 
N-PSDB; X19946. 

New slit-like polypeptide - useful for prevention and treatment of 
diseases in spinal cord, thyroid gland, ovary, prostate, renal 
gland, small intestine, heart, trachea, thymus, lymph node, muscular 
system and colon 

Claim 1; Page 17-21; 48pp; Japanese. 

The present sequence represents a human slit-like protein designated 
slit 3. Slit-like proteins can be used for the prevention and the 
treatment of diseases in spinal cord, thyroid gland, ovary, prostate, 
renal gland, small intestine, heart, trachea, thymus, lymph node, 
muscular system and colon. 

Sequence 1496 AA; 



Query Match 68.6%; Score 5702.5; DB 20; Length 1496; 

Best Local Similarity 67.2%; Pred. No. 6.6e-301; 

Matches 1008; Conservative 211; Mismatches 268; Indels 13; Gaps 7; 
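
The summary figures above appear to be related by simple ratios. The following Python sketch is an illustration only. It rests on two assumptions inferred from the numbers in this report, not taken from the search program's documentation: that Best Local Similarity is Matches divided by the total number of alignment columns, and that Query Match is the score divided by the score of the query aligned with itself (8316, the score listed with a 100.0 query match in the summary table). The function names are hypothetical.

```python
def best_local_similarity(matches, conservative, mismatches, indels):
    """Percent of alignment columns that are exact matches (assumed definition)."""
    columns = matches + conservative + mismatches + indels
    return 100.0 * matches / columns


def query_match(score, self_score):
    """Score as a percentage of the query's self-alignment score (assumed definition)."""
    return 100.0 * score / self_score


if __name__ == "__main__":
    # Figures reported above for this result.
    print(round(best_local_similarity(1008, 211, 268, 13), 1))
    print(round(query_match(5702.5, 8316), 1))
```

With the figures reported for this result, the two functions give 67.2 and 68.6, in agreement with the percentages printed above.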

Qy 27 ACPAQCSCSGSTVDCHGLALRSVPRNIPRNTERLDLNGNNITRITKTDFAGLRHLRVLQL 86 

III :'l:ll ::llllll MM llll lllll: 1 1 1 1 1 1 1 1 |||||::|||| I 
Db 6 acptkctcsaasvdchglglravprgiprnaerldldrnnitritkmdfaglknlrvlhl 65 

Qy 87 MENKISTIERGAFQDLKELERLRLNRNHLQLFPELLFLGTAKLYRLDLSENQIQAIPRKA 146 

:|::| 1 1 H II 1 1 1 1 : 1 1 II 1 1 1 : 1 M lllll I II llllllllll lllll 
Db 66 ednqvsviergafqdlkqlerlrlnknklqvlpellfqstpkltrldlsenqiqgiprka 125 



Qy 147 FRGAVDIKNLQLDYNQISCIEDGAFRALRDLEVLTLNNNNITRLSVASFNHMPKLRTFRL 206 

III MINN I llllllllllllllll:llllllll:|: I II 1 1 ! II : i I II 

Db 126 frgitdvknlqldnnhisciedgafralrdleiltlnnnnisrilvtsfnhmpkirtlrl 185 

Qy 207 HSNNLYCDCHLAWLSDWLRKRPRVGLYTQCMGPSHLRGHNVAEVQKREFVCSDEEEGHQS 266 

ll|:|||||irilllllll:| II :l II I llll I II : I M : 1 : 1 1 

Db 186 hsnhlycdchlawlsdwlrqrrtvgqftlcmapvhlrgfnvadvqkkeyvcpaphs---- 241 

Qy 267 FMAPSCSV--LHCPAACTCSNNIVDCRGKGLTEIPTNLPETITEIRLEQNTIKVIPPGAF 324 

III: : II: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 III llll I MINIM II III 

Db 242 -eppscnansiscpspctcsnnivdcrgkglmeipanlpegiveirleqnsikaipagaf 300 

Qy 325 SPYKKLRRIDLSNNQISELAPDAFQGLRSLNSLVLYGNKITELPKSLFEGLFSLQLLLLN 384 

: NIMH'':! 1 1 1 1 :: 1 1 1 1 1 1 1 1 : 1 1 lllllllllll: I M : 1 1 llllllll 

Db 301 tqykklkridisknqisdiapdafqglksltslvlygnkiteiakglfdglvslqlllln 360 



Qy 385 ANKINCLRVDAFQDLHNLNLLSLYDNKLQTIAKGTFSPLRAIQTMHLAQNPFICDCHLKW 444 

IIIIIMMMI llllllllllllllhll l:||::|||:!llll!|:|!lll!l 
Db 361 ankinclrvntfqdlqnlnllslydnklqtiskglfaplqsiqtlhlaqnpfvcdchlkv 420 

Qy 445 LADYLHINPIETSGARCTSPRRLANKRIGQIKSKKFRCSGTEDYRSKLSGDCFADLACPE 504 

lllll IIIIMIIhlllllllMI llllllllllhlllll: I :H II III 
Db 421 ladylqdnpietsgarcssprrlankrisqikskkfrcsgsedyrsrfssecfmdlvcpe 480 

Qy 505 RCRCEGTTVDCSNQKLNKIPEHIPQYTAELRLNNNEFTVLEATGIFRKLPQLRKINFSNN 564 

■ ' ■ hum mum m hhi hiihii mimmm urn in 

Db 481 kcrcegtivdcsnqklvripshlpeyvtdlrlndnevsvleatgifkklpnlrkinlsnn 540 

Qy 565 K I TDIEEG AFSGASG VNE ILLT SNRLENVQ HKMFKGLESLKT LMLRSNRITCVGNDSF IG 624 

II :: 1 1 1 1 :J 1 : I Mil hll I ::hll MINI! I I ||:| I 
Db 541 kikevregafdgaasvqelmltgnqletvhgrvfrglsglktlmlrsnligcvsndtfag 600 
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t 




625 LSSVRLLSLYDNQITTVAPGAFDTLHSLSTLNLLANPFNCNCYLAWLGEWLRKKRIVTGN 684 

IIIIMIIIIhlll: Mil || Illl:|||:|||||||:|||||:||||:|||:|| 

601 lssvrUslydnrittitpgafttlvslstinllsnpfncnchlawlgkwlrkrrivsgn 660 

685 PRCQKPYFLKEIPIQDVAIQDFTCDDGNDDNSCSPLSRCPTECTCLDTWRCSNKGLKVL 744 

NIIMNMNNIMN |||:::|| III : 1 1 1 :: 1 1 1 1 1 1 1 1 1 1 : 

661 prcqkpfflkeipiqdvaiqdftc-dgneesscqlsprcpeqctcmetvvrcsnkglral 719 

745 PKGIPRDVTELYLDGNQFTLVPKELSNYKHLTLIDLSNNRISTLSNQSFSNMTQLLILIL 804 

INIMIMMMM I ||:||| :|||||||||| II |:| :||||: I II 

720 prgrapkdvtelylegnhltavprelsalrhltlidlsnnsismltnytfsniiishlstlil 779 

805 SYNRLRCIPPRTFDGLKSLRLLSLHGNDISWPEGAFNDLSALSHLAIGANPLYCDCNMQ 864 

lllllllll l:ll:lll:|:lllllll 1 1 1 1 : 1 1 1 1 :: 1 1 1 1 1 : 1 |||:|||::: 

780 synrlrcipvhafnglrslrvltlhgndissvpegsfndltslshlalgtnplhcdcslr 839 

865 WLSDWVKSEYKEPGIARCAGPGEMADKLLLTTPSKKFTCQGPVDVNILAKCNPCLSNPCK 924 

llhlll: lllllllll: I Ihllllll: :| 1 : 1 1 1 1 : 1 1 : 1 1 1 1 llhlll 

840 wlsewvkagykepgiarcsspepmadrlllttpthrfqckgpvdinivakcnaclsspck 899 

925 NDGTCNSDPVDFYRCTCPYGFKGQDCDVPIHACISNPCRHGGTCHLKEGEEDGFWCICAD 984 

1:111 III: III III :||:|| ll|: I |||:||||||| : :||| I I 

900 nngtctqdpvelyrcacpysykgkdctvpintciqnpcqhggtchlsdshkdgfscscpl 959 

985 GFEGENCEVNVDDCEDNDCENNSTCVDGINNYTCLCPPEYTGELCEEKLDFCAODLNPCQ 1044 

UN: Ihl Mllllllllhlllllllll hill MIMhl :| I :|| II 

960 gfegqrceinpddcedndcennatcvdginnyvcicppnytgelcdevidhcvpelnlcq 1019 

1045 HDSKCILTPKGFKCDCTPGYVGEHCDIDFDDCQDNKCKNGABCTDAVNGYTCICPEGYSG 1104 

|::NI III |:| III |: |: I II :||::|| I I :||||| ||:|:|| 

1020 heakcipldkgfscecvpgysgklcetdnddcvahkcrhgaqcvdtingytctcpqgfsg 1079 

1105 LFCEFSPPMVLPRTSPCDNFDCQNGAQCIVRINEPICQCLPGYQGEKCEKLVSVNFINKE 1164 

III Hill :lllll -lllllllll II 1:1 II: I :||||::|||: h 

1080 pfcehpppmvllqtspcdqyecqngaqcivvqqeptcrcppgfagprceklitvnfvgkd 1139 

1165 SYLQIPSAKVRPQTNITLQIATDEDSGILLYKGDKDHIAVELYRGRVRASYDTGSHPASA 1224 

II::: lllllll 1 1 : 1 1 : 1 1 1 : 1 : 1 1 1 1 1 1 II I :|:lll:l II II: II : 

1140 syvelasakvrpqaDlslqvatdkdngillykgdndplalelyqghvrlvydslsspptt 1199 

1225 IYSVETINDGNFHIVELLALDQSLSLSVDGGNPKIITNLSKQSTLNFDSPLYVGGMPGKS 1284 

:IMI|:MI II III: |:|:|:| II I II : Ml : :||||:||:| : 

1200 vysvetvndgqfhsvelvtlnqtlnlwdkgtpkslgklqkqpavglnsplylggiptst 1259 

1285 NVASLRQAPGQNGTSFHGCIRNLYINSELQDFQKVPMQT-GILPGCEPCHKKVCAHGTCO 1343 
• :::|M : lllll : Ihllllh :| h h Hh.l II II I: 

1260 glsalrqgtdrp^gfhgcihevrlnnelqdfkalppqslgvspgckac-tvckhglcr 1317 

1344 PSSQAGFTCECQEGWMGPLCDQRTNDPCLGNKCVHGTCLPINAFSYSCRCLEGHGGVLCD 1403 

: III: II Mill UNI:: II |: I III ||:|| III 

1318 svekdswcecrpgvtgplcdqeardpclghrchhgkcvatgt-synickcaegyggdlcd 1376 

1404 EEEDLFNPCQAIKCKHGKCRLSGLGQPYCECSSGYTGDSCDREISCRGERIRDYYQKQQG 1463 

: I lllll Ihl :l 1:11 I |::|: I :| I |: :|: ::|:| 

1377 nkndsanacsaf kchhgqchisdqgepyclcqpgf sgehcqqenpclgqvvrevirrqkg 1436 

1464 YAACQTTKKVSRLECRGGCAGGQCCGPLRSKRRKYSFECTDGSSFVDEVEKWKCGCTRC 1523 

IN I II :llllll I III I lllllll |:||||||||:|||: ::||| I 

1437 yascataskvpimecrggc-gpqccqptrskrrkyvfqctdgssfveeverhlecgclac 1495 



RESULT 12 
Y14142 

ID Y14142 standard; Protein; 1523 AA. 
XX 

AC Y14142; 
XX 

DT 22-J0W999 (first entry) 
XX 

DE Human Slit protein sequence. 
XX 

KW Slit; human; 
XX 



WO9923219-A1.
14-MAY-1999.
29-OCT-1998; 98WO-US22845.

13-AUG-1998; 98US-0096420.
31-OCT-1997; 97US-0063946.



(OSIR-) OSIRIS THERAPEUTICS INC. 
Connolly T, Rajput B; 



WPI; 1999-337485/28. 
N-PSDB; X61026.



New human slit polypeptide and polynucleotide 
Claim 12; Fig 2; 90pp; English. 

This sequence is the human slit polypeptide of the invention. 
The slit protein is useful for patients in need of the slit polypeptide, 
and its antagonist is useful for patients in need of inhibition of 
the slit polypeptide. Diagnosis of a disease or susceptibility to a 
disease is achieved by determining the presence of a mutation in the slit 
coding sequence; or determining the presence of slit by detecting 
expression levels of the slit protein. Anti-slit antibodies are useful 
for diagnosis of conditions associated with levels of slit protein. 

Sequence 1523 AA; 



Query Match 67.5%; Score 5611.5; DB 20; Length 1523;

Best Local Similarity 65.7%; Pred. No. 5.7e-296;

Matches 1002; Conservative 216; Mismatches 293; Indels 15; Gaps 

3 GVGWQMLS-LSLGLVLA-ILNKVAPQACPAQCSCSGSTVDCHGLALRSVPRNIPRNTERL 60 

III : : l;l I II :|: III Nil NIMH IhMI INI Ml 
7 gvgaavrarlalalalasvlsgppavacptkctcsaasvdchglglravprgiprnaerl 66 

61 DLNGNNITRITKTDFAGLRBLRVLQLMENKISTIERGAFQDLKELERLRLNRNHLQLFPE 120 

II: IIIMII 1 1 1 1 1 1 1 1 1 I :|::| 1 1 1 1 1 1 1 1 1 1 : 1 1 1 1 ! 1 1 : ||: 
67 dldrnnitritkmdfaglknlrvlhlednqvsviergafqdlkqlerlrlnknklqvlpe 126 



121 LLFLGTAKLYRLDLSENQIQAIPRKAFRGAVDIKNLQLDYNQISCIEDGAFRALRDLEVL 180 

III I II lllllllll lllllll INIIII I IIIIIIIIIIIMIIhl 
127 llfqstpkltrldlsenqiqgiprkafrgitdvknlqldnnhisciedgafralrdleil 186 

181 TLNNNNITRLSVASFNHMPKLRTFRLHSNNLYCDCHLAWLSDWLRKRPRVGLYTQCMGPS 240 

1111111:1: I MIIIIM I II II : 1 1 1 II 1 1 II I II II I : I II :| II I 
187 tlnnnnisrilvtsfnhnipkirtlrlhsnhlycdchlawlsdwlrqrrtvgqftlcmapv 246 

241 HLRGHNVAEVQKREFVCSDEEEGHQSFMAPSCSV--LHCPAACTCSNNIVDCRGKGLTEI 298 

MM ll:IM:|:| |||: : ||: 111111111111111 I 

247 hlrgfnvadvqkkeyvcpaphs eppscnansiscpspctcsnnivdcrgkglmei 301 

299 PTNLPETITEIRLEQNTIKVIPPGAFSPYKKLRRIDLSNNQISELAPDAFQGLRSLNSLV 358 

I MM I 111111:1 || |||: MINIMI IIM::IMMMI:M III 
302 panlpegiveirleqnsikaipagaftqykklkridisknqisdiapdafqglksltslv 361 

359 LYGNKITELPKSLFEGLFSLQLLLLNANKINCLRVDAFQDLHNLNLLSLYDNKLQTIAKG 418 

MIIIIM: I 11:11 IIIIIIIIIIIIIIM: MM 1 1 1 1 II I II I II 1 1 1 : 1 1 
362 lygnkiteiakglfdglvslqllllnankinclrvntfqdlqnlnllslydnklqtiskg 421 

419 TFSPLRAIQTMHLAQNPFICDCHLKWLADYLHTNPIETSGARCTSPRRLANKRIGQIKSK 478 

|:M::IM:IIIMM:MIMMMIM 1 1 1 1 II II 1 1 : 1 II 1 1 1 1 1 II lllll 
422 lfaplqsiqt'lhlaqnpfvcdchlkwladylqdnpietsgarcssprrlankrisqiksk 481 

479 KFRCSGTEDYRSKLSGDCFADLACPEKCRCEGTTVDCSNQKLNKIPEHIPQYTAELRLNN 538 

MINIUM: I :M II MINN lllllll Ml MIM MUM 
482 kfrcsgsedyrsrfssecfmdlvcpekcrcegtivdcsnqklvripshlpeyvtdlrlnd 541 
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Qy 539 NEFTVLEATGIFKKLPQLRKINFSNNKITDIEEGAFEGASGVNEILLTSNRLENVQHKMF 598 

II :IHIIIIIIIII lllll MM :: ||||:||: I |::|| Ml I ::| 
Db ,542 nevsvleatgifkklpnlrkinlsnnkikevregafdgaasvqelmltgnqletvhgrvf 601 

Qy 599 kgleslktlmlSSnritcvgndsfiglssvrllslydnqittvapgafdtlhslstlnll 658 

M lllllllll |:|| MM I II Ml 1 III II I :M ; : MM N I 
Db 602 rglsglktlmlrsnliscvsndtfaglssvrllslydnrittitpgafttlvspvhhkpp 661 

Qy 659 ANPFNCNCYLAWLGEWLRKRRIVTGNPRCQKPYFLKEIPIQDVAIQDFTCDDGNDDNSCS 718 

I ! 1 1 1 : 1 1 1 1 : i 1 1 : 1 ! I ! I M i : i I ! I ! I ! M M 1 1 : 1 1 |||:::|| 

Db 662 vqplqlqlplawlgkwlrkrrivsgnprcqkpfflkeipiqdvaiqnftc-dgneesscq 720 

Qy 719 PLSRCPTECTCLDTWRCSNKGLKVLPKGIPRDVTELYLDGNQFTLVPKELSNYKHLTLI 778 

III MIMMIMIIIIII: ll:|:|:|||||||:|| | 1 1 : 1 1 1 : 1 1 1 1 1 
Db 721 lsprcpeqctcmetvvrcsnkglralprgmpkdvtelylegnhltavprelsalrhltli 780 

Qy 779 DLSNNRISTLSNQSFSNMTQLLTLILSYNRLRCIPPRTFDGLKSLR1LSLHGNDISVVPE 838 
^ lllll II hi MUM I IMIIMIIMM hlhllhhhlhll III 
B 781 dlsnnsisraltnytfsnmshlstlilsynrlrcipvhafnglrslrvltlhgndissvpe 840 

% 839 GAFNDLSALSHLAIGANPLYCDCNMQWLSDWVKSEYKEPGIARCAGPGEMADKLLLTTPS 898 
hllll::|||lhl llhlll:::llhllh lllllllll: I |||:||||||: 
Db 841 gsfndltslshlalgtnplhcdcslrwlsewkagykepgiarcsspepmadrlllttpt 900 

Qy 899 KKFTCQGPVDVNIIARCNPCLSNPCKNDGTCNSDPVDFYRCTCPYGFKGQDCDVPIHACI 958 

M IMIIIMIMIII J 1 1 : U 1 1 : 1 1 1 III: III III MIM III: II 
Db 901 hrfqckgpvdinivakcnaclsspcknngtctqdpvelyrcacpysykgkdctvpintci 960 

Qy 959 SNPCKHGGTCHLKEGEEDGFWCICADGFEGENCEVNVDDCEDNDCENNSTCVDGINNYTC 1018 

IMMMIMI : MM I I MIM MM 1 1 II 1 1 M M M II II 1 1 1 1 I 
Db 961 qnpcqhggtchlsdshkdgfscscplgfegqrceinpddcedndcennatcvdginnyvc 1020 

Qy 1019 LCPPEYTGELCEEKLDFCAQDLNPCQHDSKCILTPKGFKCDCTPGYVGEHCDIDFDDCQD 1078 

HII 1 1 1 i I : M I Ml 1 1 1 : : I M III IM III |: |: | II 
Db 1021 icppnytgelcdevidhcvpelnlcqheakcipldkgfscecvpgysgklcetdnddcva 1080 

Qy 1079 NKCKNGAHCTDAVNGYTCICPEGYSGLFCEFSPPMVLPRTSPCDNFDCQNGAQCIVRINE 1138 

Mi:M I I MMM 11:1:11 III MM MIM! ::||||||||| I 
Db 1081 hkcrhgaqcvdtingytctcpqgfsgpfcehpppmvllqtspcdqyecqngaqcivvqqe 1140 

Qy 1139 PICQCLPGYQGE8CEKLVSVNFINKESYLQIPSAKVRPQTNIILQIATDEDSGILLYKGD 1198 

I hi Ih I JMMMM: |:||::: MMMI 1 1 : 1 1 : 1 1 1 : 1 : 1 1 1 1 1 1 1 1 

Db 1141 ptcrcppgfagpleklitvnfvgkdsyvelasakvrpqanislqvatdkdngillykgd 1200 

Qy 1199 KDHIAVELYRGRVMSYDTGSHPASAIYSVETIOTGNFHIVELLALDQSLSLSVDGGNPK 1258 
_ I :l«ll|:| I'l II: I I : :lllll:lll II II:: |:|:|:| II I II 

Mj 1201 ndplajielyqghvrlvydsvssppttvysvetvndgqfhsvevvtlnqtlnlwdkgtpk 1260 

1259 IITNLsKQSTLNFDSPLYVGGMPGKSNVASLRQAPGQNGTSFHGCIRNLnNSELQDFQK 1318 
: fll : : 1 1 1 1 : M : I : :::||| : Mill : IMIMI: 
Db 1261 slgkfqkqpavginsplylggiptstglsalrqgtdrplggfhgcihevrinnelqdfka 1320 

Qy 1319 VPMQT-GILPGCEPCHKKVCAHGTCQPSSQAGFTCECQEGWMGPLCDQRTNDPCLGNKCV 1377 

M I: I: III: I II II |: : MM I Mill IMIMM 
Db 1321 lppqslgvspgcksc-tvckhglcrsvekdswcecrpgwtgplcdqeardpclghrch 1378 

Qy 1378 HGTCLPINAFSYSCKCIEGHGGVLCDEEEDLFNPCQAIKCKHGKCRLSGLGQPYCECSSG 1437 

II M || III Ml III : I I | Ml MM :| |:||| I 

Db 1379 hgkcvatgt-symckcaegyggdlcdnkndsanacsafkchhgqchisdqgepyclcqpg 1437 

Qy 1438 YTGDSCDREISCRGERIRDYYQKQQGYAACQTTKKVSRLECRGGCAGGQCCGPLRSKRRK 1497 

: = h I =1 IM :|: ::|:|||:| | I MMIM I III I MMM 
Db 1438 fsgehcqqenpclgqvvrevirrqkgyascataskvpimecrggc-gpqccqptrskrrk 1496 

Qy 1498 YSFECTDGSSFVDEVEKWKCGCTRC 1523 

I hlllllllhllh ::||| I 
Db 1497 yvfqctdgssfveeverhlecgjclac 1522 



RESULT 13 
W46966 

ID W46966 standard; Protein; 1534 AA. 



W46966; 

06-JUL-1998 (first entry) 

Amino acid sequence of a human slit-like polypeptide. 

Slit-like protein; human; diagnosis; treatment; brain-specific disease; 
cancer; antibody. 

Homo sapiens.



Key             Location/Qualifiers
Peptide         1..26
                /note= "signal peptide"
Protein         27..1534
                /note= "mature protein"



JP10087699-A.
07-APR-1998.

15-JUL-1997; 97JP-0205351.

16-JUL-1996; 96JP-0186219.

(ASAH ) ASAHI KASEI KOGYO KK.

WPI; 1998-267127/24.
N-PSDB; V16978.



Human Slit-like protein - useful for diagnosis and treatment of 
brain-specific diseases and cancers 

Disclosure; Pages 31-35; 45pp; Japanese. 

The present sequence represents a novel human slit-like protein (the 
mature protein is claimed in Claim 1). The slit-like polypeptide is 
useful for diagnosis and treatment of brain-specific diseases and 
cancers. Antibodies directed against the protein, or its fragments 
can also be used for diagnosing cancer. 

Sequence 1534 AA; 



Query Match 67.3%; Score 5597; DB 19; Length 1534;

Best Local Similarity 65.3%; Pred. No. 3.5e-295;

Matches 992; Conservative 228; Mismatches 283; Indels 16; Gaps

Qy 15 LVLAILNKVAPQACPAQCSCSGSTVDCHGLALRSVPRNIPRNTERLDLNGNNITRITKTD 74 

IM ; Mil MM|:||||| I ::: 1 : 1 1 1 1 1 1 1 1 1 : 1 1 1 1 1 1 1 1 1 | | 
Db 21 llwaaawrlgasacpalctctgttvdchgtglqaipkniprnterlelngnnitrihknd 80 

Qy 75 FAGLRHLRVLQLMENKISTIERGAFQDLKELERLRLNRNHLQLFPELLFLGTAKLYRLDL 134 

llll: 1 1 1 1 1 1 1 1 1 : 1 MIMI 1 : 1 1 1 1 1 i I ; 1 1 1 I : lllll I llll 
Db 81 faglkqlrvlqlmenqigavergafddmkelerlrlnrnqlhmlpellfqnnqalsrldl 140 



Qy 135 SENQIQAIPRKAFRGAVDIKNLQLDYNQISCIEDGAFRALRDLEVLTLNNNNITRLSVAS 194
III llllllllllll IMIMM 1 1 1 1 1 1 1 : 1 i 1 1 1 1 1 llllllllllll : |:| 
Db 141 senaiqaiprkafrgatdlknlrldknqiscieegafralrglevltlnnnnittipvss 200 

Qy 195 FNHMPKLRTFRLHSNNLYCDCHLAWLSDWLRKRPRVGLYTQCMGPSHLRGHPAEVQRRE 254 

I III Ml I II MM hi :l Mil M M M I : II MIMII IM III MMM! I 
Db 201 fnhmpklrtfrlhsnhlfcdchlawlsqwlrqrptiglftqcsgpaslrglnvaevqkse 260 



255 FVCSDEEEGHQSFMAPSCSVL-HCPAACTCSNNIVDCRGKGLTEIPTNLPETITEIRLE 312 

MM : I : |:|:: III lllll 1111111111 II IMIhllllll 
261 fscsgqgeag,r;--vptctlssgscpamctcsngivdcrgkgltaipanlpetmteirle 317 



Qy 313 QNTIKVIPPGAFSPYKKLRRIDLSNNQISELAPDAFQGLRSLNSLVLYGNKITELPKSLF 372 

I II llllllllhinnilllllhhllllllllllllllllinilhlh :| 
Db 318 lngiksippgafspyrklrridlsnnqiaeiapdafqglrslnslvlygnkitdlprgvf 377 
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• Qy 373 EGLFSLQLLLLNANKINCLRVDAFQDLHNLNLLSLyDNKLQTIMGTFSPLRAIQIMHLA 432 
ll::|IMIII|llll|:| Mill! 1 1 : 1 1 1 1 1 1 1 1 : 1 :: 1 1 1 1 1 : MINIM 
Db 378 gglytlqllllnankincirpdafqdlqnlsllslydnkiqslakgtftslraiqtlhla 437 

Qy 433 QNPFICElCHLKWLADYLHTNPIETSGARCTSPRRIjANRRIGQIRSKRFRCS G 484 

llllllll:'lll|l|:| IMIIIIIIII MIIIIIIIIIIIIHIIIII | 
Db 438 qnpficdcnlkwladflrtnpietsgarcasprrlankrigqikskkfrcsakeqyfipg 497 

Qy 485 TEDYRSKLSGDCFADLACPEKCRCEGTTVDCSNQKLNKIPEHIPQyTAELRLNNNEFTVL 544 

INI :l: :| :|: II Mil |:||: I Mil III IIIIIMII ::| 
Db 498 tedy--qlnsecnsdvvcphkcrceanvvecsslkltkiperipqstaelrlnnneisil 555 

' Qy 545 EATGIFKKliPQLRKINFSNNKITDIEEGAFEGASGVNEILLTSNRLENVQHKMFKGLESL 604 
lllhllli; 1:111 lll|:::l|:||||||: |:|: ||:|:||::: ||:||: | 
Db 556 eatgmfkkjthlkkinlsnnkvseiedgafegaasvselhltanqlesirsgrafrgldgl 615 

Qy 605 KTLMLRSNltlTCVGNDSFIGLSSVRLLSLYDNQITTVAPGAFDTLHSLSTLNLIiANPFNC 664 

:|lll|:||;|:|°|||| | : 1 1 1 II 1 1 1 1 1 M 1 1 : 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 
Db 616 rtlmlrnniiscihndsftglrnvrllslydnqittvspgafdtlqslstlnllanpfnc 675 

fc' 665 NCYUWLGEWLRRRRIVTGNPRCQRPYFLKEIPIQDVAIQDFTCDDGNDDNSCSPLSRCP 724 
W M lllll'lllhilllllllll I ll::il:||ll I |::| | | :|| 

. Db 676 ncqlawlggvlrkrkivtgnprcqnpdflrqiplqdvafpdfrceegqeeggclprpqcp 735 

Qy 725 TECTCUJTVVRCSNKGLKVLPKGIPRDVTELYLDGNQFTLVPKELSNYKHLTIjIDLSNNR 784 

II Mlllllllll I: lllll|::||||lllllllllll :|| :|:| 1 : 1 1 1 1 1 : 
j Db 736 qecacldtvvrcsnkhlralpkgipknvtelyldgnqftlvpgqlstfkylqlvdlsnnk 795 

Qy 785 ISTLSHQSFSNMTQLLTLILSYNRLRCIPPRTFDGLKSLRLLSLHGNDISWPEGAFNDL 844 

11:111 Ihlhll lllllll hill I Ihlllllllllllll : II I I: 
Db 796 isslsnssftnmsqlttlilsynalqcipplafqglrslrllslhgndistlqegifadv 855 

Qy 845 SALSHLAIGANPLYCDCNMQWLSDWVKSEYKEPGIARCAGPGEMADKLLLTTPSKKFTCQ 904 

::IIIIIIIIIIMIII:::||| III: lllllllllll :| I ! I M 1 1 : 1 1 1 I 
Db 856 tslshlaiganplycdchlrwlsswvktgykepgiarcagp'qdraegklllttpakkfecq 915 

Qy 905 GPVDVNILAKCNPCLSNPCKNDGTCNSDPVDFYRCTCPYGFKGQDCDYPIHACISNPCKH 964 

II : : Ml: 1 1 1 : II : I lll::||:: III II 1 : 1 1 : i I : I :::| I ||:: 
Db 916 gpptlavqakcdlclsspcqnqgtchndplevyrcacpsgykgrdcevslnscssgpcen 975 

Qy 965 GGTCHLKEGEEDGFWCICADGFEGENCEVNVDDCEDNDCENNSTCVDGINNYTCLCPPEY 1024 

IMII :lll: I I I llll I II III I: I I Nil: Nil I :| 
Db 976 ggtchaqegedapftcscptgfegptcgvntddcvdhacanggvcvdgvgnytcqcplqy 1035 

Qy 1025 tgei£eek£dfcaqdlnpcqhdskciltpkgfrcdctpgyvgehcdidfddcqdnkcrng 1084 

' I: M: }\ h Mllli:::|: II I :|:| III |::| : llhhMII 
Db 1036 egkaceqlvdlcspdlnpcqheaqcvgtpdgprcecmpgyagdncsenqddcrdhrcqng 1095 

fe' 1085 AHCTDAVNGYTCICPEGYSGLFCEFSPPMVLPRTSPCDNFDCQNGAQCIVRINEPICQCL 1144 
■ I I I II I:hl Mill II I : I: III: :||||| |; : | |:|||l 

~b 1096 aqcmdevnsysclcaegysgqlcelpphlpapk-spcegtecqngancvdqgnrpvcqcl 1154 

Qy 1145 PGYQGERCERLVSVNFINRESYLQIPSAKVRPQTNITLQIATDEDSGILLYKGDRDHIAV 1204 

II: I :IHI:IIII:::::||| : |: llll|::| Ihlllll II UN 
Db 1155 pgfggpecekllsvnfvdrdtylqftdlqnwpranltlqvstaedngillyngdndhiav 1214 

Qy 1205 ELYRGRVRASYDTGSHPASAIYSVETINDGNFHIVELLALDQSLSLSVDGGNPKIITNLS 1264 

llhl M III IMIIII Mill I ll|:| II ::||:||h| : I 
Db 1215 elyqghvrvsydpgsypssaiysaetindgqfhtvelvafdqmvnlsldggspmtnidnfg 1274 

Qy 1265 KQSTLNFDSPLYVGGMPGKSNVASLRQAPGQNGTSFHGCIRNLYINSELQDFQKVPMQTG 1324 

I IN lllllllll I I: I III 1 1 1 1 1 1 1 1 II 1 : 1 1 1 1 1 I I: I 

Db 1275 khytlnseaplyvggmpvdvnsaafrlwqilngtgfhgcirnlyinnelqdftktqmkpg 1334 

Qy 1325 ILPGCEPCHKKVCAHGTCQPSSQAGFTCECQEGWMGPLCDQRTNDPCLGNKCVHGTCLPI 1384 

::|IMI I I II III:: I I : IN III : II MINI |:|: 
Db 1335 vvpgcepcrklyclhgicqpnatpgpmchceagwglhcdqpadgpchghkcvhgqcvpl 1394 

Qy 1385 NAFSYSCKCLEGHGGVLCDEEEDLFNPCQAIRCKHGRCRLSGLGQPYCECSSGYTGDSCD 1444 

:| 1111:1 :|: ! II:: I II: ::| II |: II :| I |::|: |: 
Db 1395 dalsyscqcqdgysgalcnqagalaepcrglqclhghcqasgtkgahcvcdpgfsgelce 1454 

Qy 1445 REISCRGERIRDYYQKQQGYAACQTTKKVSRLECRGGCAGGQCCGPLRSKRRKYSFECTD 1504

:| Ml: :l|::| hill III!: :| :|||l Mill ||||::|||:l
Db 1455 qesecrgdpvrdfhqvqrgyaicqttrplswvecrgscpgqgccqglrlkrrkftfecsd 1514



Qy 1505 GSSFVDEVERWRCGCTRC 1523 

1:11 hill III! I 
Db 1515 gtsfaeevekptkcgcalc 1533 



RESULT 14 
Y27144 

ID Y27144 standard; protein; 1534 AA. 
XX 

Y27144; 



15-SEP-1999 (first entry) 

Human slit-1 protein (Seq ID No: 11 of JP11164690). 

Vertebrate-derived protein; slit protein; diagnosis; cancer; nerve; 
muscle; endocrine system.



Homo sapiens. 

XX 

PN JP11164690-A.
XX

PD 22-JUN-1999.
XX
PF 05-DEC-1997; 97JP-0335435.
XX
PR 05-DEC-1997; 97JP-0335435.
XX
PA (ASAH ) ASAHI KASEI KOGYO KK.
XX

WPI; 1999-411830/35.
N-PSDB; X89161.

New vertebrate slit protein - useful for diagnosis and treatment of 
cancers in nerves, muscle and endocrine system 

Disclosure; Page 51-57; 102pp; Japanese. 

The invention relates to a vertebrate-derived protein containing an amino 
acid sequence shown in Y27137 and Y27139. The vertebrate-derived protein 
has at least 55 % homology to one of sequences shown in Y27141-Y27143, 
and has slit protein-like activity. The vertebrate slit proteins encoding 
nucleic acid sequences have at least 60% homology to nucleic acid 
sequences X89161-163. The vertebrate-derived proteins can be produced 
recombinantly by transforming host cells with expression vectors 
comprising the encoding nucleic acids. The proteins of the invention are
for diagnosing and treating cancer of the nerves, muscle and/or endocrine
system.



Sequence 1534 AA; 



Query Match 67.2%; Score 5589; DB 20; Length 1534;
Best Local Similarity 65.2%; Pred. No. 9.5e-295;
Matches 991; Conservative 228; Mismatches 284; Indels 16; Gaps


15 LVLAILNKVAPQACPAQCSCSGSTVDCHGLALRSVPRNIPRNTERLDLNGNNITRITKTD 74 

I: I :: ' llll 1 : 1 : 1 : 1 1 1 1 1 1 I ::: 1 : 1 1 1 1 1 1 1 1 1 : 1 1 1 1 1 1 1 1 1 I I 
21 llwaaawrlg&sacpalctctgttvdchgtglqaipknipmterlelngnnitrihknd 80 

75 FAGLRHLRVLQLMENKISTIERGAFQDLRELERLRLNRNHLQLFPELLFLGTAKLYRLDL 134 

llll: lllllllll: :||||| MINIMI I : III I llll 
81 faglkqlrvlqlmenqigavergafddmkelerlrlnrnqlhmlpellfqnnqalsrldl 140 



Qy 135 SENQIQAIPRKAFRGAVDIKNLQLDYNQISCIEDGAFRALRDLEVLTLNNNNITRLSVAS 194 

III lllllllllll! MM |||||||:||||||| lllllllllll! : M 
Db 141 senaiqaiprkafrgatdlknlrldknqiscieegafralrglevltlnnnnittipvss 200 



Qy 195 FNHMPKLRTFRLHSNNLYCDCHLAWLSDWLRKRPRVGLYTQCMGPSHLRGHNVAEVQKRE 254

1 1 1 1 M 1 1 1 M 1 1 1 1 M : 1 1 1 1 1 1 1 1 1 Hhll : hill lh III lllllll I 
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Db 201 fnhmpklrtfrlhsnhlfcdchlawlsqwlrqrptielftqcsgpaslrglnvaevqkse 260 

Qy 255 FVCSDEEEGHQSFMAPSCSVL--HCPAACTCSNNIVDCRGKGLTEIPTNLPETITEIRLE 312 

111 = 1 : M:: III III lllllllll II l||l|:|||lll 
Db 261 facsgqgeagr— vptctlssgscpamctcsnglvdcrgkgltaipanlpetiteirle 317 

Qy 313 QNTIKYIPPGAFSPYKKLRRIDLSNNQISELAPDAFQGLRSLNSLVLYGNKITELPKSLF 372 

I M llllllllhlllllllllllhlHNIIIIIIIIIIIIIIIIIIhll: :| 

Db 318 lngikgippgafspyrklrrldlsnnqiaeiapdafqglrslnslvlygnkitdlprgvf 377 

Qy 373 EGLFSUJLHiLNANKINCLRVDftFQDLHNLNLLSLYDNKLQTIAKGTFSPLRAIQTMHLA 432 

ll::|:IIIIIIMIIII: llllll 1 1 : 1 1 1 1 1 1 [ I : I : : 1 1 1 1 1 : ||||||:||| 
Db 378 gglytlqllllnankiDCirpdafqdlqnlsllslydnkiqslakgtftslralqtlhla 437 

Qy 433 QNPFICDCHLKWLADYLHTNPIETSGARCTSPRRLANKRIGQIKSKKFRCS G 484 

iiiirihiiiiii: Milium iiiiiiMiiiiiiiiiiiii i 

Db 438 qnpfKjdcnlkwladflrtnpietsgarcasprrlankrlgqikskkfrcsakeqyfipg 497 

^Qy 485 TEDYRSKLSGDCFADMCPEKCRCEGTTVDCSNQKLNRIPEHIPQYTAELRLNNNEFTVL 544 
1 Ml I :| :|: II Mill |:|[; || |||| ||| lllllllll ::| 

pb 498 tedy-qlnsecnsdvvcphkcrceanvvecsslkltkiperipqstaelrlnnneisll 555 

Qy 545 EATGIFKKLPQLRRINFSNNRITDIEEGAFEGASGVNEILLTSNRLENVQHKMFRGLESL 604 

IMh Mil Iiiirihiiiiii: IH: l|:|:||::: ||:||: 

Db 556 eatgrafkklthlkkinlsnnkvseledgafegaasvselhltanqlesirsgnifrgldgl 615 

Qy 605 KTLMLRSKRITCVGHDSFIGLSSVRLLSLYDNQITTVAPGAFDTLHSLSTLNLLANPFNC 664 
:I!III:IN:|: Nil I I'M 1 1 1 1 1 1 1 1 1 1 1 1 1 : 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

Db 616 rtlmlrDnriscihndsftglrnvrllslydnqlttvspgafdtlqslstlnllanpfnc 675 

Qy 665 NCYLAWLGEWLRKKRIVTGNPRCQKPYFLKEIPIQDVAIQDFTCDDGNDDHSCSPLSRCP 724 

II IMII llll::|llllllll | ||::||:|||| [I |::| :: | | :| 
Db 676 ncqlawlggwlrkrkivtgnprcqnpdflrqlplqdvafpdfrceegqeeggclprpqcp 735 

Qy 725 TECTCLDTWRCSMGLKVLPKGIPRDVTELYLDGNQFTLVPKELSNYRHLTLIDLSNNR 784 

II IMIIIIIIII I: MlllhHIIMIIIIIIIIM :|| :|:| |:|||||: 

Db 736 qecacldtvvrcsnkhlralpkgipknvtelyldgnqftlvpgqlstfkylqlvdlsnnk 795 

Qy 785 ISTLSNQSFSNMTQLLTLILSINRLRCIPPRTFDGLKSLRLLSLHGNDISWPEGAFNDL 844 
IIMII 11:11:11 IIMIII |:||!l I 1 1 : 1 1 1 1 1 1 1 1 1 1 1 1 1 : II I |: 
■ Db 796 isslsnssftnmsqlttlilsynalqcipplafqglrslrllslhgndistlqegifadv 855 

Qy 845 SALSHUIGANPLYCDCNMQWLSDWVKSEYREPGIARCAGPGEMADKLILTTPSRXFTCQ 904 

::lllilllllllllll:::HI III: llllllllllll : |||||||:||| || 
Db #856 tslshlaiganplycdchlrvlsswktgykepglarcagpqdmegklllttpakkfecq 915 

Qy 905 GPVDVNILAKC^PCLSNPCKNDGTCNSDPVDFYRCTCPYGFKGQDCDVPIHACISNPCKH 964 
II : : III: IIIHIM lll::ll:: III II |:||:||:| :::| | I:: 
Wo 916 gpptlavqakcdlclsspcqnqgtchndplevyrcacpsgykgrdcevslnscssgpcen 975 

Qy 965 GGTCHLKEGEEDGFWCICADGFEGENCEVNVDDCEDNDCENNSTCVDGINNYTCLCPPEY 1024 

UN! Mil: I I I llll I II III |: | | MM: Mil I :| 
Db 976 ggtchaqegedapftcscptgfegptcgvntddcvdhacanggvcvdgvgnytcqcplqy 1035 

Qy 1025 TGELCEEKLDFCAQDLNPCQHDSKCILTPKGFRCDCTPGYVGEHCDIDFDDCQDNKCKNG 1084 

: II: :l h lllllll:::|: II I :|:| III |::| : l||:|::|:|| 
Db 1036 egkaceqlvdlcspdlnpcqheaqcvgtpdgprcecmpgyagdncsenqddcrdhrcqng 1095 

Qy 1085 AHCTDAVNGYTCICPEGYSGLFCEFSPPMVLPRTSPCDNFDCQNGAQCIVRINEPICQCL 1144 

I I I II MM Mill II I : h MM Hill! |: : I Mill 
Db 1096 aqcmdevnsysclcaegysgqlceipphlpapk-spcegtecqngancvdqgnrpvcqcl 1154 

Qy 1145 PGYQGERCERLVSVNFINKESYLQIPSAKVRPQTNITLQIATDEDSGILLYRGDRDHIAV 1204 

M: I :IMI:|III:::::||| : |: |||||::| 1 1 : 1 1 1 1 1 I Mil 
Db 1155 pgfggpecekllsvnfvdrdtylqftdlqnwpranitlqvstaedngillyngdndhiav 1214 

Qy 1205 ELYRGRVRASYDTGSHPASAIYSVETINDGNFHIVELLALDQSLSLSVDGGNPKIITNLS 1264 

IIMI II III IIMHIMI HUM II I r I : I II ::||:|||:| : 
Db 1215 elyqghvrvsydpgsypssaiysaetindgqfhtvelvafdqmvnlsidggspmtmdnfg 1274 

Q'y 1265 KQSTLNFDSPLYVGGMPGKSNVASLRQAPGQNGTSFHGCIRNLYINSELQDFQRVPMQTG 1324 

I IM ::MIIIIII I I: I III 1 1 1 1 1 1 1 1 1 1 1 : 1 1 1 1 1 | : | 
Db 1275 khytlnseaplyvggmpvdvnsaafrlwqilngtgfhgcirnlyinnelqdftktqmkpg 1334 



Qy 1325 ILPGCEPCHKKVCAHGTCQPSSQAGFTCECQEGWMGPLCDQRTNDPCLGNKCVHGTCLPI 138' 

-llllll I'. I II III:: I I I: MM III : II |:||||| 1:1: 
Db 1335 wpgcepcrklyclhgicqpnatpgpmchceagwvglhcdqpadgpchghkcvhgqcvpl 139/ 

Qy 1385 NAFSYSCKCLEGHGGVLCDEEEDLFNPCQAIRCKHGKCRLSGLGQPYCECSSGYTGDSCD 144' 

:| lllhl :j: I II:: I II: ::| I |: I :| I |::|: |; 
Db 1395 dalsyscqcqdgysgalcnqagalaepcrglqclhghcqasgtkgahcvcdpgfsgelce 145- 

Qy 1445 REISCRGERIRDYYQKQQGYAACQTTKKVSRLECRGGCAGGQCCGPLRSKRRKYSFECTD 150- 

:| III: :ll::l Mill III: :| Hill llllll ||||::||1H 
Db 1455 qesecrgdpvrdfhqvqrgyaicqttrplswvecrgscpgqgccqglrlkrrkftfecsd 151< 

Qy 1505 GSSFVDEVERWRCGCTRC 1523 

MM Mill, llll I 
Db 1515 gtsfaeevekptkcgcalc 1533 



RESULT 15 
Y04139 

ID Y04139 standard; Protein; 1534 AA. 
XX 

AC Y04139; 
XX 

DT 15-JUN-1999 (first entry) 

XX 

DE Human slit 1 protein.

XX

KW Human; slit-like protein; slit 3; slit 1; prevention; treatment;

KW disease; spinal cord; thyroid gland; ovary; prostate; renal gland;

KW small intestine; heart; trachea; thymus; lymph node;

KW muscular system; colon.

XX

OS Homo sapiens.

XX

FH Key Location/Qualifiers

FT Peptide 1..26

FT /label= signal

FT Protein 27..1534

FT /label= slit 1

XX

PN JP11075846-A.
XX

PD 23-MAR-1999.
XX 

PF 02-SEP-1997; 97JP-0236994. 
XX 

PR 02-SEP-1997; 97JP-0236994. 
XX 

PA (ASAH ) ASAHI KASEI KOGYO KK.
XX

DR WPI; 1999-257695/22.
DR N-PSDB; X19947.
XX 

PT New slit-like polypeptide - useful for prevention and treatment of 
PT diseases in spinal cord, thyroid gland, ovary, prostate, renal 
PT gland, small intestine, heart, trachea, thymus, lymph node, muscular 
PT system and colon 

XX 

PS Disclosure; Page 42-46; 48pp; Japanese. 
XX

CC The present sequence represents a human slit-like protein designated 

CC slit 1. Slit-like proteins can be used for the prevention and the 

CC treatment of diseases in spinal cord, thyroid gland, ovary, prostate, 

CC renal gland, small intestine, heart, trachea, thymus, lymph node,

CC muscular system and colon. 
XX 

SQ Sequence 1534 AA; 



Query Match 67.2%; Score 5589; DB 20; Length 1534;
Best Local Similarity 65.2%; Pred. No. 9.5e-295;
Matches 991; Conservative 228; Mismatches 284; Indels 16; Gaps 5;
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oy 


[Alignment rows for this result are not legible in the scan. Surviving block coordinates:]

Qy start  Qy end   Db start  Db end
      15                 21
      75     134         81     140
     135     194        141     200
     195     254        201     260
     255     312        261     317
     313     372        318     377
     373     432        378     437
     433     484        438     497
     485     544        498     555
     545     604        556     615
     605     664        616     675
     665     724        676     735
     725     784        736     795
     785     844        796     855
     845     904        856     915
     905     964        916     975
     965    1024        976    1035
    1025               1036
    1085               1096
    1145               1155
    1205               1215
    1265               1275
    1325               1335
    1385               1395
    1445               1455
    1505               1515
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GenCore version 4.5 
Copyright (c) 1993 - 2000 Compugen Ltd.

OM protein - protein search, using sw model

Run on: January 22, 2001, 11:52:03 ; Search time 325.28 Seconds 

(without alignments) 
318.337 Million cell updates/sec 
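
The throughput figure above appears to be the number of dynamic-programming cells filled per second, that is, query length times database residues divided by search time. This reading is an inference from the numbers printed in this report (a 1525-residue query against 67900655 database residues in 325.28 seconds), not a statement taken from GenCore documentation. A minimal Python sketch of the arithmetic, with a hypothetical function name:

```python
def cell_updates_per_sec(query_len, db_residues, seconds):
    """Cells of the alignment matrix filled per second (assumed definition)."""
    return query_len * db_residues / seconds

# Values copied from this report's header.
rate = cell_updates_per_sec(1525, 67_900_655, 325.28)
print(f"{rate / 1e6:.3f} million cell updates/sec")
```

Under that assumption the result agrees with the reported 318.337 million to three decimal places.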



Title:          US-09-540-245A-2
Perfect score:  8316
Sequence:       1 MRGVGWQMLSLSLGLVLAIL SSFVDEVEKWKCGCTRCVS 1525

Scoring table:  BLOSUM62
                Gapop 10.0 , Gapext 0.5

Searched:       195891 seqs, 67900655 residues


Total number of hits satisfying chosen parameters: 

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
Maximum Match 100%
Listing first 45 summaries 



Database : PIR_66:*
pir1:*
pir2:*
pir3:*
pir4:*



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed,
and is derived by analysis of the total score distribution. 



Result           Query
   No.    Score  Match  Length  DB  ID       Description

     1     5728           1523   2  T13953   MEGF5 protein - ra
     2   5578.5           1531   2  T42218   slit-1 protein hom
     3     5530           1025   2  T42626   secreted leucine-r
     4     3486           1480   2  A36665   slit protein 1 pre
     5   3475.5           1469   2  B36665   slit protein 2 pre
     6     1318            333   2  T34555   hypothetical prote
     7   1115.5            530   2  A31640   epidermal growth f
     8     1006            601   2  T22025   hypothetical prote
     9      818           2703   1  A24420   notch protein - fr
    10      768           1220   2  A56136   jagged protein pre
    11      759           2524   2  A35844   Xotch protein - Af
    12      757           2352   2  T30201   Notch homolog prot
    13    747.5           2531   2  S18188   notch protein homo
    14      745           2531   2  A46019   Notch-1 protein -
    15      736           1203   2  A49175   Motch B protein -
    16    732.5           2437   2  S42612   transmembrane prot
    17      732           2471   2  A49128   cell-fate determin
    18    731.5           2555   2  A40043   notch protein homo
    19      731           1064   2  A40136   fibropellin Ia - s
    20      731           2531   2  T31070   notch homolog - se
    21      722           2318   2  S45306   notch 3 protein -
    22    719.5           2321   2  S78549   notch3 protein - h
    23    691.5           1964   2  T09059   notch4 - mouse
    24      660                  2  A48825   Notch homolog Mote
    25    634.5                  2  A35672   crumbs protein - f
    26    625.5                  2  A36666   serrate protein pr
    27    625.5                  2  S16148   gene serrate prote
    28      606                  2  S06434   homeotic protein 1
    29      592                  2  T13852   gene wheeler prote
    30      586    7.0    1385   2  T13887   tlr protein - frui
    31    568.5    6.8     728   2  I50719   C-Delta-1 - chicke
    32    549.5    6.6     570   2  A48836   fibropellin C prec
    33      543    6.5    1295   2  A32901   glpl protein precu
    34    535.5    6.4     473   2  A56175   adhesive plaque pr
    35      523    6.3    1372   2  T25933   hypothetical prote
    36    518.5    6.2     722   2  I48324   DELTA-like 1 - mou
    37    493.5    5.9     603   2  JC6128   insulin-like growt
    38      491    5.9     833   2  S19087   gene Delta protein
    39      490    5.9    1091   2  A58532   glial cell membran
    40      489    5.9     603   2  JC1282   insulin-like growt
    41    484.5    5.8     605   2  A41915   insulin-like growt
    42      480    5.8     832   2  A31246   neurogenic protein
    43    479.5    5.8     605   2  JC5239   insulin-like growt
    44    478.5    5.8     880   2  S00670   neurogenic repetit
    45      478    5.7    3034   2  T14119   seven-pass transme






RESULT 1 
T13953 

MEGF5 protein - rat 

N;Alternate names: slit protein homolog
C;Species: Rattus norvegicus (Norway rat)

C;Date: 20-Sep-1999 #sequence_revision 20-Sep-1999 #text_change 21-Jul-2000
C;Accession: T13953

R;Nakayama, M.; Nakajima, D.; Nagase, T.; Nomura, N.; Seki, N.; Ohara, O.
Genomics 51, 27-34, 1998

A;Title: Identification of high-molecular-weight proteins with multiple EGF-like moti
A;Reference number: Z14126; MUID:98360089
A;Accession: T13953

A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: mRNA
A;Residues: 1-1523 <NAK>

A;Cross-references: EMBL:AB011531; NID:g3449291; PIDN:BAA32461.1; PID:g3449292
C;Genetics:
A;Gene: MEGF5



Query Match 68.9%; Score 5728; DB 2; Length 1523; 

Best Local Similarity 66.9%; Pred. No. 0;

Matches 1015; Conservative 223; Mismatches 265; Indels 14; Gaps 

Qy 11 LSLGLVLA-ILNKVAPQACPAQCSCSGSTVDCHGLALRSVPRNIPRNTERLDLNGNNITR 69 

1:1 I II II: III :hll -llllll Ihlll llll Mil: Mill 
Db 16 LALALALASILSGPPAAACPTKCTCSAASVDCHGLGLRAVPRGIPRNAERLDLDRNNITR 75 

Qy 70 ITKTDFAGLRHLRVLQLMENKISTIERGAFQDLKELERLRLNRNHLQLFPELLFLGTAKL 129 

III II ll::llll I :h:| 1 1 1 1 1 1 1 1 1 1 : 1 1 1 1 1 1 1 : 1 II: ll!l! I II 
Db 76 ITKMDFTGLKULRVLHLEDNQVSVIERGAFQDLKQLERLRLNKNKLQVLPELLFQSTPKL 135 

Qy 130 YRLDLSENQIQAIPRKAFRGAVDIKNLQLDYNQISCIEDGAFRALRDLEVLTLNNNNITR 189 

nil in: Mini: nni i iiiiiiiiiiiiiiihiiiiiiiiii 

Db 136 TRLDLSENQIQGIPRKAFRGVTGVKNLQLDNNHISCIEDGAFRALRDLEILTLNNNNISR 195 

Qy 190 LSVASFNHMPSLRTFRLHSNNLYCDCHLAWLSDWLRRRPRVGLYTQCMGPSHLRGHNVAE 24S 

: I llllllhll llllhllllllllllllllhl :| : II I llll :||: 
Db 196 ILVTSFNHMPnRTLRLHSNHLYCDCHLAWLSDWLRQRRTIGQFTLCMAPVHLRGFSVAD 255 

Qy 250 VQKREFVCSDEEEGHQSFMAPSCSV-LHCPAACTCSNNIVDCRGKGLTEIPTNLPETIT 307 

II t : I : E I I I 11:1: I 1 1 : 1 1 : 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 llll I 
Db 256 VQKKEYVC — PGPHS- EAPACNANSLSCPSACSCSNNIVDCRGKGLTE IPANLPEG IV 310 

Qy 308 EIRLEQNTIKVIPPGAFSPYKRLRRIDLSNNQISELAPDAFQGLRSLNSLVLYGNKITEL 367 

1111111:11. II III llll:IH:| II 1 1 :: I i 1 1 1 1 1 1 : 1 1 lllllllllll: 
Db 311 EIRLEQNSIK3IPAGAFIQYKKLKRIDISKNQISDIAPDAFQGLKSLTSLVLYGNKITEI 370 

Qy 368 PKSLFEGLFSGQLLLLNANKINCLRVDAFQDLHNLNLLSLYDNKLQTIAKGTFSPLRAIQ 427 

II in IIIIIIIIMIIIIIII: llll llllllllllllllhM |:||::M 
Db 371 PKGLFDGLVSLQLLLLNANKINCLRVNTFQDLQNLNLLSLYDNKLQTISKGLFAPLQSIQ 430

Qy 428 TMHLAQNPFICDCHLKWLADYLHTNPIETSGARCTSPRRLANKRIGQIKSKKFRCSGTED 487

MIMIMMIIMIMMIM IIMMIMMIMIMIMI llllllllll|:|| 
Db 431 TLHLAQNPFVCDCHLKWLADYLQDNPIETSGARCSSPRRLANKRISQIKSRKPRCSGSED 490

Qy 488 YRSKLSGDCFADLACPEKCRCEGTTVDCSNQKLNKIPEHIPQYTAELRLNNNEFTVLEAT 547

lh: I :M II llllllllll llllllll::|| |:|:|| MMMM Mill 
Db 491 YRNRFSSECFMDLVCPERCRCEGTIVDCSNQRLSRIPSHLPEYTTDLRLNDNDIAVLEAT 550

Qy 548 GIFKKLPQLRKINFSNNKITDIEEGAFEGASGVNEILLTSNRLENVQHRMFRGLESLKTL 607

. Illllll lllll lh :: 1 1 1 1 : 1 1 : 1 ] |::|| Ml : :||:|| I! 
Db 551 GIFKKLPNLRKINLSNNRIKEVREGAFDGAAGVQELMLTGNQLETMHGRMFRGLSGLKTL 610

Qy 608 MLRSNRITCVGNDSPIGLSSVRLLSLYDNQITTVAPGAFDTLHSLSTLNLLANPFNCNCY 667

HIM MM 1 1 :L 1 1 1 1 1 1 1 1 1 1 1 1 1 : 1 1 1 :: 1 1 1 1 I! M II : 1 1 1 : II I M 1 1 : 
Db 611 MLRSNLISCVNNDTpLSSVRLLSLYDNRITTISPGAFTTLVSLSTINLLSNPFNCNCH 670

Qy 668 LAWLGEWLRKKRIVENPRCQKPYFLKEIPIQDVAIQDFTCDDGNDDSSCSPLSRCPTEC 727

Mill MIMMIiMMIMMIIMMMIIMIMM :||::||| III :| 
Db 671 MAWLGRWLRKRRIVppRCQKPFFLKEIPIQDVAlQDFTC-EGNEENSCQLSPRCPEQC 729

Qy 728 TCLDTWRCSNKGLmPKGIPRDVTELYLDGNQFTLVPKELSNYKHLTLIDLSNNRIST 787

1 1 - 1 1 1 1 1 1 1 : 1 1 V 1 1 1 1 : 1 : 1 1 1 1 1 1 1 : 1 1 | MUM :: MIMIMI II 
Db 730 TCVETWRCSNRGLB^LPKGMPKDVTELYLEGNHLTAVPKELSTFRQLTLIDLSNNSISM 789

Qy 788 LSNQSFSNMTQLW|lLSYNRLRCIPPRTFDGLKSLRLLSLHGNDISVVPEGAFNDLSAL 847

Ml MUM I ijlllllllllll 1 : 1 1 : 1 1 1 : 1 : 1 1 1 1 1 M MIMMIMM 
Db 790 LTNHTFSNMSHLSTalLSYNRLRCIPVHAFNGLRSLRVLTLHGNDISSVPEGSFNDLTSL 8

Qy 848 SHUIGANPLYCDCpWLSDWVRSEYKEPGIARCAGPGEMADRLLLTTPSKKFTCQGPV 907

HUM MMMl{::MI:MM IIMMIIM I IIMIIIIM: :| Mill 
Db 850 SHLALGINPLHCDCSLRWLSEWIRAGYKEPGIARCSSPESMADRLLLTTPTHRFQCKGPV 909



Qy 908 DVNILAKCNPCLSNBCKNDGICNSDPVDFYRCTCPYGFKGQDCDVPIHACISNPCKHGGT 967

MIMIMI IIMfllMIIM III: I M 1 1 1 1 MMII MM M MMMM 
Db 910 DINIVAKCNACLSSfKNNGTCSODPVEQYRCTCPYSYKGRDCTVPINTCVQNPCQHGGT 969

Qy 968 CHLKEGEEDGFWCldADGFEGENCEVNVDDCEDNDCENNSTCVDGINNYTCLCPPEYTGE 1027

Ml I III I ( MIM IMI I II I J 1 1 1 f I r : M I M 1 1 1 1 Mill Mil 
Db 970 CHLSESHRDGFSCSCPLGFEGQRCEINPDDCEDNDCENSATCVDGINNYACVCPPNYTGE 1029 

Qy 1028 LCEEKLDFCAQDLNfcQHDSRCILTPKGFKCDCTPGYVGEHCDIDFDDCQDNKCKNGABC 1087 

IMI :MI ::ljllM:MI 1 1 1 : 1 : 1 III M J: I III :IM:|| I 
Db 1030 LCDEVIDYCVPEMN^CQHEAKCISLDKGFRCECVPGYSGKLCETDNDDCVAHKCRHGAQC 1089 

Qy 1088 TDAVNGYTCICPEGJSGLFCEFSPPMVLPRTSPCDNFDCQNGAQCIVRINEPICQCLPGY 1147 

M 1 1 1 1 1 ! 1 1 1 : 1 L- Mill :MIII : : 1 1 1 1 1 1 ! 1 1 II Ml IM 

Db 1090 VDAVNGYTCICPQGBSGLFCEHPPPMVLLQTSPCDQYECONGAQCIWQQEPTCRCPPGF 1149 

Qy 1148 QGEKCEKLVSVNFlLsYLQIPSAKVRPQTNITLQIATDEDSGILLYKGDKDHIAVELY 1207 
§L I :|IM::MM ,MIM:: Illllll I M I M M M M 1 1 1 1 II 1 1 I :| :M I 

Db 1150 AGPRCEKLITVNFVGRDSYVELASAKVRPQANISLQVATDKDNGILLYKGDNDPLALELY 1209

Qy 1208 RGRVRASYDTGSHpjsAIYSVETINDGNFHIVELLALDQSLSLSVDGGNPRIITNlSKQS 1267 

:l M IM I I : : 1 1 1 1 1 : 1 1 1 II MM MMMI Mill: III 
Db 1210 QGHVRLVYDSLSSPPTTVYSVETVNDGQFHSVELVMLNQTLNLWDKGAPRSLGKLQRQP 1269 

Qy 1268 TLNFDSPLYVGGMPGRSNVASLRQAPGQNGTSFHGCIRNLYINSELQDFQRVPMQT-GIL 1326 

: : 1 1 1 F : 1 1 : 1 : :::lll : lllll : IMIIIIM :| M M 
Db 1270 AVGINSPLYLGGIPTSIGLSALRQGADRPLGGFHGCIHEVRINNELQDFKALPPQSLGVS 1329 

Qy 1327 PGCEPCHRKVCAHGTCQPSSQAGFTCECQEGWMGPLCDQRTNDPCLGMCVHGTCLPINA 1386 

MM I M II M : Ml II MUM MUM I IMM 
Db 1330 PGCKSC-TVCRHGLCRSVERDSWCECHPGWTGPLCDQEAQDPCLGHSCSHGTCV-ATG 1386 

Qy 1387 FSYSCKCLEGHGGVLCDEEEDLFNPCQAIKCKHGKCRLSGLGQPYCECSSGYTGDSCDRE 1446 

II III IM I MM: | M I II IMI : Mill I |::|: MM 
Db 1387 NSYVCKCAEGYEGPLCDQRNDSANACSAFRCHHGQCHISDRGEPYCLCQPGFSGNHCEQE 1446 

Qy 1447 ISCRGERIRDYYQKQQGYAACQTTKKVSRLECRGGCAGGQCCGPLRSRRRRYSFECTDGS 1506 

I II :M ::M IMI I II : lllll I Ml MIMIMI MMMI 
Db 1447 NPCLGEIVREAIRRQKDYASCATASRVPIMVCRGGC-GSQCCQPIRSRRRRYVFQCTDGS 1505 

Qy 1507 SFVDEVEKWKCGCTRC 1523 



II 1 : 1 1 1 : "III I 
Db 1506 SFVEEVERHLECGCREC 1522



RESULT 2
T42218

slit-1 protein homolog - rat

N;Alternate names: MEGF4 protein

C;Species: Rattus norvegicus (Norway rat)

C;Date: 03-Dec-1999 #sequence_revision 03-Dec-1999 #text_change 21-M-2000
C;Accession: T42218

R;Nakayama, M.; Nakajima, D.; Nagase, T.; Nomura, N.; Seki, N.; Ohara, O.
Genomics 51, 27-34, 1998

A;Title: Identification of high-molecular-weight proteins with multiple EGF-like moti
A;Reference number: Z14126; MUID:98360089
A;Accession: T42218

A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: mRNA

A;Residues: 1-1531 <NAK>

A;Cross-references: EMBL:AB011530; NID:g3449289; PIDN:BAA32460.1; PID:g3449290
A;Experimental source: strain Sprague-Dawley; brain
C;Genetics:
A;Gene: MEGF4



Query Match 67.1%; Score 5578.5; DB 2; Length 1531; 

Best Local Similarity 65.0%; Pred. No. 0;

Matches 988; Conservative 230; Mismatches 282; Indels 19; Gaps 

Qy 15 LVLAILNKVAPQACPAQCSCSGSTVDCHGLALRSVPRNIPRNTERLDLNGNNITRITRTD 74 

Ml :: : IMI MMMIIIMI I 1 : 1 1 1 1 1 1 1 1 1 : 1 1 1 1 1 1 1 1 1 I I 
Db 21 LLWAAAWRLGATACPALCTCTGTTVDCBGTGLQAIPRNIPRNTERLELNGNNITRIHRND 80 

Qy 75 FAGLRHLRVLQLMENRISTIERGAFQDLRELERLRLNRNHLQLFPELLFLGTARLYRLDL 134 

MIM 1 1 1 1 1 1 M I - ( MMII MIMIIIMIM IM lllll I MM 
Db 81 FAGLRQLRVLQLMENQIGAVERGAFDDMKELERLRLNRNQLQVLPELLFQNNQALSRLDL 140 

Qy 135 SENQIQAIPRKAFRGAVDIRNLQLDYNQISCIEDGAFRALRDLEVLILNNNNITRLSVAS 194 

Ml MMIIMilll MIIIMI MIIMMMIIMI IIIIMIIIMI : Ml 

Db 141 SENSLQAVPRKAFRGATDLKNLQLDRNQISCIEEGAFRALRGLEVLTLNNNNITTIPVSS 200 

Qy 195 FNHMPRLRTFRLHSNNLYCDCHLAWLSDWLRRRPRVGLYTQCMGPSHLRGHNVAEVQKRE 254 

MlllllfllllllMMIIIMIIII IIMII : 1 1 : 1 1 1 IM III Illllll I 
Db 201 FNHMPRLRTFRLHSNHLFCDCHLAWLSQWLRQRPTIGLFTQCSGPASLRGLNVAEVQKSE 260 

Qy 255 FVCSDEEEGHQSFMAPSCSVL--HCPAACTCSNNIVDCRGRGLTEIPTNLPETITEIRLE 312 

I M : I I MM: Ml Mill I ! 1 1 1 1 ! 1 1 1 M IIIIMIIIMI 
Db 261 FSCSGQGEAAQ— VPACTLSSGSCPAMCSCSNGIVDCRGRGLTAIPANLPETMTEIRLE 317 

Qy 313 QNTIRVIPPGAFSPYRRLRRIDLSNNQISELAPDAFQGLRSLNSLVLYGNKITELPKSLF 372

I II 1 1 1 II I M I : M M 1 1 1 1 1 1 1 i : i : 1 1 1 1 1 1 1 1 11 1 1 1 M 1 1 M M 1 : 1 1 : :| 
Db 318 LNGIKSIPPGAFSPYRRLRRIDLSNNQIAEIAPDAFQGLRSLNSLVLYGNKITDLPRGVF 377 

Qy 373 EGLFSLQLLLIiNAKKINCLRVDAFQDLHNLNLLSLYDNRLQTIARGTFSPLRAIQTMHLA 432 

IMMIIMIIIIIMMI MMII IMIIMMIMMMIMM IIMIMIII 
Db 378 GGLYTLQLLLLNANKINCIRPDAFQDLQNLSLLSLYDNRIQSLAKGTFTSLRAIQTLHLA 437 

Qy 433 QNPFICDCHLKWLADYLHTNPIETSGARCTSPRRIANRRIGQIRSRKFRCS G 484 

1 1 1 1 1 1 M : 1 1 1 1 1 1 = I IMMMIIM IIIMMIMIMMMIMI I 
Db 438 QNPFICDCNLRWLADFLRTNPIETTGARCASPRRLANKRIGQIKSRKFRCSAREQYFIPG 497 

Qy 485 TEDYRSRLSGDCFADLACPERCRCEGTTVDCSNQKLNKIPEHIPQYTAELRLNNNEFTVL 544 

MM M :l MMII lllll : IMI 1 1 : ! II I III I HUH ::| 
Db 498 TEDYH-LNSECTSDVACPHKCRCEASWECSGLRLSKIPERIPQSTTELRLNNNEISIL 555 

Qy 545 EATGIFRKLPQLRRINFSNNKITDIEEGAFEGASGVNEILLTSNRLENVQHKMFKGLESL 604 

1 1 : 1 : 1 1 1 1 IMM MM:::!!:: ||||: |:|: ||:|:||:|: ||:||: 
Db 556 EATGLFRKLSHLKKINLSNNRVSEIEDGTFEGATSVSELHLTANQLESVRSGMFRGLDGL 615 

Qy 605 KTLMLRSNRITCVGNDSFIGLSSVRLLSLYDNQITTVAPGAFDTLHSLSTLNLLANPFNC 664 

MMIMIIMM MM II MIIMIMI 1 1 1 : : 1 1 M 1 1 1 MIMIIIMIM! 
Db 616 RTLMLRNNRISCIHNDSFTGLRNVRLLSLYDNHITTISPGAFDTLQALSTLNLLANPFNC 675

[Remaining alignment rows of RESULT 2: residue and match lines not legible in source; row start positions only]
Qy  665   Db  676
Qy  725   Db  736
Qy  785   Db  796
Qy  845   Db  856
Qy  905   Db  916
Qy  965   Db  976
Qy 1025   Db 1036
Qy 1085   Db 1096
Qy 1145   Db 1152
Qy 1205   Db 1212
Qy 1265   Db 1272
Qy 1325   Db 1332
Qy 1385   Db 1392
Qy 1445   Db 1452
Qy 1505   Db 1512



RESULT 3

T42626

secreted leucine-rich repeat-containing protein SLIT2 - mouse (fragment)
N;Alternate names: neurogenic extracellular slit protein
C;Species: Mus musculus (house mouse)

C;Date: 11-Jan-2000 #sequence_revision 11-Jan-2000 #text_change 11-Jan-2000
C;Accession: T42626

R;Holmes, G.P.; Negus, K.; Burridge, L.; Raman, S.; Algar, E.; Yamada, T.; Little, M.
Mech. Dev. 79, 57-72, 1998

A;Title: Distinct but overlapping expression patterns of two vertebrate slit homologs in
A;Reference number: Z22177; MUID:99279238
A;Accession: T42626

A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: mRNA
A;Residues: 1-1025 <HOL>

A;Cross-references: EMBL:AF074960; NID:g4151258; PID:g4151259; PIDN:AAD04345.1
C;Genetics:
A;Gene: Slit2



Query Match 66.5%; Score 5530; DB 2; Length 1025;

Best Local Similarity 95.9%; Pred. No.

Matches 983; Conservative 25; Mismatches 17; Indels 0; Gaps 0


Qy 501 ACPERCRCEGTTVDCSNQKLNKIPEHIPQYTAELRLNNNEFTVLEATGIFRRLPQLRKIN 560

llll III llllll III IIMII Ihlll III II III III I II llll lllll llllll II
Db 1 ACPBKCRCEGTTVDCSNQRLNRIPDHIPQYTAELRLNMEFTVLEATGIFKKLPQLRXIN 60

Qy 561 FSNNKITDIEEGAFEGASGYNEILLTSNRLENVQHKMFKGLESLKTLMLRSNRITCVGND 620

1 1 1 1 1 ! 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 h 1 1 1 1 1
Db 61 FSNNKITDIEEGAFEGASGVNEILLTSNRLENVQHKMFKGLESLKTLMLRSNRISCVGND 120

Qy 621 SFIGLSSVRLLSLYDNQITTVAPGAFDTLHSLSILELANPFNCNCYLAWLGEWLRKKRI 680

lllll lllllllllllllllllllll IIIIIMIIIIIIIIIIHIIIIIIH'III
Db 121 SFIGLGSVRLLSLYDNQITTVAPGAFDXLHSLSTLNLLANPFNCNCHLAWLGEWLRRKRI 180

Qy 681 VTGNPRCQKPYFLKEIPIQDVAIQDFTCDDGNDDNSCSPLSRCPTECTCLDTWRCSNKG 740

llll llllllilMIMIIIMM IIIIIIMillllllllhhUIMI Hill
Db 181 VTGNPRCQRPYFLREIPIQDVAIQDFTCDDGNDDNSCSPLSRCPSECTCLDTXVRCSNKG 240

Qy 741 LKVLPKGIPRDVTELYLDGNQFILVPRELSNYKHLTLIDLSNNRISTLSNQSFSNMTQLL 800

II II III 11:111 III III III II llll III III II III III III llll II III lllll
Db 241 LKVLPKGIPKDVTELYLDGNQFTLVPRELSNYKHLTLIDLSNNRISTLSNQXFSNMTQLL 300

Qy 801 ILILSYNRLRCIPPRTFDGLKSLRLLSLHGNDISWPEGAFNDLSALSHLAIGANPLYCD 860

1 1 1 1 1 g 1 [ 1 1 1 i ) 1 1 1 1 1 j 1 1 1 1 1 1 1 1 ; 1 1 1 1 n M ! 1 1 1 1 ! 1 1 i 1 1 i 1 1 1 1 1 1 1 1 M 1 1
Db 301 TLILSYNRLRCIPPRTFDGLKSLRLLSLHGNDISWPEGAFNDLSALSHLAIGANPLYCD 360

Qy 861 CNMQWLSDWVKSEYKEPGIARCAGPGEMADKLLLTTPSKKFTCQGPVDVNILAKCNPCLS 920

III llllll IIMIMIMIMI IMIIIMIMM IMM Ml MM 1 llllllll
Db 361 CNMQWLSDPKSEYREPGIARCAGPGEMADRLLLTTPSRRFTCQGPMDITIQARCNPCLS 420

Qy 921 NPCRNDGTCNSDPVDFYRCTCPYGFRGQDCDVPIHACISNPCKHGGTCHLREGEEDGFWC 980

IIIIIIMIhllllllllllllllllllllllllllllllllMIIIIIIIII llll
Db 421 NPCKNDGTCNNDPVDFYRCTCPYGFKGQDCDVPIHACISNPCKHGGTCHLKEGENAGFWC 480

Qy 981 ICADGFEGENCEVNVDDCEDNDCENNSTCVDGINNYTCLCPPEYTGELCEERLDFCAQDL 1040

MM1MII IIIIMMMIIII IIIMIMIIIIIMMIMIMIMIIII llllll
Db 481 ICADGFEGENCEVNIDDCEDNDCENNSTCVDGINNYTCLCPPEYTGELCEERLDFCAQDL 540

Qy 1041 NPCQHDSRCILTPKGFKCDCIPGYVGEHCDIDFDDCQDNRCRNGAHCTDAVNGYTCICPE 1100

lllll llllllll III III III II MINI! I'IM'IM III II MM II llll hill
Db 541 NPCQHDSKCILTPKGFKCDCTPGYIGEHCDIDFDDCQDNKCKNGAHCTDAVNGYTCVCPE 600

Qy 1101 GYSGLFCEFSPPMVLPRTSPCDNFDCQNGAQCIVRINEPICQCLPGYQGERCEKLVSVNF 1160

IlllllllllllllllllllllllllllllllhllHIIIIIIII! Illllllllll!
Db 601 GYSGLFCEFSPPMVLPRTSPCDNFDCQNGAQCIIRINEPICQCLPGYLGERCEKLVSVNF 660

Qy 1161 INKESYLQIPSAKVRPQTNITLQIATDEDSGILLYRGDRDHIAVELYRGRVRASYDTGSH 1220

MIIIIIIIHIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII
Db 661 VNKESYLQIPSAKVRPQTNITLQIATDEDSGILLYRGDRDHIAVELYRGRVRASYDTGSH 720

Qy 1221 PASAIYSVETINDGNFHIVELLALDQSLSLSVDGGNPKIITNLSKQSTLNFDSPLYVGGM 1280

lllllllllljlllllllllll II IIIMIIIIMIMIMIMMMIMMMM!
Db 721 PASAIYSVETINDGNFHIVELLTLDSSLSLSVDGGSPRVITNLSRQSTLNFDSPLYVGGM 780

Qy 1281 PGRSNVASLRQAPGQNGTSFHGCIRNLYINSELQDFQKVPMQTGILPGCEPCHRRVCAHG 1340

llhlllllllllllllllllllllllllllllllhhlllllllllllllllllllll
Db 781 PGKNNVASLRQAPGQNGTSFHGCIRNLYINSELQDFRKMPMQTGILPGCEPCHKKVCAHG 840

Qy 1341 TCQPSSQAGFTCECQEGWMGPLCDQRTNDPCLGNKCVHGTCLPINAFSYSCKCLEGHGGV 1400

IMMIMMIMMIMIMMMIMiMMMMIMMIMIMIMMMMi
Db 841 MCQPSSQSGFTCECEEGWMGPLCDQRTNDPCLGNKCVHGTCLPINAFSYSCKCLEGHGGV 900

Qy 1401 LCDEEEDLFNPCQAIRCKHGRCRLSGLGQPYCECSSGYTGDSCDREISCRGERIRDYYQR 1460

Db 901 LCDEEEDIFNPCQMIRCRHGRCRLSGVGQPYCECNSGFTGDSCDREISCRGERIRDYYQK 960

Qy 1461 QQGYMCQTTKRVSRLECRGGCAGGQCCGPLRSRMYSFECTDGSSFTOEVERVVRCGC 1520 

IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIM 
Db 961 QQGYAACQTTKRVSRLECRGGCAGGQCCGPLRSKRRKYSFECTOGSSFVDEVEKWKCGC 1020 

Qy 1521 TRCVS 1525 
II I 

Db 1021 ARCAS 1025 
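
The percentages in each result header appear to follow from the counts printed beside them. The sketch below is an inference from the numbers in this listing, not a statement of how the search program computes them: it assumes "Best Local Similarity" is Matches divided by the sum of Matches, Conservative, Mismatches and Indels, and that "Query Match" is the hit score divided by the query self-score (8316, the score of the 100.0% entry in the summary table). The function names are hypothetical.

```python
def best_local_similarity(matches, conservative, mismatches, indels):
    """Percent identity over the aligned columns (assumed definition)."""
    aligned_columns = matches + conservative + mismatches + indels
    return 100.0 * matches / aligned_columns


def query_match(score, self_score):
    """Hit score as a percentage of the query's self-score (assumed definition)."""
    return 100.0 * score / self_score


# Counts copied from the RESULT 1 and RESULT 3 headers of this listing.
result1_similarity = round(best_local_similarity(1015, 223, 265, 14), 1)
result3_similarity = round(best_local_similarity(983, 25, 17, 0), 1)
result1_query_match = round(query_match(5728, 8316), 1)
result3_query_match = round(query_match(5530, 8316), 1)
```

Under these assumptions the rounded values agree with the printed headers for RESULT 1 and RESULT 3, which is a useful cross-check when a header field is garbled.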



RESULT 4 
A36665 

slit protein 1 precursor - fruit fly (Drosophila melanogaster)
C;Species: Drosophila melanogaster

C;Date: 30-Apr-1991 #sequence_revision 30-Apr-1991 #text_change 21-Jul-2000

C;Accession: A36665; S13523
R;Rothberg, J.M.; Jacobs, J.R.; Goodman, C.S.; Artavanis-Tsakonas, S.

Genes Dev. 4, 2169-2187, 1990

A;Title: slit: an extracellular protein necessary for development of midline glia and cc
A;Reference number: A36665; MUID:91099665
A;Accession: A36665
A;Status: preliminary

A;Molecule type: mRNA

A;Residues: 1-1480 <ROT>

A;Cross-references: GB:X53959; NID:g8614; PIDN:CAA37910.1; PID:g8615
C;Genetics:
A;Gene: FlyBase:sli

A;Cross-references: FlyBase:FBgn0003425
C;Superfamily: unassigned EGF-related proteins; EGF homology; leucine-rich alpha-2-glyc
C;Keywords: alternative splicing
F;66-91/Domain: proteoglycan amino-terminal homology <PAH1>
F;101-124/Domain: leucine-rich alpha-2-glycoprotein repeat homology <LRR1>
F;125-148/Domain: leucine-rich alpha-2-glycoprotein repeat homology <LRR2>

F;149-172/Domain: leucine-rich alpha-2-glycoprotein repeat homology <LRR3>
F;173-196/Domain: leucine-rich alpha-2-glycoprotein repeat homology <LRR4>
F;197-220/Domain: leucine-rich alpha-2-glycoprotein repeat homology <LRR5>
F;228-272/Domain: proteoglycan carboxyl-terminal homology <PCS1>
F;288-313/Domain: proteoglycan amino-terminal homology <PAH2>
F;323-346/Domain: leucine-rich alpha-2-glycoprotein repeat homology <LRR6>
F;347-370/Domain: leucine-rich alpha-2-glycoprotein repeat homology <LRR7>
F;371-394/Domain: leucine-rich alpha-2-glycoprotein repeat homology <LRR8>
F;395-418/Domain: leucine-rich alpha-2-glycoprotein repeat homology <LRR9>
F;419-442/Domain: leucine-rich alpha-2-glycoprotein repeat homology <LR10>
F;450-494/Domain: proteoglycan carboxyl-terminal homology <PCS2>
F;512-537/Domain: proteoglycan amino-terminal homology <PAH3>
F;547-571/Domain: leucine-rich alpha-2-glycoprotein repeat homology <LR11>
F;572-595/Domain: leucine-rich alpha-2-glycoprotein repeat homology <LR12>
F;596-619/Domain: leucine-rich alpha-2-glycoprotein repeat homology <LR13>
F;620-643/Domain: leucine-rich alpha-2-glycoprotein repeat homology <LR14>
F;651-695/Domain: proteoglycan carboxyl-terminal homology <PCS3>
F;708-733/Domain: proteoglycan amino-terminal homology <PAH4>
F;743-766/Domain: leucine-rich alpha-2-glycoprotein repeat homology <LR15>
F;767-790/Domain: leucine-rich alpha-2-glycoprotein repeat homology <LR16>
F;791-814/Domain: leucine-rich alpha-2-glycoprotein repeat homology <LR17>
F;815-838/Domain: leucine-rich alpha-2-glycoprotein repeat homology <LR18>
F;846-890/Domain: proteoglycan carboxyl-terminal homology <PCS4>
F;1028-1061/Domain: EGF homology <EGF>
F;1068-1099/Domain: EGF homology <EGF2>
F;1115-1148/Domain: EGF homology <EGF1>



Query Match 41.9%; Score 3486; DB 2; Length 1480; 

Best Local Similarity 43.8%; Pred. No. 2.9e-196; 

Matches 660; Conservative 275; Mismatches 457; Indels 116; Gaps 22; 

Qy 28 CPAQCSCSGSTVDCHGLALRSVPRNIPRKTERLDLNGKNITRITKTDFAGLRHLRVLQLM 87

II Ml:l \M I llll I : 111:1 llhl I :||| I 1 1 : 1 1 1 
Db 73 CPRVCSCTGLNVDCSHRGLTSVPRKISADVERLELQGNKLTVIYETDFQRLTRLRMLQLT 132 



Qy 88 ENKISTIERGAFQDLRELERLRLNRNHLQLFPELLFLGTARLYRLDLSENQIQAIPRRAF 147

llll :llll II llhl II : I: I 
Db 133 DNQIHTIERNSFQDLVSLE------------------------RLDISNNVITTVGRRVF 168

Qy 148 RGAVDIKNLQLDYNQISCIEDGAFRALRDLEVLTLNNNNITRLSVASFNHMPKLRTFRLH 207 

:H :::M'I l||:|::: ||: I : 1 1 : 1 1 1 1 1 1 1 : 1 I I : :|| II 
Db 169 RGAQSLRSLQLDNNQITCLDEHAFRGLVELEILTLNNNNLTSLPHNIFGGLGRLRALRLS 228 

Qy 208 SNNLYCDCHLAWLSDWLRRRPRVGLYTQCMGPSHLRGHNVAEVQRREFVCSDEEEGHQSF 267 

I MIIIMM :ll I: l|:.l II hi III:: :|| II I 

Db 229 DNPFACDCHLSWLSRFLRSATRLAPYTRCQSPSQLKGQNVADLHDQEFRCSGLTE 283 

Qy 268 MAP-SCSVLH-CPAACTCSNNIVDCRGRGLTEIPTNLPETnEIRLEQNTIRVIPPGAFS 325 

II I :.N I !:: Mill I II :| II: h:lllll I :ll :ll 

Db 284 HAPMECGAENSCPHPCRCADGIVDCRERSLTSVPVTLPDDTTDVRLEQNFITELPPRSFS 343 

Qy 326 PYKRLRRIDL'SNNQISELAPDAFQGLRSLNSLVLYGNKITELPRSLFEGLFSLQLLLLNA 385 

:::||||IHII II :| II lh I :|lllllll :|l :hll I h 1 1 1 1 1 1 
Db 344 SFRRLRRIDLSNNNISRIAHDALSGLKQLTTLVLYGNKIRDLPSGVFRGLGSLRLLLLNA 403 

Qy 386 NKINCLRVDAFQDLHNLNLLSLYDNRLQTIARGTFSPLRAIQTMHLAQNPFICDCHLKWL 445 

hhhl ll'hllhhlllilll :|::| III :::::|:|||:||||||h|:|| 
Db 404 NEISCIRRDAFRDLHSLSLLSLYDNNIQSLANGTFDAMRSMKTVHLAKNPFICDCNLRWL 463 

Qy 446 ADYLHTNPIElisGARCTSPRRLANRRIGQIRSRRFRCSGTEDYRSKLSGDCFADLACPER 505 

MM MIMHIII Ihh :|| :: :||:|| I I lllhl I II 
Db 464 ADYLHRNPIETSGARCESPKRMHRRRIESLREERFKCSWGE-LRMKLSGECRMDSDCPAM 522 

Qy 506 CRCEGTTVDCSNQRLNRIPEHIPQYTAELRLNNNEFTVLEATGIFKKLPQLRKINFSNNK 565 

I Ml 1 1 1! I : ::| :|| II :| II Ihll : : hi :|| I M M 

Db 523 CHCEGTTVDCTGRRLKEIPRDIPLHTTELLLNDNELGRISSDGLFGRLPHLVRLELRRNQ 582 

Qy 566 ITDIEEGAFEGASGVNEILLTSNRLEWQHRMFRGLESLRTLMLRSNRITCVGNDSFIGL 625 

:| II MINI : h I I::: : MM II III 
Db 583 LTGIEPNAFEGASHIQELQLGENRIREISNRMFLGLHQLRT 623 

Qy 626 SSVRLLSLYDNQITTVAPGAFDTLHSLSTLNLLANPFNCNCYLAWLGEWLRRRRIVTGNP 685 

1 : 1 1 1 ! 1 1 : | ||:|: hlh:IH MIMIMMM I Ml : I 
Db 624 LNLYDNQISCVMPGSFEHLNSLTSLNLASNPFKCNCHLAWFAECVRRRSLNGGAA 678 

Qy 686 RCQRPYFLKEIPIQDVAIQDFTCDDGNDDNSCSPLSRCPTECTCLDTWRCSNKGLRVLP 745 

II I :::: hh Mil: II III Ml II II M 

Db 679 RCGAPSKVRDVQIRDLPHSEFRCSSENSE-GCLGDGYCPPSCTCIGTWACSRNQLKEIP 737 

Qy 746 RGIPRDVTELYLDGNQFTLVPRE-LSNYKHLTLIDLSHNRISTLSNQSFSNMTQLLTLIL 804 

MM : MMM M : I : : : II MIIIMM III MMMM |||: 
Db 738 RGIPAETSELYLESNEIEQIHYERIRHLRSLTRLDLSNNQITILSNYTFANLTKLSTLII 797 

Qy 805 SYNRLRCIPPRTFDGLRSLRLLSLHGNDISWPEGAFNDLSALSHLAIGANPLYCDCNMQ 864 

llhhl: . II MMMIIM IMMMM II :|:|:|:|:||||||| :: 
Db 798 SYNRLQCLQRHALSGLNNLRWSLHGNRISMLPEGSFEDLKSLTHIALGSNPLYCDCGLK 857 

Qy 865 WLSDWVRSEYREPGIARCAGPGEMADKLLLTTPSKRFTCQGPVDVNILARCNPCLSNPCK 924 

I llhl M llllllll I :| 111:1:111 I hi I MMIM I lh 
Db 858 WFSDWIRLDYVEPGIARCAEPEQMRDKLILSTPSSSFVCRGRVRNDILARCNACFEQPCQ 917 

Qy 925 NDGTCNSDPVDFYRCTCPYGFRGQXDVPIHACISNPCKHGGTCHLREGEEDGFWCICAD 984 

I hi I : I I I : I : I : I II llh: II : II I I II 
Db 918 NQAQCVALPQREYQCLCQPGYHGRHCEFMIDACYGNPCRNNATCTVL--EEGRFSCQCAP 975 

Qy 985 GFEGENCEVNVDDC-EDNDCENNSTCVDGINNYTCLCPPEYTGELCEEKLDFCAQDLNPC 1043 

I: I II hill : hlhlhlh MM ::|| |: |: ||: : III 
Db 976 GYTGARCETNIDDCLGEIRCQNNATCIDGVESYKCECQPGFSGEFCDTKIQFCSPEFNPC 1035 

Qy 1044 QHDSRCILTPRGFRCDCTPGYVGEHCDIDFDDCQDNRCRNGAHCTDAVNGYTCICPEGYS 1103 

: MM M : ||| h I :| : IMM: Mil I I :| I I lh h 
Db 1036 ANGARCMDHFTHYSCDCQAGFHGTNCTDNIDDCQHHMCQNGGTCVDGINDYQCRCPDDYT 1095 

Qy 1104 GLFCEFSP- -PMVLPRTSPCDNFDCQNGA- -QCIVRINEPICQCLPGYQGEKCERLVSVN 1159 

I :|| M Mill j :|::| I : :: MM IN |: II I |:: 
Db 1096 GRYCEGHNMISMMYPQTSPCQNHECKHGVCFQPNAQGSDYLCRCHPGYTGKWCEYLTSIS 1155 

Qy 1160 FINRESYLQIPSAKVRPQTNITLQIATDEDSGILLYKGDKDHIAVELYRGRVRASYDTGS 1219 

h: h::: : lh hh :: I : 1 1 1 : 1 I hlllh 1 1 : 1 III h 
Db 1156 FVHNNSFVELEPLRTRPEANVTIVFSSAEQNGILMYDGQDAHLAVELFNGRIRVSYDVGN 1215

Qy 1220 HPASAIYSVETINDGNFHIVELLALDQSLSLSVDGGNPKIITNLSKQSTLNFDSPLYVGG 1279

II MM; || :| UNI: :: :| II I : I I I :|:::|| 

Db 1216 HPVSTMYSFEMVADGKYHAVELLAIRKNFTLRVDRGLARSIINEGSNDYLKLTTPMFLGG 1275 

Qy 1280 MPGKSNVASLROAPGQNGTSFHGCIRNLYINSELQDFQKVPMQTGILPGCEPCHKKVCAH 1339 

:l : : :l III lh: -II :| II II III 

Db 1276 LPVDPAQQAYKNWQIRNLTSFKGCMKEVWINHKLVDFGNAQRQQKITPGC ALLE 1329 

Qy 1340 G1CQPSSQAGFICECQEGWMG-PLCDQRTNDPCLGNKCVHGT-CLP-INA-FSYSCKCL 1394 

M : :: :| I : Ml! ||| |: |:| | I III 

Db 1330 GEQQEEE DDEQDFMDETPHIKEEPVDPCLENKCRRGSRCVPNSNARDGYQCKCK 1383 



Qy 1395 EGHGGVLCDEEEDLFNPCQAIKCKHGKCRLSGLGQPYCECSSGYTGDSCDREISCRGERI 1454 

I I II: I I I :| :|| |:: 

Db 1384 HGQRGRYCDQGEGSTEP PTVTAAS TCRKEQV 1414



Qy 1455 RDYYQKQQGYAACQTTKKVSRLECRGGCAGGQCCGPLRSKRRKYSFECTDGSSFVDEVEK 1514

IHI : I:: : : :| III I III :||| I:: :: :: 
Db 1415 REYYTEND----CRSRQPLKYAKCVGGC-GNQCCAAKIVRRRKVRMVCSNNRKYIKNLDI 1469



Qy 1515 WKCGCTR 1522 

I Hill: 
Db 1470 VRKCGCTK 1477 



RESULT 5 
B36665 

slit protein 2 precursor - fruit fly (Drosophila melanogaster)

C;Species: Drosophila melanogaster
C;Date: 30-Apr-1991 #sequence_revision 30-Apr-1991 #text_change 19-May-2000

C;Accession: B36665

R;Rothberg, J.M.; Jacobs, J.R.; Goodman, C.S.; Artavanis-Tsakonas, S.
Genes Dev. 4, 2169-2187, 1990
A;Title: slit: an extracellular protein necessary for development of midline glia and cc
A;Reference number: A36665; MUID:91099665
A;Accession: B36665
A;Status: preliminary
A;Molecule type: mRNA
A;Residues: 1-1469 <ROT>
A;Cross-references: GB:X53959
C;Genetics:
A;Gene: FlyBase:sli

A;Cross-references: FlyBase:FBgn0003425
C;Superfamily: unassigned EGF-related proteins; EGF homology; leucine-rich alpha-2-glycc

F;66-91/Domain: proteoglycan amino-terminal homology <PAH1>
F;101-124/Domain: leucine-rich alpha-2-glycoprotein repeat homology <LRR1>
F;125-148/Domain: leucine-rich alpha-2-glycoprotein repeat homology <LRR2>
F;149-172/Domain: leucine-rich alpha-2-glycoprotein repeat homology <LRR3>
F;173-196/Domain: leucine-rich alpha-2-glycoprotein repeat homology <LRR4>
F;197-220/Domain: leucine-rich alpha-2-glycoprotein repeat homology <LRR5>
F;228-272/Domain: proteoglycan carboxyl-terminal homology <PCS1>
F;288-313/Domain: proteoglycan amino-terminal homology <PAH2>
F;323-346/Domain: leucine-rich alpha-2-glycoprotein repeat homology <LRR6>
F;347-370/Domain: leucine-rich alpha-2-glycoprotein repeat homology <LRR7>
F;371-394/Domain: leucine-rich alpha-2-glycoprotein repeat homology <LRR8>
F;395-418/Domain: leucine-rich alpha-2-glycoprotein repeat homology <LRR9>
F;419-442/Domain: leucine-rich alpha-2-glycoprotein repeat homology <LR10>
F;450-494/Domain: proteoglycan carboxyl-terminal homology <PCS2>
F;512-537/Domain: proteoglycan amino-terminal homology <PAH3>
F;547-571/Domain: leucine-rich alpha-2-glycoprotein repeat homology <LR11>
F;572-595/Domain: leucine-rich alpha-2-glycoprotein repeat homology <LR12>
F;596-619/Domain: leucine-rich alpha-2-glycoprotein repeat homology <LR13>
F;620-643/Domain: leucine-rich alpha-2-glycoprotein repeat homology <LR14>
F;651-695/Domain: proteoglycan carboxyl-terminal homology <PCS3>
F;708-733/Domain: proteoglycan amino-terminal homology <PAH4>
F;743-766/Domain: leucine-rich alpha-2-glycoprotein repeat homology <LR15>
F;767-790/Domain: leucine-rich alpha-2-glycoprotein repeat homology <LR16>
F;846-890/Domain: proteoglycan carboxyl-terminal homology <PCS4>
F;1028-1061/Domain: EGF homology <EGF>
F;1068-1099/Domain: EGF homology <EGF2>
F;1115-1148/Domain: EGF homology <EGF1>



Query Match 41.8%; Score 3475.5; DB 2; Length 1469;

Best Local Similarity 43.5%; Pred. No. 1.2e-195;

Matches 656; Conservative 274; Mismatches 451; Indels 127; Gaps 

Qy 28 CPAQCSCSGST.VDCHGLALRSVPRNIPRNTERLDLNGNNITRITKTDFAGLRHLRVLQLM 87 

II MM -I! I llll I : |||:| Ml: | :||| | ||:||| 
Db 73 CPRVCSCTGLSVDCSHRGLTSVPRKISADVERLELQGNNLTVIYETDFQRLTKLRMLQLT 132 

Qy 88 ENKISTIERGAFQDLKELERLRLNRNHLQLFPELLFLGTAKLYRLDLSENQIQAIPRKAF 147 

:|:| llll :llll II |||:| I I : |: I 

Db 133 DNQIHTIERNSFQDLVSLE RLDISNNVITTVGRRVF 168 

Qy 148 RGAVDIKNLQLDYNQISCIEDGAFRALRDLEVLTLNNNNITRLSVASFNHMPKLRTFRLH 207 

HI :::||:H IIIH::: lh I HIHIIIIIIH I I : HI II 
Db 169 KGAQSLRSLQLDNNQITCLDEHAFKGLVELEILTLNNNNLTSLPHNIFGGLGRLRALRLS 228 

Qy 208 SNNLYCDCHLAWLSDWLRKRPRVGLYTQCMGPSHLRGHNVAEVQKREFVCSDEEEGHQSF 267 

I llllh'lll HI I: IIH II IH III:: HI II I 

Db 229 DNPFACDCHLSWLSRFLRSATRLAPYTRCQSPSQLKGQNVADLHDQEFKCSGLTE 283 

Qy 268 MAP-SCSVLH-'CPAACTCSNNIVDCRGKGLTEIPTNLPETITEIRLEQNTIKVIPPGAFS 325 

II I : Jl I h: Mill I II H II: IMIIII I HI HI 

Db 284 HAPMECGAENSCPHPCRCADGIVDCREKSLTSVPVTLPDDTTDVRLEQNFITELPPKSFS 343 

Qy 326 PYKKLRRIDLSNNQISELAPDAFQGLRSLNSLVLYGNKITELPKSLFEGLFSLQLLLLNA 385 

"HIIIIIIII II H II l|: I HIIIIIII HI HHI IIHIIIII 
Db 344 SFRRLRRIDLSNNNISRIAHDALSGLKQLTTLVLYGNKIKDLPSGVFKGLGSLRLLLLNA 403 

Qy 386 NKINCLRVDAFQDLHNLNLLSLYDNKLQTIAKGTFSPLRAIQTMHLAQNPFICDCHLKWL 445 

IHHH llhllhhlllllll H::| III : : : : H H 1 1 H 1 1 1 1 1 1 H H I 
Db 404 NEISCIRKDAFRDLHSLSLLSLYDNNIQSLANGTFDAMKSMKTVHLAKNPFICDCNLRWL 463 

Qy 446 ADYLHTSPIETSGARCTSPRRLANKRIGQIKSKKFRCSGTEDYRSKLSGDCFADLACPEK 505 

lllll lllll'lllll IIH: HI :: HIHI I I IIIIH I II 
Db 464 ADYLHKNPIErSGARCESPKRMHRRRIESLREEKFKCSWGE-LRMKLSGECRMDSDCPAM 522 

Qy 506 CRCEGTTVDCSNQKLNKIPEHIPQYTAELRLNNNEFTVLEATGIFKKLPQLRKINFSNNK 565 

I IHIIH:. :H HI II H II IIHI : : IH HI I I: |: 

Db 523 CHCEGTTVDCTGRRLKEIPRDIPLHTTELLLNDNELGRISSDGLFGRLPHLVKLELKRNQ 582 

Qy 566 ITDIEEGAFEGASGVNEILLTSNRLENVQHKMFKGLESLKTLMLRSNRITCVGNDSFIGL 625 

H || HUH : |: | ::: : :||| || I 
Db 583 LTGIEPNAFE-3ASHIQELQLGENKIKEISNKMFLGLHQLKT 623 

Qy 626 SSVRLLSLYDNQITTVAPGAFDTLHSLSTLNLLANPFNCNCYLAWLGEWLRKKRIVTGNP 685 

IHIIHI: I IIH: |:||::||| HIIIIIIHII I HII : I 
Db 624 LNLYDNQISCVKPGSFEHLNSLTSLNLASNPFNCNCHLAWFAECVRKKSLNGGAA 678 

Qy 686 RCQKPYFLKEiPIQDVAIQDFTCDDGNDDNSCSPLSRCPTECTCLDTWRCSNKGLKVLP 745 

II I IH: :|| I: I II III III II II :| 

Db 679 RCGAPSKVRDVQIRDLPHSEFKCSSENSE-GCLGDGYCPPSCTCTGTWACSRNQLKEIP 737 

Qy 746 KGIPRDVTELYLDGNQFTLVPKE-LSNYKHLTLIDLSNNRISTLSNQSFSNMTQLLTLIL 804 

:!!! : HHI: |: : | : : : || H||||:|: III -A'A^-A III: 
Db 738 RGIPAETSELYLESNEIEQIHYERIRHLRSLTRLDLSNNQITILSNYTFANLTKLSTLII 797 

Qy 805 SYNRLRCIPPRTFDGLKSLRLLSLHGNDISWPEGAFNDLSALSHLAIGANPLYCDCNMQ 864 

IIIHH: ' |] : 1 1 : : 1 1 M I IhHIIH II : 1 : 1 : 1 : ! : 1 1 1 1 1 1 1 :: 
Db 798 SYNKLQCLQRHALSGLNNLRWSLHGNRISMLPEGSFEDLKSLTHIALGSNPLYCDCGLK 857 

Qy 865 WLSDWVKSEYKEPGIARCAGPGEMADKLLLTTPSKKFTCQGPVDVNILAKCNPCLSNPCK 924 

I IIIH HHIIIIIII I H IHHJ I IH I HIIMI I lh 
Db 858 WFSDWIKLDYVEPGIARCAEPEQMKDKLILSTPSSSFVCRGRVRNDILAKCNACFEQPCQ 917 

Qy 925 NDGTCNSDPVDFYRCTCPYGFKGQDCDVPIHACISNPCKHGGTCHLKEGEEDGFWCICAD 984 

I I : I ' IH I I: I: I: I II III:: II : II I I II 
Db 918 NQAQCVALPQREYQCLCQPGYHGKHCEFMIDACYGNPCRNNATCTVL--EEGRFSCQCAP 975 

Qy 985 GFEGENCEVNVDDC-EDNDCENNSTCVDGINNYTCLCPPEYTGELCEEKLDFCAQDLNPC 1043

I: I II 1:111 : IHIHIHI: HIM ::|| I: I: M : III 
Db 976 GYTGARCETNIDDCLGEIKCQNNATCIDGVESYKCECQPGFSGEFCDTKIQFCSPEFNPC 1035

Qy 1044 QHDSKCILTPKGFKCDCTPGYVGEHCDIDFDDCQDNKCKNGAHCTDAVNGYTCICPEGYS 1103

: :lh : III h I :| : Mil:: 1 : 1 1 I I :| I I II: I: 
Db 1036 ANGAKCMDHFTHYSCDCQAGFHGTNCTDNIDDCQNHMCQNGGICVDGINDYQCRCPDDYT 1095 

Qy 1104 GLFCEFSP- -PMVLPRTSPCDNFDCQNGA- -QCIVRINEPICQCLPGYQGERCEKLVSVN 1159 

I :H I: Mill I :|::| I : :: :|:| II |: II I |:: 

Db 1096 GKYCEGHNMISMMYPQTSPCQNHECKHGVCFQPNAQGSDYLCRCHPGYTGSWCEYLTSIS 1155 

Qy 1160 FINKESYLQIPSAKVRPQTNITLQIATDEDSGILLYKGDKDHIAVELYRGRVRASYDTGS 1219 

I:: I:::: : II: hh :: I :ll|:| I hllll: l|:| III |: 
Db 1156 FVHNNSFVELEPLRTRPEANVTIVFSSAEQNGILMYDGQDAHLAVELFNGRIRVSYDVGN 1215 

Qy 1220 HPASAIYSVETINDGNFHIVELLALDQSLSLSVDGGNPKIITNLSKQSTLNFDSPLYVGG 1279 

II I :H I : II :l Mill: :: :| II I : I I I :|:::|| 

Db 1216 HPVSTMYSFEMVADGKYHAVELLAIKKNFTLRVDRGLARSIINEGSNDYLKLTTPMFLGG 1275 

Qy 1280 MPGKSNVASLRQAPGQNGTSFHGCIRNLYISSELQDFQKVPMQTGILPGCEPCHKKVCAH 1339 

:| : : :l III II:: ::ll :| II II III 
Db 1276 LPVDPAQQAYRNWQIRNLTSFKGCMREVWINHKLVDFGNAQRQQKITPGC ALLE 1329 



Qy 1340 GTCQPSSQAGFTCECQEGWMG-PLCDQRTNDPCLGNKCVHGT-CLP-INA-FSYSCKCL 1394
II : :: :| I : II III |: |:| II I III 

Db 1330 GEQQEEE DDEQDFMDETPHIKEEPVDPCLENKCRRGSRCVPNSNARDGYQCKCK 1383



Qy 1395 EGHGGVLCDEEEDLFNPCQAIKCKHGKCRLSGLGQPYCECSSGYTGDSCDREISCRGERI 1454

I I II: :li h:. 

Db 1384 HGQRGRYCDQAAS TCRKEQV 1403

Qy 1455 RDYYQKQQGYAACQTTKKVSRLECRGGCAGGQCCGPLRSKRRKYSFECTDGSSFVDEVER 1514

hll : I:: : : :| III I III :||| I:: :: :: 
Db 1404 REYYTEND----CRSRQPLKYAKCVGGC-GNQCCAAKIVRRRKVRMVCSNNRKYIKNLDI 1458

Qy 1515 WKCGCTR 1522
I Mill: 
Db 1459 VRKCGCTK 1466



RESULT 6 
T34555 

hypothetical protein DKFZp434N0435.1 - human
C;Species: Homo sapiens (man)

C;Date: 29-Oct-1999 #sequence_revision 29-Oct-1999 #text_change 29-Oct-1999
C;Accession: T34555

R;Poustka, A.; Wellenreuther, R.; Mewes, H.W.; Gassenhuber, J.; Wiemann, S.

submitted to the Protein Sequence Database, October 1999

A;Reference number: Z21540

A;Accession: T34555

A;Status: preliminary

A;Molecule type: mRNA

A;Residues: 1-333 <POU>
A;Cross-references: EMBL:AL122074
A;Experimental source: adult testis; clone DKFZp434N0435
C;Genetics:

A;Note: DKFZp434N0435.1



Query Match 15.8%; Score 1318; DB 2; Length 333;

Best Local Similarity 76.1%; Pred. No. 4.3e-70;

Matches 245; Conservative 37; Mismatches 40; Indels 0; Gaps 

Qy 334 DLSNNQISELAPDAFQGLRSLNSLVLYGNKITELPKSLFEGLFSLQLLLLNANKINCLRV 393 

I: llll::|||||ll|:|| lllllllllll: I Ihll 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 
Db 12 DISKNQISDIAPDAFQGLKSLTSLVLYGNKITEIAKGLFDGLVSLQLLLLNANKINCLRV 71 

Qy 394 DAFQDLHNLNLLSLYDNRLQTIAKGTFSPLRAIQTMHLAQNPFICDCHLKWLADYLHTNP 453 

: Mil 111111111111111:11 I:||::|l|:|llllll:llllllllllll II 
Db 72 mFQDI^NLNLLSLYDNKLQTISKGLFAPLQSIQTLHLAQNPFVCDCHLKWLADYLQDNP 131 

Qy 454 IETSGARCTSPRRLANKRIGQIKSKKFRCSGTEDYRSKLSGDCFADLACPEKCRCEGITV 513 

lllllllhllllllllll lllllllllhlllll: I :|| || IHIIIIIII | 
Db 132 IETSGARCSSPRRLANKRISQIRSRKFRCSGSEDYRSRFSSECFMDLVCPEKCRCEGTIV 191 

Qy 514 DCSNQRLNRIPEHIPQYTAELRLNNNEFTVLEATGIFRRLPQLRKINFSNMRITDIEEGA 573 



Mill! :|J |:|:| :||||:|| : 1 1 1 1 1 1 1 1 1 1 1 1 Mill Mill :: III 
Db 192 DCSNQKLVRIPSHLPEYVTDLRLNDNEVSVLEATGIFRKLPNLRKINLSNNRIKEVREGA 251 

Qy 574 FEGASGVNEILLTSNRLENVQHRMFKGLESLKTLMLRSNRITCVGNDSFIGLSSVRLLSL 633 

1:11: I l::M Mil I ::MII M 1 1 1 1 1 1 1 I II IMI IMMIIMI 
Db 252 FDGAASVQELMLTGNQLETVHGRVFRGLSGLKTLMLRSNLIGCVSNDTFAGLSSVRLLSL 311 

Qy 634 YDNQITTVAPGAFDTLHSLSTL 655 

llhllM MM II IIIM 
Db 312 YDNRITTITPGAFTTLVSLSTM 333 



RESULT 7 
A31640 

epidermal growth factor-like protein slit - fruit fly (Drosophila melanogaster) (free 
C;Species: Drosophila melanogaster

C;Date: 28-Feb-1990 #sequence_revision 28-Feb-1990 #text_change 11-Jan-2000
C;Accession: A31640

R;Rothberg, J.M.; Hartley, D.A.; Walther, Z.; Artavanis-Tsakonas, S.
Cell 55, 1047-1059, 1988

A;Title: slit: An EGF-homologous locus of D. melanogaster involved in the development
A;Reference number: A31640; MUID: 89077533
A;Accession: A31640
A;Molecule type: DNA
A;Residues: 1-530 <ROT>

A;Cross-references: GB:M23543; NID:g340939; PID:g514357
C;Genetics:
A;Gene: FlyBase:sli

A;Cross-references: FlyBase:FBgn0003425
A;Introns: 470/3

C;Superfamily: unassigned EGF-related proteins; EGF homology
C;Keywords: growth factor
F;148-181/Domain: EGF homology <EGF>
F;188-219/Domain: EGF homology <EGF2>
F;235-268/Domain: EGF homology <EGF1>



Query Match 13.4%; Score 1115.5; DB 2; Length 530;

Best Local Similarity 39.0%; Pred. No. 5.5e-58;

Matches 208; Conservative 102; Mismatches 199; Indels 25; Gaps 11; 

Qy 888 MADKLLLTTPKKKFTCQGPVDVNILAKCNPCLSNPCKNDGTCNSDPVDFYRCTCPYGFKG 947 

I IIMhlll I Ml I :llllll I IMI I : I Ml I M I 
Db 1 MKDRLILSTPSSSFVCRGRVRNDILAKCNACFEQPCQNQAQCVALPQREYQCLCQPGYHG 60 

Qy 948 QDCDVPIHACISNPCKHGGTCHLKEGEEDGFWCICADGFEGENCEVNVDDC-EDNDCENN 1006

: M I II'. IIM: II : III I II M I II Mill : Mil 
Db 61 RHCEFMIDACYGNPCRNNATCTVL--EEGRFSCQCAPGYTGARCETNIDDCLGEIKCQNN 118 

Qy 1007 STCVDGINNYTCLCPPEYTGELCEEKLDFCAQDLNPCQHDSKCILTPKGFKCDCTPGYVG 1066

: 1 1 : 1 1 : ' : I - 1 I I ::ll M M IM : III : :IM : III M I 
Db 119 ATCIDGVESYKCECQPGFSGEFCDTKIQFCSPEFNPCANGAKCMDHFTHYSCDCQAGFHG 178 

Qy 1067 EHCDIDFDDCQDNKCKNGAHCTDAVNGYTCICPEGYSGLFCEFSP--PMVLPRTSPCDNF 1124 

:| : lllf:: Mil II :| I I IM Ml HI M Mllll I 
Db 179 TNCTDNIDDCQNHMCQNGGTCVDGINDYQCRCPDDYTGKYCEGHNMISMMYPQTSPCQNH 238 

Qy 1125 DCQNGA--QCIVRINEPICQCLPGYQGEKCERLVSVNFINRESYLQIPSAKVRPQTNITL 1182 
:M:| I : :: :|:| III M II I MM:: I:::: : IM MM 
Db 239 ECKHGVCFQPNAQGSDYLCRCHPGYTGKWCEYLTSISFVHNNSFVELEPLRTRPEANVTI 298



Qy 1183 QIATDEDSGILLYKGDRDHIAVELYRGRVRASYDTGSHPASAIYSVETINDGNFHIVELL 1242

:: : :lll:l I MIIIM IMI III Mil Mil: II :l IMI 
Db 299 VFSSGQ-NGILMYDGQDAHLAVELFNGRIRVSYDVGNHPVSTMYSFEMVADGKYHAVELL 357

Qy 1243 ALDQSLSLSVDGGNPKIITNLSRQSTLNFDSPLYVGGMPGKSNVASLRQAPGQNGTSFHG 1302

M :: :| II I : I I :|:::IMI : : :l III I 

Db 358 AIKKNFTLRVDRGLARSIIHEGSNDYLKLTTPMFLGGLPVDPAQQAYKNWQIRNLTSFKG 417

Qy 1303 CIRNLYINSELQDFQKVPMQTGILPGCEPCHKKVCAHGTCQPSSQAGFTCECQEGWMG-- 1360

M: ::ll Ml I I III I I : :: :| 

Db 418 CMKEWINHKIVDFGNAQRQQKITPGC ALLEGEQQEEE DDEQDFMDET 465
Qy 1361 PLCDQRTNDPCLGNKCVHGT-CLP-INA-FSYSCKCLEGHGGVLCDEEEDLFNP 1411

I : INI III I: hi II I III I I II: I I 
Db 466 PHIKEEPVDPCLENKCRRGSRCVPNSNARDGYQCKCRHGQRGRYCDQGEGSTEP 519



RESULT 8 
T22025 

hypothetical protein F40E10.4 - Caenorhabditis elegans (fragment) 
C;Species: Caenorhabditis elegans

C;Date: 15-Oct-1999 #sequence_revision 15-Oct-1999 #text_change 18-Feb-2000
C;Accession: T22025
R;Smye, R.

submitted to the EMBL Data Library, February 1996
A;Reference number: Z19503
A;Accession: T22025

A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: DNA

A;Residues: 1-601 <WIL>
A;Cross-references: EMBL:Z69792; PIDN:CAA93668.1; GSPDB:GN00028; CESP:F40E10.4
A;Experimental source: clone F40E10
C;Genetics:
A;Gene: CESP:F40E10.4
A;Map position: X



Query Match 12.1%; Score 1006; DB 2; Length 601; 

Best Local Similarity 31.3%; Pred. No. 1.7e-51; 

Matches 212; Conservative 109; Mismatches 243; Indels 114; Gaps 17; 

Qy 870 VKSEYKEPGIARCAGPGEMADKLLLTTPSKKFTCQGPVDVNILAKCNPCLSNPCKNDGIC 929 

:lh: I lllll I ::::llll :lll I : l|: ll::||||: I 
Db 1 IKSKFIEAGIARCEYPNTVSNQLLLTAQPYQFTCDSKVPTKLATRCDLCLNSPCKNNAIC 60 

Qy 930 NSDPVDFYRCTCPYGFKGQDCDVPIHACISNPCRHGGTCHLKEGEEDGFWCICADGFEGE 989 

: I I I II I I: I II ill : II I : ; II I III: 
Db 61 ETTSSRKYTCNCTPGFYGVHCENQIDACYGSPCLNNATC - - KVAQAGRFNCYCNKGFEGD 118 

Qy 990 NCEVNVDDCEDNDCENNSTCVD GINNYICLCPPEYTGELCEEKLD 1034 

II hill :: III III Ihl I II II h 11:11: 

Db 119 YCEKNIDDCVNSKCENGGKCVDLVRFCSEELKNFQSFQINSYRCDCPMEYEGKHCEDKLE 178 

Qy 1035 FCAQDLNPCQHDSRCILTPRGFRCDCTPGWGEHCDIDFDDCQDNRCRNGAHCTDAVNGY 1094 

= 1 : Mil::: III : I hlh I :|: : III:: :,hll I I : I 
Db 179 YCTKKLNPCENNGKCIPINGSYSCMCSPGFTGNNCETNIDDCKNVECQNGGSCVDGILSY 238 



Qy 1095 TCICPEGYSGLFCEFSPPMV---LPRTSPCDNFDCQNGAQCIVRINEP--ICQCLPGYQG 1149

hi Ihl :|| III: :| I I I :|: I hi h I 
Db 239 DCLCRPGYAGQYCEI-PPMMDMEYQKIDACQQSACGQG-ECVASQNSSDFTCRCHEGFSG 296



Qy 1150 EKCEKLVSVNFINRESYLQI-PSAKVRPQTNITLQIATDEDSGILLYRGDRDHIAVELYR 1208 

h: :ll 11:11:11 lh : I lllll II :: III 

Db 297 PSCDRQMSVGFKNPGAYLALDPLAS---DGTITMTLRTTSKIGILLYYGDDHFVSAELYD 353

Qy 1209 GRVRASYDTGSHPASAIYSVETINDGNFHIVELLALDQSLSLSVDGGNPRIITNLSKQST 1268 

III: I h III :|| :||| I : : :: I :| :|: I I 

Db 354 GRVKLVYYIGNFPASHMYSSVRVNDGLPHRISIRISERKCFLQIDRNPVQ1VENSGKSDQ 413 

Qy 1269 L--NFDSPLYVGGMPGRSNVASLRQAPGQNGTSFHGCIRNLYINSELQDFQKVPMQTGIL 1326 

I Ihlhl : : : h :| I III :: II :lh 

Db 414 LITKGKEMLYIGGLPIEKSQDARRRFHVKNSESLKGCISSIIIN EVPINL— 463 

Qy 1327 PGCEPCHKKVCAHGTCQPSSQAGFTCECQEGWMGPLCDQRTNDPCLGNKCVHGTCLPINA 1386 

II I : I I 

Db 464 QQA LENVNTEQSCSAI 479 

Qy 1387 FSYSCRCLEGHGGVLCDEEEDLFNPCQAIKCKHGKCRLSGLGQP--YCECSSGYTGDSCD 1444 

MM = 111 : I hi! ::h II 

Db 480 VNFCAGIDCGNGRCTNNALSPKGYMCQCDSHFSGEHCD 517

Qy 1445 -REISCRGERIRDYYQKQQGYAACQTTKKVSRLECRGGCAGGQ-CCGPLRSKRRKYSFEC 1502 

: II :: I :: : : h: :: II I I II II :: hll I 

Db 518 ERRIKCDKQKFRRHHIENE— -CRSVDRIRIAECNGYCGGEQNCCTAVKRKQRKVRMIC 573 



Qy 1503 TDGSSFVDEVEKWKCGC 1520

:h: : I; : :| I 
Db 574 KNGTTRISTVHIIRQCQC 591
RESULT 9
A24420

notch protein - fruit fly (Drosophila melanogaster)
N;Alternate names: neurogenic repetitive locus protein
C;Species: Drosophila melanogaster

C;Date: 10-Sep-1999 #sequence_revision 10-Sep-1999 #text_change 10-Sep-1999

C;Accession: A24420; A24768; S09358; A05267

R;Kidd, S.; Kelley, M.R.; Young, M.W.

Mol. Cell. Biol. 6, 3094-3108, 1986

A;Reference number: A24420; MUID: 87064624

A;Accession: A24420

A;Molecule type: DNA

A;Residues: 1-2703 <KID>

A;Cross-references: GB:K03508; NID:g157991; PIDN:AAA28725.1; PID:g157993

R;Wharton, R.A.; Johansen, K.M.; Xu, T.; Artavanis-Tsakonas, S.

Cell 43, 567-581, 1985 

A;Reference number: A24768; MUID: 86079539 

A;Accession: A24768 

A;Molecule type: mRNA 

A;Residues: 1-48/1', 50-118, 'R', 120-230/1', 232-256, 'N', 258-266, 'A', 268-872, 'R', 874-9 
A;Note: the authors translated the codon ATC for residue 49 as Thr, ATT for residue 2
R;Tautz, D.

Nucleic Acids Res. 17, 6463-6471, 1989

A;Title: Hypervariability of simple sequences as a general source for polymorphic "V
A;Reference number: S09358; MUID: 89385974
A;Accession: S09358
A;Molecule type: DNA 

A;Residues: 2505-2551/QQQQ', 2552-2576, 'E', 2578-2604 <TAU> 

R;Wharton, R.A.; Yedvobnick, B.; Finnerty, V.G.; Artavanis-Tsakonas, S. 

Cell 40, 55-62, 1985

A;Title: opa: a novel family of transcribed repeats shared by the Notch locus and ;
A;Reference number: A05267; MUID:85099329
A;Accession: A05267
A;Molecule type: DNA 

A;Residues: 2504-2576', 'E' , 2578-2611 <WHA2> 
C;Genetics:
A;Gene: notch; opa

A;Cross-references: FlyBase:FBgn0004647
A; Map position: 8. 96-^9 . 36 

A;Introns: 53/3; 84/3; 171/3; 240/3; 283/3; 2333/3; 2436/3; 2588/3 
C;Superfamily: notch protein; ankyrin repeat homology; EGF homology
C;Keywords: differentiation; tandem repeat; transmembrane protein
F;27-43/Domain: transmembrane #status predicted <TMM1>
F;297-328/Domain: EGF homology <EGX1>
F;530-561/Domain: EGF homology <EGF1>
F;568-599/Domain: EGF homology <EGF>
F;988-1019/Domain: EGF homology <EGX2>
F;1064-1095/Domain: EGF homology <EGF3>
F;1187-1218/Domain: EGF homology <EGX3>
F;1746-1762/Domain: transmembrane #status predicted <TMM2>
F;1950-1982/Domain: ankyrin repeat homology <AN1>
F;1983-2015/Domain: ankyrin repeat homology <AN2>
F;1988-2004/Domain: transmembrane #status predicted <TMM3>
F;2017-2049/Domain: ankyrin repeat homology <AN3>
F;2050-2082/Domain: ankyrin repeat homology <AN4>
F;2083-2115/Domain: ankyrin repeat homology <AN5>
F;2538-2568/Region: glutamine-rich

F;2538-2568/Domain: neurogenic repetitive element #status predicted <OPA>



Query Match 9.8%; Score 818; DB 1; Length 2703;

Best Local Similarity 29.3%; Pred. No. 1.1e-39;

Matches 262; Conservative 93; Mismatches 336; Indels 202; Gaps 50; 

Qy 707 TCDDGNDDNSCSPLSRCPTECT- ■ -CLDT WRCSNKG LKVLPKG I PRDVT ELY LDGNQFT 763 

II II Ml III I I I I h : I I I 
Db 307 TCIDGISDYTC— -RCPPNFTGRFCQDDVDECAQRDHPVCQNGATCTNTH 353
Qy 764 LVPKELSNYRHLTL IDLSNNRISTLSNQSFSNMTQLLTLILSYNRLRCIPPRTFD 818

I ;| : : :| III I I | 
Db 354 GSYSCICVNGWAGLDCSNNTDDCKQAACFYGAT CI D 389 

Qy 819 GLRSLRLLSLHGNDISWPEGAFNDLSALSHL* 'AIGANPLYCD- "CNMQWLSD — WV 870 

1:1 I I III :|| : II: :: 

Db 390 GVGSFYCQCTRGR TGLLCHLDDACTSNPCHADAICDTSPINGSYACSC 437 

Qy 871 KSEYR EPGI ARC - - AGPGEMADKLLLTTPSKRFTCQ GP - VDVNILARCNPCL 919 

: II jl I II : I I : I II : II II 
Db 438 ATGYKGVDCSEDIDECDQGSPCEHNGICVNTPGSYRCNCSQGFTGPRCETNI — NECE 493 

Qy 920 SNPCKNDGTCN^DPVDFYRCTCPYGFKGQDCDVPIHACISNPCKHGGTCHLKEGEEDGFW 979 

l:M:|:|:| II I II I III |:: I I |||| : |||| | :|| 
Db 494 SHPCQNEGSCLDDPGTF-RCVCMPGFTGTQCEIDIDECQSNPCLNDGTCHDK— INGFK 549 

Qy 980 CICADGFEGENCEVNVDDCEDNDCENNSTCVDGINNYTCLCPPEYTGELCEEKLDFCAQD 1039 

I II IN h:|:lll: II III: III III II :: I I 
Db 550 CSCALGFTGARCQINIDDCQSQPCRNRGICHDSIAGYSCECPPGYTGTSCEININDC-D 607 

Qy 1040 LNPCQHDSKCILTPKGFKCDCIPGYVGEHCDIDFDDCQDNKCKNGAHCTDAVNGYTCICP 1099 

• III I Ml III I III I I ::|: I |: II II I I I 
Db 608 SNPC-HRGKCIDDVNSFKCLCDPGYTGYICQKQINECESNPCQFDGHCQDRVGSYYCQCQ 666

Qy 1100 EGYSGLFCEFSPPMVLPRTSPCDNFDCQNGAQCIVRINEPICQCLPGYQGEKCEKLVSVN 1159 
Mill: : I : I III II II 1 1 1 : 1 1 : M III 
, Db 667 AGTSGKNCEVN VNECHSNPCNNGATC IDG INSYKCQCVPG FTGQHCEK 714 

: Qy 1160 FINKESYLQIPSAKVRPQTNITLQIATDEDSGILLYKGDKDHIAVELYRGRVRASYDTGS 1219 
i I : : I I : I: :| II Mill: 
Db 715 --NVDECISSPCA NNGVCIDQVNG — YK CECPRG — FYD--A 748 

Qy 1220 HPASAI — YSVETINDGNFHIVELLALDQSLSLSVDGGNPKII--I--TNLSKQSTLNFD 1272 

M: I =1:1 II I I \ I: I: I 

Db 749 HCLSDVDECASNPCVNEGRCE DGINEFICHCPPGYTGKRCELDID 793 



•-SELQDFQ- 1317 



Qy 1273 SPLYVGG -MPGK SNVASLRQAPGQNG — TSFHGCIRNLYM - - 

:| II I I I : II I I: I: I ( :: :: 
Db 794 ECSSNPCQHGGTCYDKLNAFSCQCMPGYTGQKCETNIDDCVTNPCGNGGTCIDKVNGYKC 853 

Qy 1318 -KVPMQTGILPGCE— -PCHRKVCAH-GICQPSSQ-AGFTCECQEGWMGPLCDQRTND 1369 

ML II II II I: I III 1:1 I: |: I IM :: 
Db 854 VCRVPJ -TG--RDCESKMDPCASNRCKNEAKCTPSSNFLDFSCTCKLGYTGRYCDEDIDE 910 






Qy 1370 PCLGNKCVHG-TCLPINAFSYSCKCLEGHGGVLCDEEEDLFNPCQAIKCKHGKCRLSGLG 1428

I : r J M :ll : II I I :|: II I I : |::| I hi 
Db 911 CSLSSPCRNGASCLNVPG-SYRCLCTKGYEGRDCAINTD---DCASFPCQNGGTCLDGIG 966
I 

1429 QPYCECSSGYTGDSCDREIS CR "GERI RDYYQKQQ GYAA- -CQTTKKVS 1474 

, I: I jh M: h I I |:: III : 

967 DYSCLqVDGFDGKBCETDINECLSQPCQNGATCSQYVNSYTCTCPLGFSGINCOTNDE- 1024 



Qy 1475 RLEC-RGGCA-GGQCCGPLRSKRRKYSFECTDGSSFVDEVEKWKCGCTRCVS 1525

:| I II I : I: I I I : I: II M: 
Db 1025 --DCTESSCLNGGSCIDGING YNCSCLAGYSGANCQYKLKKCDSNPCLN 1071



RESULT 10 
A56136 

jagged protein precursor - rat
C;Species: Rattus norvegicus (Norway rat)

C;Date: 28-Apr-1995 #sequence_revision 28-Apr-1995 #text_change 11-Jan-2000
C;Accession: A56136

R;Lindsell, C.E.; Shawber, C.J.; Boulter, J.; Weinmaster, G.
Cell 80, 909-917, 1995

A;Title: Jagged: a mammalian ligand that activates Notch1.

A;Reference number: A56136; MUID: 95211842

A;Accession: A56136

A;Status: preliminary

A;Molecule type: mRNA

A;Residues: 1-1220 <LIN>

A;Cross-references: GB:L38483

C;Superfamily: unassigned EGF-related proteins; EGF homology



F;379-410/Domain: EGF homology <EGF1>
F;492-523/Domain: EGF homology <EGF> 
F;634-665/Domain: EGF homology <EGF2> 



Query Match 9.2%; Score 768; DB 2; Length 1220;

Best Local Similarity 25.4%; Pred. No. 3.4e-37; 

Matches 197; Conservative 101; Mismatches 246; Indels 232; Gaps 32; 

Qy 853 GANPLYCD CNMQWL — SDWVKSEYKEPGIARCAGPGEMADKLLLTT 896 

I MM ' II I ::l |:: II 
Db 257 GWQGLYCDKCIPHPGCVHGTCNEPWQCLCETNW GGQLCDK 296 

Qy 897 PSKKFTCQGPVDVNILAKCNPCLSNPCKNDGTCNSDPVDFYRCTCPYGFKGQDCDVPIHA 956 

1:1 III I III:: I |:|:|| |: I :|:: II 
Db 297 iDLNYCGTHQPCL NRGTCSNTGPDKYQCSCPEGYSGPNCEIAEHA 340 

Qy 957 CISNPCKHGGTCHLKEGEEDGFWCICADGFEGENCEVNVDDCEDNDCENNSTCVDGINNY 1016

1:1:11 : Ml II II I M -| : I I Mill Ml : II I : I : 
Db 341 CLSDPCHNRGSC-KE-TSSGFECECSPGWTGPTCSTNIDDCSPNNCSHGGTCQDLVNGF 397 

Qy 1017 TCLCPPEYTGELCEERLDFCAQ DLN 1041 

MIIM:|M M : I Ml 
Db 398 KCVCPPQWTGKTCQLDANECEAKPCVNARSCKNLIASYYCDCLPGWMGQNCDININDCLG 457 

Qy 1042 PCQHDSKCILTPRGFKCDCTPGYVGEHCDIDFDDCQDNKCKNGAHCTDAVNGYTCICPEG 1101 

IMM I M:| I III MIM I Ml II II II : :| : Mil I 
Db 458 QCQNDASCRDLVNGYRCICPPGYAGDHCERDIDECASNPCLNGGHCQNEINRFQCLCPTG 517 

Qy 1102 YSGLFCEFSPPMVLPRTSPCDNFDCQNGAQCIVRINEPICQCLPGYQGERCEKLVSVNFI 1161 

:M M ' M MUM I :: I: |:|: I | 
Db 518 FSGNLCQLD IDYCEPNPCQNGAQCYNRASDYFCRCPEDYEGKNCSHL 564 

Qy 1162 NKESYLQIPSAKVRPQTNITLQIATDEDSGILLYKGDKDHIAVELYRGRVRASYDTG- - - 1218 

: : : :| : : : I M :|: : : ::| 

Db 565 --KDHCRTTPCEVIDSCTVAM-ASNDTPEGV RYISSNVCGPHGKCKSESGGKF 614 

Qy 1219 — SHPASAIYSVETINDGNFHIVELLALDQSLSLSVDGGNP RIITNLSKQSTL 1269 

: : I I III III I :: : 

Db 615 TCDCNKGFTGTYCHEMNDCE GNPCTNGGTCIDGVNSYKCI 655 

Qy 1270 NFDSPLYVGGMPG---KSNVASLRQAPGQNGTSFHGCIRNLYIN 1310 

I I: I ::M III: : : I : 
Db 656 CSD GWEGAHCENNINDCSQNPCHYGGTCRDLVNDFYCDCRNGWKGRTCHSRDSQ 709 

Qy 1311 SELQDFQ KVPMQTGILPGCEPCHKKVCAHGTCQPS 1345

'MM : : II III III : 

Db 710 CDEATCNNGGTCYDEVDTFKCMCPGGWEGTTCNIARNSSCLP--NPCHN — GGTCWN 763 

Qy 1346 SQAGFTCECQEGWMGPLCDQRTNDPCLGNKCVH-GTCLPINAFSYSCKCLEGHGGVLCDE 1404 

: III Mill IMI I III I : I : MM : : I Ml I I I 
Db 764 GDS-FTCVCREGWEGPICTQNTND-CSPHPCYNSGTCVDGDNW-YRCECAPGFAGPDCRI 820 

Qy 1405 EEDLFNPCQAIKCRHGRCRLSGLGQPYCECSSGYTGDSCDREISCR GERIRDYYQ 1459 

I H: II:: II |::| I Ml I III: 
Db 821 N- - * IHECQSSPCAFGATCVDEINGYQCICPPGHSGARC • HEVSGRSCITMGRVILDGAK 876 

Qy 1460 RQQGYAACQTTRRVSRLECRGGCAGGQ-C — CGPLRSRRRRYSFECTDGSSFV 1509 

II • IMI III : I II :| I : 

Db 877 WDDDCNTCQ CLNGRVACSRWCGPRPCRLHRGHGECPNGQSCI 919 



RESULT 11 
A35844 

Mch protein - African clawed frog 

C;Species: Xenopus laevis (African clawed frog)

C;Date: 12-Oct-1990 #sequence_revision 12-Oct-1990 #text_change 13-Aug-1999

C;Accession: A35844

R;Coffman, C.; Harris, W.; Kintner, C.
Science 249, 1438-1441, 1990

A;Title: Xotch, the Xenopus homolog of Drosophila notch.
A;Reference number: A35844; MUID: 90385285
A;Accession: A35844
A;Status: preliminary; nucleic acid sequence not shown; not compared with conceptual tn
A;Molecule type: mRNA
A;Residues: 1-2524 <COF>
C;Superfamily: unassigned ankyrin repeat proteins; ankyrin repeat homology; EGF homology
C;Keywords: transmembrane protein
F;146-177/Domain: EGF homology <EGX1>
F;184-215/Domain: EGF homology <EGF1>
F;222-254/Domain: EGF homology <EGF>
F;456-487/Domain: EGF homology <EGX2>
F;757-788/Domain: EGF homology <EGF3>
F;1025-1056/Domain: EGF homology <EGX3>
F;1924-1956/Domain: ankyrin repeat homology <AN1>
F;1957-1989/Domain: ankyrin repeat homology <AN2>
F;1991-2023/Domain: ankyrin repeat homology <AN3>
F;2024-2056/Domain: ankyrin repeat homology <AN4>
F;2057-2089/Domain: ankyrin repeat homology <AN5>



Query Match 9.1%; Score 759; DB 2; Length 2524;

Best Local Similarity 26.8%; Pred. No. 2.8e-36; 
Matches 240; Conservative 100; Mismatches 352; Indels 204; Gaps 

Qy 733 WRCSNKGLKVLPKGIP--RDVTELYLDGNQFTLVPKELSNYKHLTLIDLSNN 783 

I: II I II :|: h |:| : : I I : I I 

Db 8 VLLCS - - 'LPVLTQGLRPCTQTAEMCLNGGRCEMTPGG TGVCLCGNLYFGERC 57 

Qy 784 ---RISTLSNOSFSNMTQLLTLILSYNRLRCIPPRTFDGLKSLRLLSLHGNDISW— P 837 
I: II : I II I II I : I 

Db 58 QFPNPCTIKNQCMNFGT CEP VLQGNAIDFICHCP 91

Qy 838 EGAFNDLSALSHL--AIGANPLYCDCNMQESDWVKS--EYKEPGIARCAGPGEMADRLL 893
, I II I: : I II I : : I III II II I
Db 92 VG • FTDKVCLTPVDNACVNNP — CRNGGTCELLNSVTEYK — CRCP - PGWTGDSCQ 141

Qy 894 LTTPSKRFTCQG PVDVNILAKC NPCLSNPCKNDGICNSDPVD 935

II I :: : II II III! I I ::

Db 142 QADPCASNPCMGGKCLPFEIQYICKCPPGFHGATCKQDINECSQNPCKNGGQCINE-FG 200

Qy 936 FYRCTCPYGFRGQDCDVPIHACISNPCRHGGTCHLREGEEDGFWCICADGFEGENCEVNV 995

lllll I |::|| I I :|l :|lll :: :: : I I II hill |: 
Db 201 SYRCTCQNRFIGRNCDEPYVPCNPSPCLNGGIC-RQTDDTSYDCTCLPGFSGQNCEENI 258

Qy 996 DDCEDNDCENNSTCVDGINNYTCLCPPEYTGELCEEKLDFCAQDLNPCQHDSRCILTPKG 1055

III 1:1 I 11111:1 I I !l::ll: I I :| I I II: I I I 
Db 259 DDCPSNNCRNGGTCVDGVNTYNCQCPPDWTGQYCTEDVDECQLMPNACQNGGTCHNTYGG 318

Qy 1056 FRCDCTPGYVGEHCDIDFDDCQDNKCRNGAHCTDAVNGYTCICPEGYSGLFCEFSPPMVL 1115

: I I hill : III : Mill : I || I :|| I 
Db 319 YNCVCVNGWTGEDCSENIDDCANAACHSGATCHDRVASFYCECPHGRTGLLCHLD 373



Qy 1116 PRTSPCDNFDCQMGAQCIVR-IN-EPICQCLPGYQGEKCEKLVSVNFINKESYLQIPSAK 1173 

: I : I I: I :| : II I III I I I ::: I I : 
Db 374 • • -NACISNPCNEGSNCDTNPVNGRAICTCPPGYTGPACN NDVDECSLGANPCER 425 

Qy 1174 VRPQT NITLQ IATDEDSG ILLYKGDKDHIAVELYRGRVRAS YDTGSHPASAI YSVET IND 1233 

II I : 1:1 || 

Db 426 GGRCTNTLGSFQCNCPQG- • -YAGPRCEIDVN ECLSNPCQ ND 464 

Qy 1234 GNFRIVELLALDQ — SLSLSVDGG NPKI ITN- - LSKQSTLNFDS PLYVGGMP 1281 

II::: :: II : : I : II I 
Db 465 STCLDQIGEFQCICMPGYEGLYCETNIDECASNPCLHNGKCIDRINEFRCDCPTGFSGNL 524 

Qy 1282 GRSNVASLRQAPGQNGTSFHGCI - -RNLYINSELQDFQRVPMQTGI - - ■ LPGCEPCHRRV 1336 

: : I :|| I: I I : I : I :| :lll 
Db 525 CQHDFDECTSTPCRNGAK- • -CLDGPNSYTCQCTEGFTGRHCEQDINECIP- -DPCH — 576 

Qy 1337 CAHGTCQPSSQAGFTCECQEGWMGPLCDQRTNDPCLGNRCVHGTCLPINAFSYSCRCLEG 1396 

:|||: I III I: h I III |: II |::| I I I :| 

Db 577 --YGTCK-DGIATFTCLCRPGYTGRLCDNDINE-CLSKPCLNGGQCTDRENGYICTCPKG 632 

Qy 1397 HGGVLCDEEEDLFNPCQAI RCRHGRC - - RLSGLGQPYCECSSG YTGDSCDREIS 1448 

II:: | : | :||| :: | II ||ll h h 
Db 633 TTGVNCEIKID—DCASNLCDNGRCIDKIDGY— ECTCEPGYTGRLCNININECDSNP 686 



Qy 1449 CR-GERIRDYYQK QQGY AACQTTRKVSRLECRGGCA 1483 

Ml: II II : : I I : 

Db 687 CRNGGTCKDQINGFTCVCPDGYHDHMCLSEVNECNSNPCIHGACHDGVNGYRCDCEAGWS 746 

Qy 1484 GGQC C— GPLRSKRRKYSFECTDGSSFVDEVERWKCGCTRCVS 1525 

II .11: | | | | : : :| |:: 
Db 747 GSNCDMNECESNPCMNGGTCKDMTGAYICTCKAGFSGPNCQTNINECSSNPCLN 802 



RESULT 12 
T30201 

Notch homolog protein - sea squirt (Halocynthia roretzi)
C;Species: Halocynthia roretzi

C;Date: 02-Sep-2000 #sequence_revision 02-Sep-2000 #text_change 02-Sep-2000
C;Accession: T30201

R;Hori, S.; Saitoh, I.; Matsumoto, M.; Makabe, K.W.; Nishida, H.
Dev. Genes Evol. 207, 371-380, 1997

A;Title: Notch homologue from Halocynthia roretzi is preferentially expressed in the
A;Reference number: Z20775
A;Accession: T30201

A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: mRNA
A;Residues: 1-2352 <HOR>

A;Cross-references: EMBL:AB001327; NID:d1204472; PID:d1026501; PIDN:BAA25571.1
C;Genetics:
A;Gene: Notch



Query Match 9.1%; Score 757; DB 2; Length 2352;

Best Local Similarity 23.1%; Pred. No. 3.4e-36; 

Matches 305; Conservative 139; Mismatches 480; Indels 396; Gaps 

Qy 438 CDCHLRWLAD: YLHTNPIETSGARCTSPRRLANKRIGQIRSKRFRCSGTEDYRSK 491 

II : I I II ::|l I : I hh I I 

Db 96 CTCQTGFTGDTCSQVLYCSPNPC - SNGAGC EELSNSFKCTCTSGYYGD 142 

Qy 492 LSGDCFADLACPEKCRCEGTTVDCSNQKLNKIPEHIPQYTAELRLNNNEFTV 543 

: : I: I: II III : III: 

Db 143 TCANDVNECDTPDICQNAGT— CSNND-— GGYSCSCVAGFEGNNCEVNIDDCSGHSC 195 

Qy 544 LEATGIFKKLPQLRKINFSNN--KITDIE--EGAFEGASGVN 581 

III: : : : : III I I:: III 
Db 196 QNGATCADAV3TYDCHCPAEWTGQYCTI-DVDECSLSNNVAKRRDLQQTEGGF 247 

Qy 582 EILLTSNRL- ENVQH KMFRGLESLKTLMLRSNRITCVGNDSF 622 

I I : ■ II: : : : I III I :|: 

Db 248 — -TCNCVYGFTRDDCSENIDDCSNVACFHNARCIDQAGTFECLCTPGKRILCHLDDAC 303 

Qy 623 IGLSSVRUSIYDNQIT — TVAPGAF- - -DTLHSLSTLNLLANP F 662 

I I- : ; I II | : | : :| II I 
Db 304 ISDPCARGATCDTNPITGHWMCDCPDGWTDRDCSKDIDECSLGGNPCEHNGQCNNTDGSF 363 

Qy 663 NCNCYLAWLGEWLRKRRIVTGNPRCQRPYFLREIPIQDVAIQDFTCDDGNDDNSCSPLSR 722 

I I : I' III: : I I I : 

Db 364 ECICVAGYSGr PRCE ININECEP-NP 387 

Qy 723 CPTECTCLDTWRCSNKGLKVLP- -KGI ■ -PRDVTELYLD- -GNQFTLVPKELSNYR- • • 773 

I : llll ; I :| II I: I : I I : |:: I 
Db 388 CRNDATCLDMI ■ - -GNFNCVCMPGFTGIICDEDIDECESNPCANGGTCI - DEVNAYTCSC 443 

Qy 774 "HLTLIDLSMRISTLSNQSFSNMTQLLTLILSYNRLRCIPPRTFDGLRSLRLLSLHG- 830 

I I I' I I :: I : :l III :| 
Db 444 ALGFTGDDCSQN- IDECASTPCMNKATCIDRANAY • ECECAPGYT GVHCE 491 

Qy 831 - - -NDISWP- -EGAFNDLSALSHLAIGANPLYCDCNMQWLSDWVKSEYKEPGIARCAGP 885 

1 :| : HI: I II Mil : : —I : I 

Db 492 TNIDDCVINPCHYGSCRD G VNTFYCDC LLG YEGT KCQTDT NECASS PC ENG 542 

Qy 886 GEMADKLLLTTPSRKFTCQGPVDVN-ILARCNP--CLSNPCRNDGTCNSDPVDFYRCTCP 942 

I I:: ' :ll I : : II I: III: III I II I hi 
Db 543 GTCTDEI GYYTCTCPT GT SGSSCE INPDXVGNPCQY - GTC - VDGVDDY SCSCT 594 
Qy 943 YGFKGQ DCDVP I HAC I SNPCKHGGTC H 969

I: I: II h I llll :| II I 

Db 595 PGYTGEHCDTDINECDSNPCMNGATCQNEVNNFVCQCPPGIMGIQCSSDIQECSSNPCLH 654

Qy 970 LKEGEEDGFWCICADGFEGENCEVNVDDCEDN 1001 

: III |::||||| ::: I 

Db 655 EYARRDQHVHCICDAGYQGENCETEINECASNPCQHGACENKVAQFVSHCDAGYTGTACE 714 

Qy 1002 DCENNSTCVDGINNYTCLCPPEYTG EL CEEKLD 1034 

hi II 111:1 | || :||| II |:| I 

Db 715 IDIHECATQPCQNGGTCTSGINSYNCACPAKYTGVNCETELSPCVPNPCENGATCQESAD 774 

Qy 1035 F CAQDLH PCQHDSKCILTPKGFKCDCTPGYVGEHCDIDFD 1074 

: II hi II:: I l::l I: h I: II II 

Db 775 YLAYVCQCPEGFRGPTCATDINECVNSPCKNGGGCTNLVPGYQCTCSQGFTGKDCDTDID 834 

Qy 1075 DCQDNKCKNGAHCTDAVNGYTCICPEGYSGLFCEFSPPMVLPRTSPCDNFDCQNGAQCIV 1134 

II I I II I I I I hi I: I I: : I :| hll I 

Db 835 DCSSNPCLNGGQCLDDVGSYKCLCLPGFEGNNCQ EEVNECASFPCKNGGICTD 887 



Qy 1135 RINEPICQCLPGYQGEKCEKLV — SVNFINKESYLQ-IPSAKVRPQTNITLQIATDED 1189

:| :l II I: III : I : :| : : II I I 
Db 888 YVNSYVCTCLSGFYSLDCEKNIEDCSSSSCMNGGTCVDGINSYSCSCTANFT 939

Qy 1190 SGILLYKGDKDHIAV ELYRGRVRASYDTGSHPASAIYSVETINDGNFHIVELLA 1243

III II : I II: : :: II 
Db 940 GDKCQNAVNNCASLQCQNGGT ■ CYYDSGDPRCACVHG YT GIHCESLQN 986



Qy 1244 LDQSLSLSVDGGNPKI ITNLSKQSTLNFDSPLYVGGMPGKSNVASLRQAPGQNGTSFHG - 1302 

I :: :ll I I I I :|| I :: I III 

Db 987 LCTGPNICKNGG SCVQTSNTVSCNCLGGYEGTD'-CAVPQVSCTVGASLLGI 1036 

Qy 1303 CIRNLYINSELQDFQKVPMQTGILPG CEPCHKKVCAHG -TCQPSSQAGFTC 1352
: :| :| : : I :l hill : ::| 

Db 1037 AVSDLCLNGGTCHDTSTAHECSCVAGFTGSYCDIDIDECASVPCKNGATCNDLINS-YSC 1095 

Qy 1353 ECQEGWMGPLCDQRTNDPCLGNKCVH-GTCLP-INAFSYSCKCLEGHGGVLCDEEEDLFN 1410 

I I: I I I I : I : III: II Mill lllh II 

Db 1096 ICALGYEGATC-LTDKDECASSPCKNGGTCIDRIN-SFYCSCLAGTEGVLCEINED— 1149 

Qy 1411 PCQAIKCKHGKCRLSGLGQPYCECSSGYTGDSCDREIS CRGERIRDYYQKQQGY 1464 

I: I :| : hi hi III II::: I II 

Db 1150 ECEINICLNGGVCIDGIGGFSCQCPSGYEGRRCQGDVNECLSNPCSSPGSLACIQGSNSY 1209 

Qy 1465 AACQTTKKVSRLECR---GGCAGGQCCGPLRSKRRKYSFECTDGSSFVDEVEKWKCGCT 1521

I : lh I I I III I : II II 

Db 1210 -QCVCDADYTGSECQIRIGSCDINPCLN DGICTDNSQDITG — YKCQCT 1255 



RESULT 13
S18188

notch protein homolog - rat
C; Species: Rattus norvegicus (Norway rat) 

C;Date: 19-Feb-1994 #sequence_revision 10-Nov-1995 #text_change 13-Aug-1999
C;Accession: S18188

R;Weinmaster, G.; Roberts, V.J.; Lemke, G.
Development 113, 199-205, 1991

A;Title: A homolog of Drosophila Notch expressed during mammalian development.
A;Reference number: S18188; MUID: 92111383
A;Accession: S18188
A;Molecule type: mRNA
A;Residues: 1-2531 <WEI>

A;Cross-references: EMBL:X57405; NID:g57634; PID:g57635
C;Superfamily: unassigned ankyrin repeat proteins; ankyrin repeat homology; EGF homology
F;987-1018/Domain: EGF homology <EGF1>
F;1025-1056/Domain: EGF homology <EGF>
F;1233-1264/Domain: EGF homology <EGF2>
F;1917-1949/Domain: ankyrin repeat homology <AN1>
F;1950-1982/Domain: ankyrin repeat homology <AN2>
F;1984-2016/Domain: ankyrin repeat homology <AN3>
F;2017-2049/Domain: ankyrin repeat homology <AN4>
F;2050-2082/Domain: ankyrin repeat homology <AN5> 



Query Match 9.0%; Score 747.5; DB 2; Length 2531;

Best Local Similarity 23.4%; Pred. No. 1.3e-35;

Matches 229; Conservative 88; Mismatches 258; Indels 403; Gaps 39;



Qy 881 RCAGPGEMADKLLLTTPSKK FTC QGPVDVNILAKCNPCLS 920 

II I • hll I : I lh : II I lh 

Db 56 RCQDPSP"-:-CLSTPCKNAGTCYWDHGGIVDYACSCPLGFSGPLCLTPLA--NACLA 108 

Qy 921 NPCKNDGTCN* SDPV DFYRCTCPY 943 

111:1 II: :|| I I II 

Db 109 NPCRNGGTCDLLTLTEYKCRCPPGWSGKSCQQADPCASNPCANGGQCLPFESSYICGCPP 168 

Qy 944 GFKGQDCDVPIHACISNP--CKHGGTCHLKEGE 974 

llll I II hlllill : I 
Db 169 GFHGPTCRQDVNECSQNPGLCRHGGTCHNEIGSYRCACRATHTGPHCELPYVPCSPSPCQ 228 

Qy 975 EPGFWCICADGFEGENCEVNVDDCEDNDCENNSTCVDGINNYTCLCPPEYT 1025 

: I Ml hill Mill hhl MM: I I ||||:| 
Db 229 NGGTCRPTGDTTHECACLPGFAGQNCEENVDDCPGNNCKNGGACVDGVNTYNCRCPPEWT 288 

Qy 1026 GELCEEKLD-- 1034 

h II :| ■ 

Db 289 GQYCTEDVDECQLMPNACQNAGTCHNSHGGYNCVCVNGWTGEDCSDNIDDCASAACFQGA 348 

Qy 1035 - F 1035 

Db 349 TCHDRVASFYCECPHGRTGLLCHLNDACISNPCNEGSNCDTNPVNGKAICTCPRGYTGPA 408

Qy 1036 CAQDL --NPCQHDSKCILTPKGFKCDCTPGYVGEHCDIDFDDCQDNKCKNGAHCT 1088 

1 : 1 1 : MM lh I hi I II I hll ::| I hi I I 

Db 409 CSQDVDECALGANPCEHAGKCLNTLGSFECQCLQGYTGPRCEIDVNECISNPCQNDATCL 468 

Qy 1089 DAVNGYTCICPEGYSGLFCEFSPPMVLPRTSPCDNFDCQNGAQCIVRINEPICQCLPGYQ 1148 

I : : III'. II l::H : I I : I : :|: MM MM h 

Db 469 DQIGEFQCICMPGYEGVYCEIN TDECASSPCLHNGRCVDKINEFLCQCPKGFS 521 

Qy 1149 GEKC EKLVSVNFINKESYLQIPSAKVRPQTNITLQIATDEDSGILLYKGDKDHIAV 1204 

II ::' I I I h I : : I : : : I I : 

Db 522 GHLCQYDVDECASTPCRNGARCLDGPNT — YTCVCTEGYTGTHCEVDIDECDPDPCHI 577 

Qy 1205 ELYRGRVRASYDTGSHPASAI YSVET ■ IND GNFHIVELLA 1243 

1:11:: I : II lh 1 = :: I 

Db 578 GLCKDGV-ATFTCLCQPGYTGHHCETNINECHSQPCRHGGTCQDRDNYYLCLCLKGTTGP 636 

Qy 1244 LDQSLSLSVDGGNPKI ITNLSKQSTLNFD SPLYVGGMPGKSNVASLRQA 1292 

Mill III I llllh : 

Db 637 NCEINLDDCASNPCDSG TCLDK IDGYECACEPGYTGSM-CNVNIDECAGS 685 

Qy 1293 PGQNGTSFHGCIRNLY I KSELQDFQKVPMQTG I L PGC — EPCHKKV 1336 

II : ' Ml III: 

Db 686 PCHNGGT — - CEDGIAGFTCRCPEGYHDPTCLSEVNECNSNP 724 

Qy 1337 CAHGTCQPSSQAGFTCECQEGWMGPLCDQRTNDPCLGNKCVH-GTCLPINAFSYSCKCLE 1395 

I II I: ■ I: hi II I II I: I I lh III : : I I I I 
Db 725 CIHGACRDGLN-GYKCXAPGWSGTNCDINNNE-CESNPCVNGGTCKDMTS-GYVCTCRE 781 

Qy 1396 GHGGVLCDEE- - EDLFNPCQAIKCKH- 1419 

I I ; I : II lh 

Db 782 GFSGPNCQTNINECASNPCLNQGTCIDDVAGYKCNCPLPYTGATCEWLAPCATSPCRNS 841 

Qy 1420 GKCRLSGLGQPY -CECSSGYTGDSCDREIS CRGERIRDYYQKQQGYAACQTTKK 1472 

I I: I :': I I :h I :|: :|: II I hll I 

Db 842 GVCKESEDYESFSCVCPTGWQGQTCEIDINECVRSPCR-" HG-ASCQNTNG 889 



Qy 1473 VSRLECRGGCAGGQC CGPLRSKRRKYSFECTDG - -SSFVDEV EKV 1515 

I I: Ml M llll ::| M M 

Db 890 SYRCLCQAGYTGRNCESDIDDCRPNPCHN- • -GGSCTDGVNAAFCDCLPGFQGAFCEEDI 946 

Qy 1516 VKC GCTRCV 1524 

:| : II II 

Db 947 NECATNPCQNGANCTDCV 964 
RESULT 14
A46019

Notch-1 protein - mouse 
N;Alternate names: notch protein
C;Species: Mus musculus (house mouse)

C;Date: 22-Sep-1993 #sequence_revision 18-Nov-1994 #text_change 20-Sep-1999
C;Accession: A46019; S25144

R;del Amo, F.F.; Gendron-Maguire, M.; Swiatek, P.J.; Jenkins, N.A.; Copeland, N.G.; Grit
Genomics 15, 259-264, 1993

A;Title: Cloning, analysis, and chromosomal localization of Notch-1, a mouse homolog of
A;Reference number: A46019; MUID: 93194170
A;Accession: A46019

A;Status: not compared with conceptual translation
A;Molecule type: nucleic acid
A;Residues: 1-2531 <DEL>

A;Cross-references: GB:Z11886; GB:S47228; NID:g288502; PIDN:CAA77941.1; PID:g288503
A;Note: sequence extracted from NCBI backbone (NCBIP:127318)
R;Franco del Amo, P.; Smith, D.E.; Swiatek, P.J.; Gendron-Maguire, M.; Greenspan, R.J.;
submitted to the EMBL Data Library, April 1992
A;Description: Expression pattern of Motch, a mouse homolog of Drosophila Notch, suggest
A;Reference number: S25144
A;Accession: S25144
A;Molecule type: mRNA

A;Residues: 1551-2108,'Q',2110-2114,'ALP',2118-2170 <FRA>
A;Cross-references: EMBL:Z11886
C;Genetics:
A;Gene: notch-1
A;Map position: 2

A;Note: proximal region of chromosome 2
C;Superfamily: unassigned ankyrin repeat proteins; ankyrin repeat homology; EGF homology
F;106-138/Domain: EGF homology <EGF1> 
F;144-175/Domain: EGF homology <EG01> 
F;222-254/Domain: EGF homology <EGF2> 
F;261-292/Domain: EGF homology <EG02> 
F;339-370/Domain: EGF homology <EG03> 
F;416-449/Domain: EGF homology <EGF3> 
F;456-487/Domain: EGF homology <EG04> 
F;494-525/Domain: EGF homology <EG05> 
F;532-563/Domain: EGF homology <EG06> 
F;607-638/Domain: EGF homology <EG07> 
F;682-713/Domain: EGF homology <EG08> 
F;757-788/Domain: EGF homology <EG09>
F;795-826/Domain: EGF homology <EG10>
F;873-904/Domain: EGF homology <EG11>
F;911-942/Domain: EGF homology <EG12>
F;949-980/Domain: EGF homology <EG13>
F;987-1018/Domain: EGF homology <EG14>
F;1025-1056/Domain: EGF homology <EG15>
F;1063-1094/Domain: EGF homology <EG16>
F;1149-1180/Domain: EGF homology <EG17>
F;1187-1218/Domain: EGF homology <EG18>
F;1233-1264/Domain: EGF homology <EGF4>
F;1352-1383/Domain: EGF homology <EG19>
F;1391-1425/Domain: EGF homology <EGF>
F;1917-1948/Domain: ankyrin repeat homology <AN1>
F;1949-1981/Domain: ankyrin repeat homology <AN2>
F;1983-2015/Domain: ankyrin repeat homology <AN3>
F;2016-2048/Domain: ankyrin repeat homology <AN4>
F;2049-2081/Domain: ankyrin repeat homology <AN5> 



Query Match 9.0%; Score 745; DB 2; Length 2531; 

Best Local Similarity 24.4%; Pred. No. 1.8e-35;

Matches 238; Conservative 103; Mismatches 309; Indels 326; Gaps 46; 

Qy 662 FNCNCYLAWLGEWLRRKRIVTGNPRCQRPiFLKEIPIQDVAIQDFTCDDGNDDNSCSPLS 721 

Mil I I: II I I :| :| I 

Db 202 YRCACCATHTG PHCELPY VPCSPSPCQNG— ATCRPTG 237 

Qy 722 RCPTECTCLDTW RCSNKGLKVLPKGI PRDVTELiL"D 758 

II II I I I I I: Ml I 

Db 238 DTTHECACLPGFAGQNCEENVDDCPGNNCKNGGACV-DGVNTlNCRCPPEVTGQrCTED 295 

I 



Qy 759 GNQFTLVPKELSNYKHLTLIDLSNNRISTLSNQSFSNMTQLLTLILSYNRLRCIPPRTFD 818 

:: 1:1 1 I II 
Db 296 VDECQLMPNACQN AGTCHN 314 

Qy 819 GLKSLRLLSLHGNDISWPEGAFNDLSALSHLAIGANPLYCDCNMQWLSDWVKSEYKEPG 878 

I :| III: : 

Db 315 THGGYN CVCVNGWTGEDCSENIDDCA 340 

Qy 879 IARCAGPGEMADKLLLTTPSKKFTCQGPVD-VNILAKC-NPCLSNPCKNDGTCNSDPVDF 936 

I I I:: 11:1 :| : hllll |:::||: 

Db 341 SAACFQGATCHDRV ASFYCECPHGRTGLLCHLKHACISNPCNEGSNCDTNPVNG 394 

Qy 937 YR-CTCPYGFKG — QD CDVPIH 955 

I Mil I:.'l II . |:: :: 

Db 395 KRICTCPSGYTGPACSQDVDECDLGANRCEHAGKCLNTLGSFECQCLQGYTGPGCEIDVN 454 

Qy 956 ACISNPCKHGGTCHLKEGEEDGFWCICADGFEGENCEVNVDDCEDNDCENNSTCVDGINN 1015 
Mill::. I : II I III |:|| Ihl |:| : I :| |:| |: 
Db 455 ECISNPCQNDATCLDQIGE---FQCICMPGYEGVYCEINTDECASSPCLHNGHCMDKIHE 511




Qy 1016 YTCLCPPEYTGELCEEKLDFCAQDLNPCQHDSKCILTPKGFKCDCTPGYVGEHCDIDFDD 1075 

: I II : III: :l II II:: :ll: I : I II II ||::| |: 

Db 512 FQCQCPKGFNGHLCQYDVDECAS-TPCKNGAKCLDGPNTYTCVCTEGYTGTBCEVDIDE 569 

Qy 1076 CQDNKCKNGAHCTDAVNGYTCICPEGYSGLFCEFSPPMVLPRTSPCDNFDCQNGAQCIVR 1135 

I : I h'l I I :lhl 11:1 II : I : |::| I I 

Db 570 CDPDPCHYGS -CKDGVATFTCLCQPGYTGHHCE TNINECHSQPCRHGGTCQDR 621 

Qy 1136 INEPICQCLPGYQGEKCEKLVSVNFINKESYLQIPSAKVRPQTNITLQIATDEDSGILLY 1195 

I :l II I I II II : I HI I 

Db 622 DNSYLCLCLKGTTGPNCE INLDDCASNPC DSGTCLD 657 

Qy 1196 KGDKDHIAVEL-YRGRVRASYDTGSHPASAIYSVETINDG NFHIVELLAL 1244 

Nihil::: | :: | || :| 
Db 658 KIDGYECACEPGYTGSM-CNVNIDECAGSPCHNGGTCEDGIAGFTCRCPEGYH 709 

Qy 1245 DQSLSLSVDGGNPKIITNLSKQSTLNFDSPLYVGGMPGKSNVASLRQAPGQNGTSFHGCI 1304 

:| I II: : I :| I I III :ll 
Db 710 j-DP— TCLSEVNECN-SNPCIHGACRDGLNGYKCDCAPGWSGT 748 

Qy 1305 RNLY1NSELQDFQKVPMQTGILPGCEPCHKKVCAH-GTCQPSSQAGFTCECQEGWMGPLC 1363 

I II: II: III: : :|: I hlh II I 

Db 749 -NCDINN— • NECESNPCVNGGTCKDMT-SGYVCTCREGFSGPNC 788 

Qy 1364 DQRTNDPCLGNKCVH-GTCLPINAFSYSCKCLEGHGGVLCDEEEDLFNPCQAIKCKH-GK 1421 

I: I I I:: III: : I I I : I I I : II Ihl 

Db 789 QTNINE-CASNPCLNQGTCID-DVAGYRCNCPLPYTGATC— EWLAPCATSPCKNSGV 843 

Qy 1422 CRLSGLGQPY-CECSSGYTGDSCDREIS CRGERIRDYYQKQQGYAACQTTKKVS 1474 

1:1 : :il I :|: I :|: :|: II I |:|| I 

Db 844 CKESEDYESF3CVCPTGWQGQTCEVDINECVKSPCR HG-ASCQNTNGSY 891 

Qy 1475 RLECRGGCAGGQC CGPLRSKRRKYSFECTDG - -SSFVDEV EKWK 1517 

M:M' II I II I ::| I : I : : 

Db 892 RCLCQAGYTGSNCESDIDDCRPNPCHN- - -GGSCTDGINTAFCDCLPGFQGAFCEEDINE 948 



Qy 1518 C GCTRCV 1524 

I 'II II 

Db 949 CASNPCQNGANCTDCV 964 



RESULT 15 

A49175
Motch B protein - mouse (fragment)
N;Alternate names: Notch homolog
C;Species: Mus musculus (house mouse)
C;Date: 21-Jan-1994 #sequence_revision 05-Jan-1996 #text_change 20-Sep-1999
C;Accession: A49175; PH1570; S32113
R;Lardelli, M.; Lendahl, U.
Exp. Cell Res. 204, 364-372, 1993
A;Title: Motch A and Motch B--two mouse Notch homologues coexpressed in a wide variet
A;Reference number: A49175; MUID:93178563
A;Accession: A49175
A;Status: preliminary; nucleic acid sequence not shown
A;Molecule type: mRNA
A;Residues: 1-1203 <LAR>
A;Cross-references: EMBL:X68279; NID:g287989; PIDN:CAA48340.1; PID:g287990
A;Experimental source: embryo
A;Note: sequence extracted from NCBI backbone (NCBIP:126158)
C;Comment: This protein has many EGF repeats and lin-12/Notch repeats.
C;Comment: This protein is one of the neurogenic proteins controlling the decision betwe
C;Superfamily: unassigned ankyrin repeat proteins; ankyrin repeat homology; EGF homology
F;143-174/Domain: EGF homology <EGX1> 
F;482-513/Domain: EGF homology <EGF1> 
F;560-591/Domain: EGF homology <EGF> 
F;674-705/Domain: EGF homology <EGX2> 
F;712-743/Domain: EGF homology <EGF3> 
F;836-867/Domain: EGF homology <EGX3> 



Query Match 8.9%; Score 736; DB 2; Length 1203; 

Best Local Similarity 22.9%; Pred. No. 2.5e-35; 

Matches 299; Conservative 111; Mismatches 372; Indels 526; Gaps 61; 

Qy 434 NPFICDCHLKWLAD YLHTNPIETSGARCT SPRRLANKRIGQI 475

W II II I I I: I I II II :l I I : 

Db 65 NP — CHKGALCDTNPLNGQYICTCPQGYKGADCTEDVDECAMANSNPCEHAGKCVB-- 118 

Qy 476 KSKKFRCSGTEDY---RSKL SGDCFADLACPEK CRCEGTT 512 

I I : I I I I I I :| II 

Db 119 TDGAFHCECLKGYAGPRCEMDINECHSDPCQNDATCLDKIGGFTCLCMPGFRGVBCELEV 178 

Qy 513 VDC SNQRLNK I PEHIPQYTAELRLNNNEFT VLEATG IFKKLPQLRK I NFSNN - 564

:l : I "I: I II I : I: : I: 

Db 179 NECQSNPCVNNGQCVDKV NRFQCLCPPGFTGPVCQIDIDDCSSTP 223 

Qy 565 KITDIEEG - AFEGASGVNEILLTSNRLEN VQHKMFK - GLESLKT LMLRSN 612

I I I : 1:1 II I ::| I : |::| 
Db 224 CLNGAKCIDHPNGYECQCATGFTGILCDEN-IDNCDPDPCHHGQCQDGIDS 273 

Qy 613 RITCVGNDSFIGLSSVRLLSLYDNQITTVAPGAFDTLHSLSTLN LLANPFNCNCY 667 

II: I ::l :: :ll I :| II II: III 

Db 274 -YTCICNPGYMG AICSDQI DECYSSPCLNDGRC IDLVNGYQCNCQ 317 

Qy 668 LAWLGEWLRKKRIVTGNPRCQKPYFLKEIPIQDVA- - - IQDFTCDDGNDDNS - -CSP - - * 719 

II II I I I II : I III 

Db 318 PG TSGLNC EINFDXASNPCMHGVCVDGINRYSCVCSPGFT 358 

Qy 720 LSRCPTECTCLDTV - -VRCSNKGLKVLPKGIPRDVTELYLDGNQFTLV 765

: I II:: I II : hi 
Db 359 GQRCNIDIDECASNPCRKGATCINDVNGFRC ICPEG 394 

Qy 766 PKELSNYKHLTLIDLSKNRISTLSNQSFSNMTQLLTLILSYNRLRCIPPRTFDGLKSLRL 825
^ III: I: II II 
Db 395 PHHPSCYSQV NECLSN PCI 413 

Qy 826 LSLHGNDISWPEGAFNDLSALSHLAIGANPLYCDCNMQWLSDWVKSEYKEPGIARCAGP 885

III I : I I: I 
Db 414 — HGNCTG GLSGYRCLCDAGW 432 

Qy 886 GEMADKLLLTTPSKKFTCQGPVDVNILAKCNPCLSNPCKNDGTCNSDPVDFYRCTCPYGF 945 

I II I 111111:1 Mil : I: Mill II 
Db 433 VGVNCEVTMECLSNPCQNGGTCN-NLVNGYRCTCKKGF 470 

Qy 946 KGQDCDVPIHACISNPCKHGGTCH 969 

II :| I I I Mil : III 

Db 471 KGYNCQVNIDECASNPCLNQGTCFDDVSGYTCHCMLPYTGKNCQTVLAPCSPNPCENAAV 530 

Qy 970 LKEGEE-DGFWCICADGFEGENCEVNVDDCEDNDCENNSTCVDGINNYTCLCPPEYTGEL 1028

II : I Ml I |:||:| III: :| III ::| 
Db 531 CKEAPNFESFSCLCAPGWQGKRCTVDVDECISRPCMNNGVCHNTQGSYVCECPPGFSGMD 590 

Qy 1029 CEEKLDFCAQDLNPCQHDSKCILTPKGFKCDCTPGYVGEHCDIDFDDCQDNKCKNGAHCT 1088 

III :: I Nil: h III ll::|: I I ::| III |: 

Db 591 CEEDINDCL--ANPCQNGGSCVDHVNTFSCQCHPGFIGDKCQTDMNECLSEPCKNGGTCS 648 



Qy 1089 DAVNGYTCICPEGYSGLFCE FSPPMVLPR 1117 

I II III II I: I: II 1:11 
Db 649 DYVNSYTCTCPAGFHGVHCENNIDECTESSCFNGGTCVDGINSFSCLCPVGFTGPFCLHD 708 

Qy 1118 TSPCDNFDCQNGAQCIVRINEPICQCLPGYQGEKCEKLVSV — NFINKESYLQIPSAK 1173 

: I : I |: : I I II |: |: l|:: || : :| I 
Db 709 INECSSNPCLWAGTCVDGLGTYRCICPLGYTGKNCQTLVNLCSRSPCKNKGTCVQ— - EK 765 

Qy 1174 VRPQ TMTLQIAT DEDSGILLYKGDKDHIAVELYRG 1209 

II I" : I : III : |: I I 

Db 766 ARPHCLCPPGTOYCDVLNVSCKAAALQKGVPVEHLCQHSGICINAGNTHHCQCPL--- 822 

Qy 1210 RVRASYDTGSHPASAIYSVETINDGNFHIVELLALDQSLSLSVDGGNPKIITNLSKQSTL 1269 

I III: II II: I II :| 
Db 823 ■— -GY-TGS* YCEE QLDECAS NP CQHGATC 847 

Qy 1270 NFDSPLYVGGMPGKSNVASLRQAPGQNGTSFHGCIRNLYINSELQDFQKVPMQTGILPGC 1329 

I ::ll\ III: I: : II I I 

Db 848 ND — FIGGY RCECVPGYQGVN CEYEVDECQNQPCQNG 882 

Qy 1330 EPCHRKVCAHGTCQPSSQAGFTCECQEGWMGPLCDQRTNDPCLGNKCVHGTCLPINAFSY 1389 

. Ill Mill II:: :: I |::| I 
Db 883 GTCIDLVN-HFKCSCPPGIRGLLCEENIDECAGGPHCLNGGQCVDRIGGY 931 



Qy 1390 SCKCLEGHGGVLC-DEEEDLFNPCQ— AIKCKHGK- 

:hll I I I I I I III :: I I II : |: 
Db 932 TCRCLPGFAGERCEGDINECtSNPCSSEGSLDCVQLKNNYNCICRSAFTGRHCETFLDVC 991 



Qy 1431 YC ECSSG YTGDSCDR - - - EI SC - RG ERI RDYYQKQQGY A 1465
I I l::l I :: I III: 
Db 992 PQKPCLNGGTCAVASNMPDGFICRCPPGFSGARCQSSCGQVKCRRGEQ 1039 

Qy 1466 ACOTTKKVSRL ECRGGCAGGQC CGPLRSKRRKYSFEC 1502 

III :| III I III II I 
Db 1040 -CIHTDSGPRCFCLNPKDCESGCASNPCQHGGTCYPQRQPPH-YSCRC 1085 
Job time: 1766 sec
GenCore version 4.5
Copyright (c) 1993 - 2000 Compugen Ltd.



OM protein - protein search, using sw model

Run on:         January 22, 2001, 12:08:19 ; Search time 162.41 Seconds
                                             (without alignments)
                                             303.236 Million cell updates/sec

Title:          US-09-540-245A-2
Perfect score:  8316
                1 MRGVGWQMLSLSLGLVLAIL..........SSFVDEVEKWKCGCTRCVS 1525

Scoring table:  BLOSUM62
                Gapop 10.0 , Gapext 0.5

Searched:       88757 seqs, 32294092 residues

Total number of hits satisfying chosen parameters: 88757 

Minimum DB seq length: 0 
Maximum DB seq length: 2000000000 

Post-processing: Minimum Match 0% 
Maximum Match 100% 
Listing first 45 summaries 

Database : SwissProt_39:* 

Pred. No. is the number of results predicted by chance to have a
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
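
The throughput figure in the run header can be cross-checked against the other header values. The sketch below assumes the usual Smith-Waterman cost model, in which one cell update is one query residue compared against one database residue; the function name is illustrative and is not part of GenCore.

```python
def million_cell_updates_per_sec(query_len, db_residues, seconds):
    """Dynamic-programming cells filled per second, in millions.

    Assumption: cells = query length x total database residues.
    """
    return query_len * db_residues / seconds / 1e6


# Values copied from the run header: a 1525-residue query,
# 32294092 database residues, 162.41 seconds of search time.
rate = million_cell_updates_per_sec(1525, 32294092, 162.41)
print(f"{rate:.3f} Million cell updates/sec")
```

Under that assumption the computed rate agrees with the 303.236 Million cell updates/sec reported in the header.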

                              SUMMARIES

                 %
               Query
 No.   Score   Match  Length  ID           Description

  1    3486    41.9    1480  SLIT_DROME   P24014 drosophila
  2     818     9.8    2703  NOTC_DROME   P07207 drosophila
  3     759     9.1    2524  NOTC_XENLA   P21783 xenopus lae
  4   747.5     9.0    2531  NTC1_RAT     Q07008 rattus norv
  5     745     9.0    2531  NTC1_MOUSE   Q01705 mus musculu
  6   735.5     8.8    2444  NTC1_HUMAN   P46531 homo sapien
  7   732.5     8.8    2437  NOTC_BRARE   P46530 brachydanio
  8     731     8.8    1064  FBP1_STRPU   P10079 strongyloce
  9     722     8.7    2318  NTC3_MOUSE   Q61982 mus musculu
 10   691.5     8.3    1964  NTC4_MOUSE   P31695 mus musculu
 11   634.5     7.6    2139  CRB_DROME    P10040 drosophila
 12   625.5     7.5    1408  SERR_DROME   P18168 drosophila
 13     606     7.3    1429  LI12_CAEEL   P14585 caenorhabdi
 14   549.5     6.6     570  FBP3_STRPU   P49013 strongyloce
 15     543     6.5    1295  GLP1_CAEEL   P13508 caenorhabdi
 16   535.5     6.4     473  FP2_MYTGA    Q25464 mytilus gal
 17   533.5     6.4     723  DLL1_HUMAN   O00548 homo sapien
 18   521.5     6.3     714  DLL1_RAT     P97677 rattus norv
 19   518.5     6.2     722  DLL1_MOUSE   Q61483 mus musculu
 20   493.5     5.9     603  ALS_MOUSE    P70389 mus musculu
 21     491     5.9     833  DL_DROME     P10041 drosophila
 22     489     5.9     603  ALS_RAT      P35859 rattus norv
 23   484.5     5.8     605  ALS_HUMAN    P35858 homo sapien
 24   479.5     5.8     605  ALS_PAPPA    O02833 papio papio
 25   455.5     5.5    1134  CHAO_DROME   P12024 drosophila
 26   442.5     5.3    3051  YNX3_CAEEL   P34576 caenorhabdi
 27     410     4.9     567  GPV_MOUSE    O08742 mus musculu
 28   403.5     4.9    5147  FAT_DROME    P33450 drosophila
 29     400     4.8     567  GPV_RAT      O08770 rattus norv
 30     400     4.8    2871  FBN1_BOVIN   P98133 bos taurus
 31     397     4.8    2871  FBN1_HUMAN   P35555 homo sapien
 32   393.5     4.7    2871  FBN1_MOUSE   Q61554 mus musculu
 33     393     4.7    3707  PGBM_MOUSE   Q05793 mus musculu
 34   390.5     4.7    4393  PGBM_HUMAN   P98160 homo sapien
 35     390     4.7     662  GARP_HUMAN   Q14392 homo sapien
 36   389.5     4.7    2911  FBN2_HUMAN   P35556 homo sapien
 37                          AGRI_RAT     P25304 rattus norv
 38   386.5     4.6    2907  FBN2_MOUSE   Q61555 mus musculu
 39   382.5     4.6     560  GPV_HUMAN    P40197 homo sapien
 40     382     4.6    4590  FATH_HUMAN   Q14517 homo sapien
 41   376.5     4.5    1097  TOLL_DROME   P08953 drosophila
 42   375.5     4.5     385  DLK_MOUSE    Q09163 mus musculu
 43     372     4.5     383  DLK_HUMAN    P80370 homo sapien
 44     372     4.5     536  CBP8_HUMAN   P22792 homo sapien
 45     371     4.5    4289  TENX_HUMAN   P22105 homo sapien



RESULT 1
SLIT_DROME
ID   SLIT_DROME     STANDARD;      PRT;  1480 AA.
AC   P24014;
DT   01-MAR-1992 (Rel. 21, Created)
DT   01-MAR-1992 (Rel. 21, Last sequence update)
DT   01-FEB-1996 (Rel. 33, Last annotation update)
DE   SLIT PROTEIN PRECURSOR.
GN   SLI.
OS   Drosophila melanogaster (Fruit fly).
OC   Eukaryota; Metazoa; Arthropoda; Tracheata; Hexapoda; Insecta;
OC   Pterygota; Neoptera; Endopterygota; Diptera; Brachycera; Muscomorpha;
OC   Ephydroidea; Drosophilidae; Drosophila.
RN   [1]
RP   SEQUENCE FROM N.A.
RX   MEDLINE=91099665; PubMed=2176636;
RA   Rothberg J.M., Jacobs J.R., Goodman C.S., Artavanis-Tsakonas S.;
RT   "Slit: an extracellular protein necessary for development of midline
RT   glia and commissural axon pathways contains both EGF and LRR
RT   domains.";
RL   Genes Dev. 4:2169-2187(1990).
CC   -!- FUNCTION: NECESSARY FOR DEVELOPMENT OF MIDLINE GLIA AND
CC       COMMISSURAL AXON PATHWAYS. SLIT MAY INTERACT WITH EXTRACELLULAR
CC       MATRIX MOLECULES.
CC   -!- ALTERNATIVE PRODUCTS: GIVES RISE TO 2 DISTINCT PROTEINS DIFFERING
CC       BY 11 AA AT THE C-TERMINUS OF THE LAST EGF REPEAT.
CC   -!- TISSUE SPECIFICITY: EXCRETED BY THE MIDLINE GLIA CELLS AND
CC       EVENTUALLY DISTRIBUTED ALONG THE AXONS.
CC   -!- SIMILARITY: CONTAINS 7 EGF-LIKE DOMAINS.
CC   -!- SIMILARITY: THE REPEATED LEUCINE-RICH (LRR) SEGMENT IS FOUND IN
CC       MANY PROTEINS. NUMBER IN THIS PROTEIN: 22. TWO BLOCKS OF 6 LRR'S
CC       AND TWO BLOCKS OF 5 LRR'S.
CC   -!- SIMILARITY: CONTAINS A C-TERMINAL CYSTINE KNOT-LIKE DOMAIN (CTCK).
CC   This SWISS-PROT entry is copyright. It is produced through a collaboration
CC   between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC   the European Bioinformatics Institute. There are no restrictions on its
CC   use by non-profit institutions as long as its content is in no way
CC   modified and this statement is not removed. Usage by and for commercial
CC   entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC   or send an email to license@isb-sib.ch).
DR   EMBL; X53959; CAA37910.1; -.
DR   PIR; A36665; A36665.
DR   HSSP; P00740; 1IXA.
DR   FLYBASE; FBgn0003425; sli.



DR   INTERPRO; IPR000152; -.
DR   INTERPRO; IPR000359; -.
DR   INTERPRO; IPR000372; -.
DR   INTERPRO; IPR000483; -.
DR   INTERPRO; IPR000561; -.
DR   INTERPRO; IPR001611; -.
DR   INTERPRO; IPR001791; -.
DR   INTERPRO; IPR001881; -.
DR   PFAM; PF00007; Cys_knot; 1.
DR   PFAM; PF00008; EGF; 7.
DR   PFAM; PF00560; LRR; 16.
DR   PFAM; PF01463; LRRCT; 4.
DR   PFAM; PF01462; LRRNT; 4.
DR   PFAM; PF00054; laminin_G; 1.
DR   PROSITE; PS00010; ASX_HYDROXYL; 3.
DR   PROSITE; PS00022; EGF_1; 7.
DR   PROSITE; PS01185; CTCK_1; 1.
DR   PROSITE; PS01186; EGF_2; 5.
DR   PROSITE; PS01187; EGF_CA; 2.
DR   PROSITE; PS01225; CTCK_2; 1.
KW   Neurogenesis; Glycoprotein; Signal; Alternative splicing;
KW   EGF-like domain; Repeat; Leucine-rich repeat; Duplication.



SLIT PROTEIN. 

CONSERVED N-FLANKING REGION OF THE LRR.

LEUCINE-RICH REPEATS (1ST REGION). 

CONSERVED C-FLANKING REGION OF THE LRR. 

CONSERVED N-FLANKING REGION OF THE LRR. 

LEUCINE-RICH REPEATS (2ND REGION).

CONSERVED C-FLANKING REGION OF THE LRR.

CONSERVED N-FLANKING REGION OF THE LRR. 

LEUCINE-RICH REPEATS (3RD REGION). 

CONSERVED C-FLANKING REGION OF THE LRR. 

CONSERVED N-FLANKING REGION OF THE LRR. 

LEUCINE-RICH REPEATS (4TH REGION). 

CONSERVED C-FLANKING REGION OF THE LRR. 

LRR 1-1. 

LRR 1-2. 

LRR 1-3. 

LRR 1-4. 

LRR 1-5. 

LRR 1-6. 

LRR 2-1. 

LRR 2-2. 

LRR 2-3. 

LRR 2-4. 

LRR 2-5. 

LRR 2-6. 

LRR 3-1. 

LRR 3-2. 

LRR 3-3. 

LRR 3-4. 

LRR 3-5. 

LRR 4-1. 

LRR 4-2. 

LRR 4-3. 

LRR 4-4. 

LRR 4-5. 

EGF-LIKE 1. 

EGF-LIKE 2. 

EGF-LIKE 3, CALCIUM-BINDING (POTENTIAL).
EGF-LIKE 4. 

EGF-LIKE 5, CALCIUM-BINDING (POTENTIAL). 
EGF-LIKE 6. 
EGF-LIKE 7. 
CTCK. 

N-LINKED (GLCNAC...) (POTENTIAL).
N-LINKED (GLCNAC...) (POTENTIAL).
N-LINKED (GLCNAC...) (POTENTIAL).
N-LINKED (GLCNAC...) (POTENTIAL).
N-LINKED (GLCNAC...) (POTENTIAL).
N-LINKED (GLCNAC...) (POTENTIAL).
N-LINKED (GLCNAC...) (POTENTIAL).
N-LINKED (GLCNAC...) (POTENTIAL).
N-LINKED (GLCNAC...) (POTENTIAL).
N-LINKED (GLCNAC...) (POTENTIAL).
N-LINKED (GLCNAC...) (POTENTIAL).
N-LINKED (GLCNAC...) (POTENTIAL).
N-LINKED (GLCNAC...) (POTENTIAL).
BY SIMILARITY. 
BY SIMILARITY. 
BY SIMILARITY. 
BY SIMILARITY. 
FT   DISULFID   989   BY SIMILARITY.
FT   DISULFID   BY SIMILARITY.
FT   DISULFID   1028   1041   BY SIMILARITY.
FT   1073   BY SIMILARITY.
FT   DISULFID   1090   1099   BY SIMILARITY.
FT   DISULFID   1115   BY SIMILARITY.
FT   DISULFID   1120   BY SIMILARITY.
FT   1382   1391   BY SIMILARITY.
FT   1409   1443   BY SIMILARITY.
FT   DISULFID   1423   1457   BY SIMILARITY.
FT   1434   BY SIMILARITY.
FT   1475
FT   1479   BY SIMILARITY.
FT   1394   MISSING (IN SHORT ISOFORM).
SQ   SEQUENCE   1480 AA;  165752 MW;  F9D5925FC170B1C3 CRC64;



Query Match 41.9%; Score 3486; DB 1; Length 1480; 

Best Local Similarity 43.8%; Pred. No. 1.5e-210;

Matches 660; Conservative 275; Mismatches 457; Indels 116; Gaps 22; 

Qy 28 CPAQCSCSGSTVDCHGLALRSVPRNIPRNTERLDLNGNNITRITKTDFAGLRHLRVLQLM 87 

I 111:1 III I Mil I : 111:1' IN: I :||| I INN 
Db 73 CPRVCSCTGLNVDCSHRGLTSVPRKISADVERLELQGNNLTVIYETDFQRLTKLRMLQLT 132 

Qy 88 ENKISTIERGAFQDLKELERLRLNRNHLQLFPELLFLGTAKLYRLDLSENQIQAIPRKAF 147 

:|:| llll :|lll II MM II : |: I 

Db 133 DNQIHTIERNSFQDLVSLE RLDISNNVITTVGRRVF 168 

Qy 148 RGAVDIKNLQLDYNQISCIEDGAFRALRDLEVLTLNNNNITRLSVASFNHMPKLRTFRLH 207 

:|| : : : 1 1 1 1 MM::: ||: | : 1 1 : 1 1 1 1 1 1 1 : 1 I I : :|| II 
Db 169 KGAQSLRSLQLDNNQITCLDEHAFKGLVELEILTLNNNNLTSLPHNIFGGLGRLRALRLS 228 

Qy 208 SNNLYCDCHLAWLSDWLRRRPRVGLYTQCMGPSHLRGHNVAEVQKREFVCSDEEEGHQSF 267 

I 11111:111 :ll I: IM II M III:: M II I 

Db 229 DNPFACDCHLSWLSRFLRSATRLAPYTRCQSPSQLKGQNVADLHDQEFKCSGLTE 283 

Qy 268 MAP-SCSVLH-CPAACTCSNNIVDCRGKGLTEIPTNLPETITEIRLEQNTIKVIPPGAFS 325 

II I Ml I I:: MM I II :l M MMIII I M M 

Db 284 HAPMECGAENSCPHPCRCADGIVDCREKSLTSVPVTLPDDTTDVRLEQNFITELPPKSFS 343 

Qy 326 PYKKLRRIDLSNNQISELAPDAFQGLRSLNSLVLYGNKITELPKSLFEGLFSLQLLLLNA 385 

:::|llllim II :l II II: I Mllllll M Mil IMIIIII 
Db 344 SFRRLRRIDLSNNNISRIAHDALSGLKQLTTLVLYGNKIKDLPSGVFKGLGSLRLLLLNA 403 

Qy 386 NKINCLRVDAFQDLHNLNLLSLYDNKLQTIAKGTFSPLRAIQTMHLAQNPFICDCHLKWL 445 

IMM 1 1 1 : 1 1 1 : 1 : 1 1 1 1 1 1 1 MM III : : : : : I : III M 1 1 1 1 M M 
Db 404 NEISCIRKDAFRDLHSLSLLSLYDNNIQSLANGTFDAMKSMKTVHLAKNPFICDCNLRWL 463 

Qy 446 ADYLHTNPIETSGARCTSPRRLANKRIGQIKSKKFRCSGTEDYRSKLSGDCFADLACPEK 505 

IMI IIIMIIII IM: M :: MM I I MM I II 

Db 464 ADYLHKNPIETSGARCESPKRMHRRRIESLREEKFKCSWGE-LRMKLSGECRMDSDCPAM 522 

Qy 506 CRCEGTTVDCSNQKLNKIPEHIPQYTAELRLNNNEFTVLEATGIFKKLPQLRKINFSNNK 565 

I IIIIIIM- ::| M II :l II Mil : : M M I I: h 
Db 523 CHCEGTTVDCTGRRLKEIPRDIPLHTTELLLNDNELGRISSDGLFGRLPHLVKLELKRNQ 582 

Qy 566 ITDIEEGAFEGASGVNEILLTSNRLENVQHKMFKGLESLKTLMLRSNRITCVGNDSFIGL 625 

:| II MM : I: I |::: : Ml II III 
Db 583 LTGIEPNAFEGASHIQELQLGENKIKEISNKMFLGLHQLKT 623 

Qy 626 SSVRLLSLYDNQITTVAPGAFDTLHSLSTLNLLANPFNCNCYLAWLGEWLRKKRIVTGNP 685 

I : I i I M I : I Ml: MhMI : 1 1 1 1 1 1 1 : 1 11 I Ml : I 
Db 624 LNLYDNQISCVMPGSFEHLNSLTSLNLASNPFNCNCHLAWFAECVRKKSLNGGAA 678 
Qy 686 RCQKPYFLKEIPIQDVAIQDFTCDDGNDDNSCSPLSRCPTECTCLDTVVRCSNKGLKVLP 745

M I ■■■■■■■ l:h :| I I : I II III III II II :| 
Db 679 RCGAPSKVRDVQIKDLPHSEPKCSSENSE-GCLGDGyCPPSCTCTGTWACSRNQLKEIP 737 

Qy 746 KGIPRDVTELYLDGNQFTLVPKE-LSNYKHLTLIDLSNNRISTLSNQSFSNMTQLLTLIL 804 

:lll : :IMh h : I : : : II :|||||:|: III :|:|:|:| |||: 
Db 738 RGIPAETSELYLESNEIEQIHYERIRHLRSLTRLDLSNNQITILSNYTPANLTKLSTLII 797 

Qy 805 SYNRLRCIPPRTFDGLKSLRLLSLHGNDISWPEGAFNDLSALSHLAIGANPLyCDCNMQ 864 

1 1 1 : 1 : 1 : II :||::|llll ll::lll:l II : 1 : 1 : 1 : 1 : 1 1 1 1 1 1 1 :: 
Db 798 SYNKLQCLQRHALSGLNNLRWSLHGNRISMLPEGSFEDLKSLTHIALGSNPLYCDCGLK 857 

Qy 865 WLSDWVKSEYKEPGIARCAGPGEMADKLLLTTPSKKFTCQGPVDVNILAKCNPCLSNPCK 924 

I llhl :| llllllll I :l 111:1:111 I hi I :llllll I II: 
Db 858 WFSDWIKLDYVEPGIARCAEPEQMKDKLILSTPSSSFVCRGRVRNDILAKCNACFEQPCQ 917 

Qy 925 NDGTCNSDPVDFYRCTCPYGFKGQDCDVPIHACISNPCKHGGTCHLKEGEEDGFWCICAD 984
I 1:1 hi I I: I: I: I II III:: II : II I I II
Db 918 NQAQCVALPQREYQCIiCQPG YHGKHCEFMIDACYGNPCRNNATCTVL - - EEGRFSCQCAP 975

Qy 985 GFEGENCEVNVDDC-EDNDCENNSTCVDGINNYTCLCPPEYTGELCEERLDFCAQDLNPC 1043 

I: I II Ml : 1 : 1 1 : M r 1 1 : :| | | ::|| |: |: ||: : I 
Db 976 GYTGARCETNIDDCLGEIKCQNNATCIDGVESYKCECQPGFSGEFCDTKIQFCSPEFNPC 1035 

Qy 1044 QHDSKCILTPKGFKCDCTPGYVGEHCDIDFDDCQDNKCKNGAHCTDAVNGYTCICPEGYS 1103 

: :N: : III |: I :| : Ml!:: |:|| I I :| I I II: |: 
Db 1036 ANGAKCMDHFTHYSCDCQAGFHGTNCTDHIDDCQNHMCQNGGTCVDGINDYQCRCPDDYT 1095 

Qy 1104 GLFCEFSP- -PMVLPRTSPCDNFDCQNGA- -QCIVRINEPICQCLPGYQGERCEKLVSVN 1159 

I :l I: hill | :|::| I : :: :|:| ||| |: || | :: 

Db 1096 GKYCEGHNMISMMYPQTSPCQNHECKHGVCFQPNAQGSDYLCRCHPGYTGKWCEYLTSIS 1155 

Qy 1160 FINKESYLQIPSAKVRPQTNITLQIATDEDSGILLYKGDKDHIAVELYRGRVRASYDTGS 1219 

I:: I:::: : 11= |:|: :: I :|||:| I hlllh Ihl III |: 
Db 1156 FVHNNSFVELEPLRTRPEANVTIVFSSAEQNGILMYDGQDAHLAVELFNGRIRVSYDVGN 1215 

Qy 1220 HPASAIYSVETINDGNFHIVELLALDQSLSLSVDGGNPKIITNLSRQSTLNFDSPLYVGG 1279 

II I :|| I : II :l Mil: :: :| II I : I I I :|:::|| 

Db 1216 HPVSTMYSFEMVAIXsKYBAVELIjiAIKKNFTLRVDRGLARSIINEGSNDYLKLTTPMFLGG 1275 

Qy 1280 MPGKSNVASLRQAPGQNGTSFHGCIRNLYINSELQDFQKVPMQTGILPGCEPCHKKVCAH 1339 
■ :l : : :l III I" ::|| :| II II III 
Db 1276 LPVDPAQQAYKNWQIRNLTSFKGCMKEVWINHKLVDFGNAQRQQRITPGC ALLE 1329

Qy 1340 GTCQPSSQAGFTCECQEGWMG- : PLCDQRTNDPCLGNKCVHGT -CLP-INA-FSYSCKCL 1394
II : :: :| I : MM Ml |: |:| I I III

Db 1330 GEQQEEE DDEQDFMDETPHIKEEPVDPCLENKCRRGSRCVPKSNARDGYQCKCK 1383



Qy 1395 

I Ml: I I 
Db 1384 HGQRGRYCDQGEGSTEP' 



I :| 
■-PTVTAAS-- 



1454 
Ml I:: 
TCRKEQV 1414 



Qy 1455 RDYYQKQQGYAACQTTKKVSRLECRGGCAGGQCCGPLRSKRRKYSFECTDGSSFVDEVEK 1514

1=11 : I:: : : :| III I III :||| I:: :: ::
Db 1415 REYYTEND — CRSRQPLKYAKCVGGC-GNQCCAAKIVRRRKVRMVCSNNRKYIKNLDI 1469

Qy 1515 WKCGCTR 1522
I MM: **
Db 1470 VRKCGCTK 1477
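
The per-result statistics printed above each alignment can be reproduced from the other reported numbers. The sketch below assumes that Query Match is the hit score as a percentage of the perfect score, and that Best Local Similarity is the share of identical columns among all alignment columns (matches, conservative substitutions, mismatches and indels). These relations are inferred from the printed values, not taken from GenCore documentation, and the function names are illustrative.

```python
def query_match_pct(score, perfect_score):
    """Hit score as a percentage of the query's self-score (assumed definition)."""
    return 100.0 * score / perfect_score


def best_local_similarity_pct(matches, conservative, mismatches, indels):
    """Identical columns as a percentage of all alignment columns (assumed definition)."""
    columns = matches + conservative + mismatches + indels
    return 100.0 * matches / columns


# RESULT 1 (SLIT_DROME): Score 3486 against a perfect score of 8316;
# Matches 660; Conservative 275; Mismatches 457; Indels 116.
print(round(query_match_pct(3486, 8316), 1))
print(round(best_local_similarity_pct(660, 275, 457, 116), 1))
```

With these definitions the two printed values are 41.9 and 43.8, matching the Query Match and Best Local Similarity lines for RESULT 1.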



RESULT 2
NOTC_DROME
ID   NOTC_DROME     STANDARD;      PRT;  2703 AA.
AC   P07207; P04154;
DT   01-NOV-1986 (Rel. 03, Created)
DT   01-FEB-1996 (Rel. 33, Last sequence update)
DT   15-JUL-1998 (Rel. 36, Last annotation update)
DE   NEUROGENIC LOCUS NOTCH PROTEIN PRECURSOR.
GN   N.
OS   Drosophila melanogaster (Fruit fly).
OC   Eukaryota; Metazoa; Arthropoda; Tracheata; Hexapoda; Insecta;
OC   Pterygota; Neoptera; Endopterygota; Diptera; Brachycera; Muscomorpha;
OC   Ephydroidea; Drosophilidae; Drosophila.
RN   [1]
RP   SEQUENCE FROM N.A.
RX   MEDLINE=86079539; PubMed=3935325;
RA   Wharton K.A., Johansen K.M., Xu T., Artavanis-Tsakonas S.;
RT   "Nucleotide sequence from the neurogenic locus notch implies a gene
RT   product that shares homology with proteins containing EGF-like
RT   repeats.";
RL   Cell 43:567-581(1985).
RN   [2]
RP   SEQUENCE FROM N.A.
RC   STRAIN=OREGON-R;
RX   MEDLINE=87064624; PubMed=3097517;
RA   Kidd S., Kelley M.R., Young M.W.;
RT   "Sequence of the notch locus of Drosophila melanogaster: relationship
RT   of the encoded protein to mammalian clotting and growth factors.";
RL   Mol. Cell. Biol. 6:3094-3108(1986).
RN   [3]
RP   SEQUENCE OF 2505-2611 FROM N.A.
RX   MEDLINE=85099329; PubMed=2981631;
RA   Wharton K.A., Yedvobnick B., Finnerty V.G., Artavanis-Tsakonas S.;
RT   "opa: a novel family of transcribed repeats shared by the Notch locus
RT   and other developmentally regulated loci in D. melanogaster.";
RL   Cell 40:55-62(1985).
RN   [4]
RP   SEQUENCE OF 1-8 FROM N.A.
RX   MEDLINE=87257846; PubMed=3037327;
RA   Kelley M.R., Kidd S., Berg R.L., Young M.W.;
RT   "Restriction of P-element insertions at the Notch locus of Drosophila
RT   melanogaster.";
RL   Mol. Cell. Biol. 7:1545-1548(1987).
RN   [5]
RP   REVIEW.
RA   Harris W.A.;
RT   "Many cell types specified by Notch function.";
RL   Curr. Biol. 1:120-122(1991).
CC   -!- FUNCTION: NOTCH PROTEIN IS ESSENTIAL FOR PROPER DIFFERENTIATION OF
CC       ECTODERM.
CC   -!- SUBCELLULAR LOCATION: TYPE I MEMBRANE PROTEIN.
CC   -!- MISCELLANEOUS: SEPARATION OF NEUROBLASTS FROM THE ECTODERM INTO
CC       THE INNER PART OF EMBRYO IS ONE OF THE FIRST STEPS OF CNS
CC       DEVELOPMENT IN INSECTS. THIS PROCESS IS UNDER CONTROL OF THE
CC       NEUROGENIC GENES.
CC   -!- SIMILARITY: HIGH, WITH OTHER NOTCH-TYPE PROTEINS.
CC   -!- SIMILARITY: CONTAINS 36 EGF-LIKE DOMAINS.
CC   -!- SIMILARITY: CONTAINS 3 LIN/NOTCH REPEATS.
CC   -!- SIMILARITY: CONTAINS 6 ANK REPEATS.
CC   This SWISS-PROT entry is copyright. It is produced through a collaboration
CC   between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC   the European Bioinformatics Institute. There are no restrictions on its
CC   use by non-profit institutions as long as its content is in no way
CC   modified and this statement is not removed. Usage by and for commercial
CC   entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC   or send an email to license@isb-sib.ch).
DR   EMBL; M16152; AAB59220.1; -.
DR   EMBL; M16153; AAB59220.1; JOINED.
DR   EMBL; M16149; AAB59220.1; JOINED.
DR   EMBL; M16150; AAB59220.1; JOINED.
DR   EMBL; M16151; AAB59220.1; JOINED.
DR   EMBL; K03508; AAA28725.1; -.
DR   EMBL; M13689; AAA28725.1; JOINED.
DR   EMBL; K03507; AAA28725.1; JOINED.
DR   EMBL; M12175; AAA74496.1; -.
DR   EMBL; M16025; AAA28726.1; -.
DR   PIR; A24420; A24420.
DR   PIR; A24768; A24768.
DR   PIR; A05267; A05267.
DR   HSSP; P00740; 1IXA.
DR   FLYBASE; FBgn0004647; N.
DR   INTERPRO; IPR000152; -.
DR   INTERPRO; IPR000561; -.
DR   INTERPRO; IPR000800; -.
DR   INTERPRO; IPR001438; -.
DR   INTERPRO; IPR001881; -.
DR   INTERPRO; IPR002110; -.
DR   PFAM; PF00008; EGF; 36.
DR   PFAM; PF00023; ank; 6.
DR   PFAM; PF00066; notch; 3.
DR   PRINTS; PR00010; EGFBLOOD.
DR   PROSITE; PS50088; ANK_REPEAT; 5.
DR   PROSITE; PS50297; ANK_REP_REGION; 1.
DR   PROSITE; PS00010; ASX_HYDROXYL; 22.
DR   PROSITE; PS00022; EGF_1; 34.
DR   PROSITE; PS01186; EGF_2; 28.
DR   PROSITE; PS01187; EGF_CA; 22.
KW   Differentiation; Neurogenesis; Repeat; ANK repeat; EGF-like domain;
KW   Transmembrane; Signal; Glycoprotein.
FT   SIGNAL        1     44       POTENTIAL.
FT   CHAIN        45   2703       NEUROGENIC LOCUS NOTCH PROTEIN.
FT   DOMAIN       45   1745       EXTRACELLULAR (POTENTIAL).
FT   TRANSMEM   1746   1766       POTENTIAL.
FT   DOMAIN     1767   2703       CYTOPLASMIC (POTENTIAL).
FT   DOMAIN       58   1451       36 X EGF-TYPE REPEATS.
FT   DOMAIN       58     95       EGF-LIKE 1.
FT   DOMAIN       96    136       EGF-LIKE 2.
FT   DOMAIN      139    176       EGF-LIKE 3.
FT   DOMAIN      177    215       EGF-LIKE 4.
FT   DOMAIN      217    253       EGF-LIKE 5, CALCIUM-BINDING (POTENTIAL).
FT   DOMAIN      255    291       EGF-LIKE 6.
FT   DOMAIN      293    329       EGF-LIKE 7, CALCIUM-BINDING (POTENTIAL).
FT   DOMAIN      331    370       EGF-LIKE 8, CALCIUM-BINDING (POTENTIAL).
FT   DOMAIN      372    408       EGF-LIKE 9, CALCIUM-BINDING (POTENTIAL).
FT   DOMAIN      409    447       EGF-LIKE 10.
FT   DOMAIN      449    486       EGF-LIKE 11, CALCIUM-BINDING (POTENTIAL).
FT   DOMAIN      488    524       EGF-LIKE 12, CALCIUM-BINDING (POTENTIAL).
FT   DOMAIN      526    562       EGF-LIKE 13, CALCIUM-BINDING (POTENTIAL).
FT   DOMAIN      564    600       EGF-LIKE 14, CALCIUM-BINDING (POTENTIAL).
FT   DOMAIN      602    637       EGF-LIKE 15, CALCIUM-BINDING (POTENTIAL).
FT   DOMAIN      639    675       EGF-LIKE 16, CALCIUM-BINDING (POTENTIAL).
FT   DOMAIN      677    713       EGF-LIKE 17, CALCIUM-BINDING (POTENTIAL).
FT   DOMAIN      715    751       EGF-LIKE 18, CALCIUM-BINDING (POTENTIAL).
FT   DOMAIN      753    789       EGF-LIKE 19, CALCIUM-BINDING (POTENTIAL).
FT   DOMAIN      791    827       EGF-LIKE 20, CALCIUM-BINDING (POTENTIAL).
FT   DOMAIN      829    865       EGF-LIKE 21, CALCIUM-BINDING (POTENTIAL).
FT   DOMAIN      867    905       EGF-LIKE 22.
FT   DOMAIN      907    944       EGF-LIKE 23, CALCIUM-BINDING (POTENTIAL).
FT   DOMAIN      946    982       EGF-LIKE 24, CALCIUM-BINDING (POTENTIAL).
FT   DOMAIN      984   1020       EGF-LIKE 25.
FT   DOMAIN     1022   1058       EGF-LIKE 26, CALCIUM-BINDING (POTENTIAL).
FT   DOMAIN     1060   1096       EGF-LIKE 27.
FT   DOMAIN     1098   1134       EGF-LIKE 28.
FT   DOMAIN     1136   1181       EGF-LIKE 29.
FT   DOMAIN     1183   1219       EGF-LIKE 30, CALCIUM-BINDING (POTENTIAL).
FT   DOMAIN     1221   1257       EGF-LIKE 31, CALCIUM-BINDING (POTENTIAL).
FT   DOMAIN     1259   1295       EGF-LIKE 32, CALCIUM-BINDING (POTENTIAL).
FT   DOMAIN     1297   1335       EGF-LIKE 33.
FT   DOMAIN     1337   1373       EGF-LIKE 34.
FT   DOMAIN     1375   1412       EGF-LIKE 35.
FT   DOMAIN     1415   1451       EGF-LIKE 36.
FT   DOMAIN     1475   1593       3 X LIN/NOTCH REPEATS.
FT   REPEAT     1475   1513       LIN/NOTCH 1.
FT   REPEAT     1514   1553       LIN/NOTCH 2.
FT   REPEAT     1554   1593       LIN/NOTCH 3.
FT   DOMAIN     1896   2109       6 X ANK MOTIF REPEATS.
FT   DOMAIN     2538   2568       POLY-GLN (OPA-REPEAT).
FT   DISULFID     62     73       BY SIMILARITY.
FT   DISULFID     67     83       BY SIMILARITY.
FT   DISULFID     85     94       BY SIMILARITY.
FT   DISULFID    100    111       BY SIMILARITY.
FT   DISULFID    105    124       BY SIMILARITY.
FT   DISULFID    126    135       BY SIMILARITY.
FT   DISULFID    143    154       BY SIMILARITY.
FT   DISULFID    148    164       BY SIMILARITY.
FT   DISULFID    166    175       BY SIMILARITY.
FT   DISULFID    181    192       BY SIMILARITY.
FT   DISULFID    186              BY SIMILARITY.
FT   DISULFID    205    214       BY SIMILARITY.
FT   DISULFID    221    232       BY SIMILARITY.
FT   DISULFID    226    241       BY SIMILARITY.
FT   DISULFID    243    252       BY SIMILARITY.
FT   DISULFID    259    270       BY SIMILARITY.
FT   DISULFID    264    279       BY SIMILARITY.
FT   DISULFID    281    290       BY SIMILARITY.
FT   DISULFID    297    308       BY SIMILARITY.
FT   DISULFID    302    317       BY SIMILARITY.
FT   DISULFID    319    328       BY SIMILARITY.
FT   DISULFID    335    349       BY SIMILARITY.
FT   DISULFID    343    358       BY SIMILARITY.
FT   DISULFID    360    369       BY SIMILARITY.
FT   DISULFID    376              BY SIMILARITY.
FT   DISULFID    381    396       BY SIMILARITY.
FT   DISULFID    393    407       BY SIMILARITY.
FT   DISULFID    413    424       BY SIMILARITY.
FT   DISULFID    418    435       BY SIMILARITY.
FT   DISULFID    437    446       BY SIMILARITY.
FT   DISULFID    453    465       BY SIMILARITY.
FT   DISULFID    459    474       BY SIMILARITY.
FT   DISULFID                     BY SIMILARITY.
FT   DISULFID    492    503       BY SIMILARITY.
FT   DISULFID                     BY SIMILARITY.
FT   DISULFID    514    523       BY SIMILARITY.
FT   DISULFID    530    541       BY SIMILARITY.
FT   DISULFID    535    550       BY SIMILARITY.
FT   DISULFID                     BY SIMILARITY.
FT   DISULFID    568    579       BY SIMILARITY.
FT   DISULFID    573    588       BY SIMILARITY.
FT   DISULFID    590    599       BY SIMILARITY.
FT   DISULFID    606    616       BY SIMILARITY.
FT   DISULFID    611    625       BY SIMILARITY.
FT   DISULFID           636       BY SIMILARITY.
FT   DISULFID                     BY SIMILARITY.
FT   DISULFID    648    663       BY SIMILARITY.
FT   DISULFID    665    674       BY SIMILARITY.
FT   DISULFID    681    692       BY SIMILARITY.
FT   DISULFID    686    701       BY SIMILARITY.
FT   DISULFID    703    712       BY SIMILARITY.
FT   DISULFID    719    730       BY SIMILARITY.
FT   DISULFID    724    739       BY SIMILARITY.

Query Match 9.8%; Score 818; DB 1; Length 2703;
Best Local Similarity 29.3%; Pred. No. 3.7e-43;
Matches 262; Conservative 93; Mismatches 336; Indels 202; Gaps 50;

Qy 707 TCDDGNDDNSCSPLSRCPT ECT -- CLDTWRCSNKGLKVLPKGIPRDVTELYLDGNQFT 763
II II 1 :l III 1 1 1 1 h : 1 1 1
Db 307 TCIDGISDYTC - -RCPPNFIGRFCQDDVDECAQRDHPVCQNGATCTNTH 353

Qy 764 LVPKELSNYKHLTL IDLSNNRISTLSNQSFSNMTQLLTLILSYNRLRCIPPRTFD 818
:| : : :| II II II 1
Db 354 GSYSGICVNGWAGLDCSNNTDDCKQAACFYGAT CI D 389

Qy 819 GLKSLRLLSLHGNDISWPEGAFNDLSALSHL- -AIGANPLYCD- -CNMQWLSD -- WV 870
1: 1 1 III 1 :ll : 1 1:
Db 390 GVGSFYCQCTKGK - - - TGLLCHLDDACTSNPCHADAICDTSPINGSYACSC 437

Qy 871 KSEYK --EPGIARC-AGPGEMADKLLLTTPSKKFTCQ-- -GP-VDVNILAKCNPCL 919
: II 1 1 1 ; 1 1 : 1 II ; II II
Db 438 ATGYKGVDCSEDIDECDQGSPCEHNGICVNTPGSYRCNCSQGFTGPRCETNI----NECE 493

Qy 920 SNPCKNDGTCNSDPVDFYRCTCPYGFKGQDCDVPIHACISNPCKHGGTCHLKEGEEDGFW 979
1:11: :|:|' 1 Mill II 1 1:: 1 1 III! : III! 1 :ll
Db 494 SHPCQNEGSCLDDPGTF-RCVCMPGFTGTQCEIDIDECQSNPCLNDGTCHDK---INGFK 549

Qy 980 CICADGFEGENCEVNVDDCEDNDCENNSTCVDGINNYTCLCPPEYTGELCEEKLDFCAQD 1039
1 II 1 r:|::|:lll: II 1 1 1 hi III III II :: II
Db 550 CSCALGFTGARCQINIDDCQSQPCRNRGICHDSIAGYSCECPPGYTGTSCEININDC--D 607

Qy 1040 LNPCQHDSKCILTPKGFKCDCTPGYVGEHCDIDFDDCQDNKCKNGAHCTDAVNGYTCICP 1099 

III 1 III III I III I I ::h II: II I I I I I 
Db 608 SNPC-HRGKCIDDVNSFKCLCDPGYTGYICQKQINECESNPCQFDGHCQDRVGSYYCQCQ 666 

Qy 1100 EGYSGLFCEFSPPMVLPRTSPCDNFDCQNGAQCIVRINEPICQCLPGYQGEKCEKLVSVN 1159 

Mill: : I : MINI II llhlh |: III 
Db 667 AGTSGKNCEVN VNECHSNPCNNGATC IDG INSYKCQCVPGFTGQHCEK 714 

Qy 1160 FINKESYLQIPSAKVRPQTNITLQIATDEDSGILLYKGDKDHIAVELYRGRVRASYDTGS 1219 

I : : I I t': I: :| || I II II : 

Db 715 --NVDECISSPCA NNGVC IDQVNG - - - YK CECPRG — FYD--A 748 

Qy 1220 HPASAI — YSVETINDGNFHIVELLALDQSLSLSVDGGNPKII — TNLSRQSTLNFD 1272 

II: I :MI II I I |: |: I 

Db 749 HCLSDVDECASNPCVNEGRCE DGINEFICHCPPGYTGRRCELDID 793 

Qy 1273 -- SPLYVGG -MPGKSNVASLRQAPGQNG -- TSFHGCIRNLYIN SELQDFQ- 1317 
:| II II I: II I h 1:1 I :: :: 
Db 794 ECSSNPCQHGG^YDKLNAFSCQCMPGYTGQKCETNIDDCVTNPCGNGGTCIDKVNGYKC 853 

Qy 1318 -KVPMQTGILPGCE— -PCHKKVCAH-GTCQPSSQ-AGFTCECQEGWMGPLCDQRTND 1369 

III 1111111:1 III hi I: I: I lh :: 
Db 854 VCKVPF ■ TG - ■ RDCESRMDPCASNRCRNEAKCTPSSNFLDFSCTCKLGYTGRYCDEDIDE 910 

Qy 1370 PCLGNKCVHG-TCLPINAFSYSCKCLEGHGGVLCDEEEDLFNPCQAIKCKHGKCRLSGLG 1428 

I : I :| :|| : II I I :|: I I I I : |::| I hi 
Db 911 CSLSSPCRNGASCIiNVPG-SYRCLCTKGYEGRDCAINTD — DCASFPCQNGGTCLDGIG 966 

Qy 1429 QPYCECSSGYTGDSCDREIS CR-GERIRDYYQKQQ GYAA--CQTTKKVS 1474 

II hi I: :|: I: I I |:: III : 

Db 967 DYSCLCVDGFDGKHCETDINECLSQPCQNGATCSQYVNSYTCTCPLGFSGINCQTNDE- 1024 

Qy 1475 RLEC-RGGCA-GGQCCGPLRSRRRKYSFECTDGSSFVDEVEKVVKCGCTRCVS 1525 

:| I II I : I: I I I : I: II I:: 
Db 1025 --DCTESSCLNGGSCIDGING YNCSCLAGYSGANCQYKLNKCDSNPCLN 1071 



RESULT 3 
NOTC_XENLA 

ID NOTC_XENLA STANDARD; PRT; 2524 AA. 

AC P21783; 

DT 01-MAY-1991 (Rel. 18, Created) 

DT 01-OCT-1996 (Rel. 34, Last sequence update) 

DT 15-JUL-1998 (Rel. 36, Last annotation update) 

DE NEUROGENIC LOCUS NOTCH PROTEIN HOMOLOG PRECURSOR (XOTCH PROTEIN). 
GN XOTCH. 
OS Xenopus laevis (African clawed frog). 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Amphibia; Batrachia; Anura; Mesobatrachia; Pipoidea; Pipidae; 

OC Xenopodinae; Xenopus. 

RN [1] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=90385285; PubMed=2402639; 

RA Coffman C., Harris H., Kintner C.; 

RT "Xotch, the Xenopus homolog of Drosophila notch."; 

RL Science 249:1438-1441(1990). 

RN [2] 

RP REVISIONS TO 1759-1782. 

RA Kintner C.; 

RL Submitted (JUN-1996) to the EMBL/GenBank/DDBJ databases. 

CC -!- SUBCELLULAR LOCATION: TYPE I MEMBRANE PROTEIN. 

CC -!- DEVELOPMENTAL STAGE: EXPRESSED ALMOST UNIFORMLY IN EARLY EMBRYOS. 

CC -!- SIMILARITY: HIGH, WITH OTHER NOTCH-TYPE PROTEINS. 

CC -!- SIMILARITY: CONTAINS 36 EGF-LIKE DOMAINS. 

CC -!- SIMILARITY: CONTAINS 3 LIN/NOTCH REPEATS. 

CC -!- SIMILARITY: CONTAINS 6 ANK REPEATS. 

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation - 

CC the European Bioinformatics Institute. There are no restrictions on its 

CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 
CC or send an email to license@isb-sib.ch). 



DR EMBL; M33874; AAB02039.1; -. 

DR PIR; A35844; A35844. 

DR HSSP; P00740; 1IXA. 

DR INTERPRO; IPR000152; -. 

DR INTERPRO; IPR000561; -. 

DR INTERPRO; IPR000800; -. 

DR INTERPRO; IPR001438; -. 

DR INTERPRO; IPR001881; -. 

DR INTERPRO; IPR002110; -. 



DR 



DR PFAM; PF00008; EGF; 36. 

DR PFAM; PF00023; ank; 6. 
DR PFAM; PF00066; notch; 3. 
DR PRINTS; PR00010; EGFBLOOD. 
DR PROSITE; PS50088; ANK_REPEAT; 4. 
DR PROSITE; PS50297; ANK_REP_REGION; 1. 
DR PROSITE; PS00010; ASX_HYDROXYL; 23. 
DR PROSITE; PS00022; EGF_1; 34. 
DR PROSITE; PS01186; EGF_2; 29. 
DR PROSITE; PS01187; EGF_CA; 21. 
KW Differentiation; Neurogenesis; Repeat; ANK repeat; EGF-like domain; 
KW Transmembrane; Signal; Glycoprotein. 



FT SIGNAL 1 19 POTENTIAL.
FT CHAIN 20 2524 NEUROGENIC LOCUS NOTCH PROTEIN HOMOLOG.
FT DOMAIN 20 1728 EXTRACELLULAR (POTENTIAL).
FT TRANSMEM 1729 1750 POTENTIAL.
FT DOMAIN 1751 2524 CYTOPLASMIC (POTENTIAL).
FT DOMAIN 20 57 EGF-LIKE 1.
FT DOMAIN 58 99 EGF-LIKE 2.
FT DOMAIN 102 140 EGF-LIKE 3.
FT DOMAIN 141 177 EGF-LIKE 4.






FT DOMAIN 179 [illegible] EGF-LIKE 5, CALCIUM-BINDING (POTENTIAL).
FT DOMAIN 217 254 EGF-LIKE 6.
FT DOMAIN [illegible] [illegible] EGF-LIKE 7, CALCIUM-BINDING (POTENTIAL).
FT DOMAIN [illegible] [illegible] EGF-LIKE 8, CALCIUM-BINDING (POTENTIAL).
FT DOMAIN [illegible] 370 EGF-LIKE 9, CALCIUM-BINDING (POTENTIAL).
FT DOMAIN 371 409 EGF-LIKE 10.
FT DOMAIN [illegible] [illegible] EGF-LIKE 11, CALCIUM-BINDING (POTENTIAL).
FT DOMAIN 451 487 EGF-LIKE 12, CALCIUM-BINDING (POTENTIAL).


FT DOMAIN 489 525 EGF-LIKE 13, CALCIUM-BINDING (POTENTIAL).
FT DOMAIN 527 563 EGF-LIKE 14, CALCIUM-BINDING (POTENTIAL).
FT DOMAIN 565 600 EGF-LIKE 15, CALCIUM-BINDING (POTENTIAL).
FT DOMAIN 602 638 EGF-LIKE 16, CALCIUM-BINDING (POTENTIAL).
FT DOMAIN 640 675 EGF-LIKE 17.
FT DOMAIN 677 713 EGF-LIKE 18, CALCIUM-BINDING (POTENTIAL).
FT DOMAIN 715 750 EGF-LIKE 19, CALCIUM-BINDING (POTENTIAL).
FT DOMAIN 752 788 EGF-LIKE 20, CALCIUM-BINDING (POTENTIAL).
FT DOMAIN 790 826 EGF-LIKE 21, CALCIUM-BINDING (POTENTIAL).
FT DOMAIN 828 866 EGF-LIKE 22.
FT DOMAIN 868 904 EGF-LIKE 23, CALCIUM-BINDING (POTENTIAL).
FT DOMAIN 906 942 EGF-LIKE 24, CALCIUM-BINDING (POTENTIAL).
FT DOMAIN 944 980 EGF-LIKE 25, CALCIUM-BINDING (POTENTIAL).
FT DOMAIN 982 1018 EGF-LIKE 26.
FT DOMAIN 1020 1056 EGF-LIKE 27, CALCIUM-BINDING (POTENTIAL).


FT DOMAIN 1058 1094 EGF-LIKE 28.
FT DOMAIN 1096 1142 EGF-LIKE 29.
FT DOMAIN 1144 1180 EGF-LIKE 30, CALCIUM-BINDING (POTENTIAL).
FT DOMAIN 1182 1218 EGF-LIKE 31, CALCIUM-BINDING (POTENTIAL).
FT DOMAIN 1220 1264 EGF-LIKE 32, CALCIUM-BINDING (POTENTIAL).
FT DOMAIN 1266 1304 EGF-LIKE 33.
FT DOMAIN 1306 1346 EGF-LIKE 34.
FT DOMAIN 1347 1383 EGF-LIKE 35.
FT DOMAIN 1386 1424 EGF-LIKE 36.
FT DOMAIN 1441 1560 3 X LIN/NOTCH REPEATS.
FT REPEAT 1441 1478 LIN/NOTCH 1.
FT REPEAT 1479 1520 LIN/NOTCH 2.
FT REPEAT 1521 1560 LIN/NOTCH 3.
FT DOMAIN 1871 2083 6 X ANK MOTIF REPEATS.
FT DISULFID 22 35 BY SIMILARITY.
FT DISULFID 29 45 BY SIMILARITY.
FT DISULFID 47 56 BY SIMILARITY.
FT DISULFID 62 74 BY SIMILARITY.
FT DISULFID 68 87 BY SIMILARITY.
FT DISULFID 89 98 BY SIMILARITY.
FT DISULFID 106 117 BY SIMILARITY.
FT DISULFID 111 128 BY SIMILARITY.
FT DISULFID 130 139 BY SIMILARITY.
FT DISULFID 145 156 BY SIMILARITY.
FT DISULFID 150 165 BY SIMILARITY.
FT DISULFID 167 176 BY SIMILARITY.
FT DISULFID 183 194 BY SIMILARITY.
FT DISULFID 188 203 BY SIMILARITY.
FT DISULFID 205 214 BY SIMILARITY.
FT DISULFID 221 232 BY SIMILARITY.
FT DISULFID 226 242 BY SIMILARITY.
FT DISULFID 244 253 BY SIMILARITY.
FT DISULFID 260 271 BY SIMILARITY.
FT DISULFID 265 280 BY SIMILARITY.
FT DISULFID 282 291 BY SIMILARITY.
FT DISULFID 298 311 BY SIMILARITY.
FT DISULFID 305 320 BY SIMILARITY.
FT DISULFID 322 331 BY SIMILARITY.
FT DISULFID 338 349 BY SIMILARITY.
FT DISULFID 343 358 BY SIMILARITY.
FT DISULFID 360 369 BY SIMILARITY.
FT DISULFID 375 386 BY SIMILARITY.
FT DISULFID 380 397 BY SIMILARITY.
FT DISULFID 399 408 BY SIMILARITY.
FT DISULFID 415 428 BY SIMILARITY.
FT DISULFID 422 437 BY SIMILARITY.
FT DISULFID 439 448 BY SIMILARITY.
FT DISULFID 455 466 BY SIMILARITY.
FT DISULFID 460 475 BY SIMILARITY.
FT DISULFID 477 486 BY SIMILARITY.
FT DISULFID 493 504 BY SIMILARITY.
FT DISULFID 498 513 BY SIMILARITY.
FT DISULFID 515 524 BY SIMILARITY.
FT DISULFID 531 542 BY SIMILARITY.
FT DISULFID 536 551 BY SIMILARITY.
FT DISULFID 553 562 BY SIMILARITY.
FT DISULFID 569 579 BY SIMILARITY.
FT DISULFID 574 588 BY SIMILARITY.
FT DISULFID 590 599 BY SIMILARITY.
FT DISULFID 606 617 BY SIMILARITY.
FT DISULFID 611 626 BY SIMILARITY.
FT DISULFID 628 637 BY SIMILARITY.
FT DISULFID 644 654 BY SIMILARITY.
FT DISULFID 649 663 BY SIMILARITY.
FT DISULFID 665 674 BY SIMILARITY.
FT DISULFID 681 692 BY SIMILARITY.
FT DISULFID 686 701 BY SIMILARITY.
FT DISULFID 703 712 BY SIMILARITY.
FT DISULFID 719 729 BY SIMILARITY.
FT DISULFID 724 738 BY SIMILARITY.
FT DISULFID [illegible] 749 BY SIMILARITY.
FT DISULFID 756 767 BY SIMILARITY.
FT DISULFID 761 776 BY SIMILARITY.
FT DISULFID 778 787 BY SIMILARITY.
FT DISULFID 794 805 BY SIMILARITY.
FT DISULFID 799 814 BY SIMILARITY.
FT DISULFID 816 825 BY SIMILARITY.
FT DISULFID 832 843 BY SIMILARITY.
FT DISULFID 837 854 BY SIMILARITY.
FT DISULFID 856 865 BY SIMILARITY.
FT DISULFID 872 883 BY SIMILARITY.
FT DISULFID 877 892 BY SIMILARITY.
FT DISULFID 894 903 BY SIMILARITY.
FT DISULFID 910 921 BY SIMILARITY.
FT DISULFID 915 930 BY SIMILARITY.
FT DISULFID 932 941 BY SIMILARITY.
FT DISULFID 986 997 BY SIMILARITY.
FT DISULFID 991 1006 BY SIMILARITY.
FT DISULFID 1008 1017 BY SIMILARITY.
FT DISULFID 1024 1035 BY SIMILARITY.
FT DISULFID 1029 1044 BY SIMILARITY.
FT DISULFID 1046 1055 BY SIMILARITY.
FT DISULFID 1062 1073 BY SIMILARITY.
FT DISULFID 1067 1082 BY SIMILARITY.
FT DISULFID 1084 1093 BY SIMILARITY.
FT DISULFID 1100 1121 BY SIMILARITY.
FT DISULFID 1115 1130 BY SIMILARITY.
FT DISULFID 1132 1141 BY SIMILARITY.
FT DISULFID 1148 1159 BY SIMILARITY.
FT DISULFID 1153 1168 BY SIMILARITY.
FT DISULFID 1170 1179 BY SIMILARITY.
FT DISULFID 1186 1197 BY SIMILARITY.
FT DISULFID 1191 1206 BY SIMILARITY.
FT DISULFID 1208 1217 BY SIMILARITY.
FT DISULFID 1224 1243 BY SIMILARITY.
FT DISULFID 1237 1252 BY SIMILARITY.
FT DISULFID 1254 1263 BY SIMILARITY.
FT DISULFID 1270 1283 BY SIMILARITY.
FT DISULFID 1275 1292 BY SIMILARITY.
FT DISULFID 1294 1303 BY SIMILARITY.
FT DISULFID 1310 1321 BY SIMILARITY.

Query Match 9.1%; Score 759; DB 1; Length 2524;
Best Local Similarity 24.3%; Pred. No. 1.7e-39;
Matches 245; Conservative 114; Mismatches 364; Indels 286; Gaps

Qy 644 GAFDTLHSLSTLNLIANPFNCNCYLAWLGEWLRKKRIVTGNP RCQKPYFLKEIPI 698

1 : |:|:: : 1 1 1 1: :: II :| |: :: 1
Db 115 GTCELLNSVT-- EYKCRCPPGWTGDSCQQADPCASNPCANGGKC - LPFEIQY ICR 166

Qy 699 QDVAIQDFTC-DDGND- -DNSCSPLSRCPTE CTCLDTVVRCSNKGLKVLPKGIPR 750

II. II: II : 1 1 III : : 1
Db 167 CPPGFHGATCRQDINECSQNPCRNGGQCINEFGSYRCTCQN RFTGR 212

Qy 751 DVTELYLDGNQFTLVPKELSNYKHLTLIDLSNNRISTLSNQSFSNMTQLL 800

1 : II: 1 1 1 1 :: : 1 1 1 : :
Db 213 NCDEPYVPCN-- - - - PSPCLNGGTCRQTDDTSYDCTCLPGFSGQNCEENIDDCPSNNCRN 267

Qy 801 TLILSYNRLRCIPPRTFDGLKSLRLLSLHGNDISVVPEGAFNDLSALSHLAIGANPLY 858

1 : 1 1 1 : 1 : : :: ::| 1 II
Db 268 GGTCVDGVNTYNCQCPPDWTG -- QYCTEDVDECQLMPNACQN- -GGTCHNTYGG- -YN 319

Qy 859 CDCNMQWLSDWVKSEYKEPGIARCAGPGEMADKLLLTTPSKKFTCQGPVD-VNILAKC-N 916

1 1 1 : : 1 1 h: 1: 1 :|
Db 320 CVCVNGWTGEDCSENIDDCANAACHSGATCHDRV ASFYCECPHGRTGLLCHLDN 373

Qy 917 PCLSNPCKNDGTCNSDPVDFYR-CTCPYGFKGQDCDVPIHACI-SNPCKHGGTCHLKEG 973

1:1111' I:::||: llll :|||:|ll 1 1
Db 374 ACISNPCNEGSNCDTNPVNGRAICTCPPGYTGPACNNDVDECSLGANPCEHGGRCTNTLG 433

Qy 974 EEDGFWCICADGFEGENCEVNVDDCEDNDCENNSTCVDGINNYTCLCPPEYTGELCEEKL 1033

1 1 1 1 : 1 ll::|::l 1 1 : 1 : 1 1 1 : 1 1 : hi 1 1 1 II :
Db 434 -- SFQCNCPQGYAGPRCEIDVNECLSNPCQHDSTCLDQIGEFQCICMPGYEGLYCETNI 490

Qy 1034 DFCAQDLNPCQHDSKCILTPKGFKCDCTPGYVGEHCDIDFDDCQDNKCKNGAHCTDAVNG 1093

1 II MM: III hill I: 1 1 MM Mill 1 1 1
Db 491 DECAS - -NPCLHNGRCIDKINEFRCDCPTGFSGNLCQHDFDECTSTPCKNGAKCLDGPNS 548

Qy 1094 YTCICPEGYSGLFCEFSPPMVLPRTSPCDNFDCQNGAQCIVRINEPICQCLPGYQGEKCE 1153

III 1 l|::| II :l II h:l 1 1 III 1 h
Db 549 YTCQCTEGFTGRHCEQDINECIP--DPCHYGTCKDGIATFT CLCRPGYTGRLCD 600

Qy 1154 KLVSVNFINKESYLQIPSAKVRPQTNITLQIATDEDSGILLY--KGDKDHIAVELYRGRV 1211

1 II: :| 1 II ::| : II
Db 601 NDINE- CLSKPCLNGGQ- -CTDRENGYICTCPKG 631

Qy 1212 RASYDTGSHPASAIYSVETINDGNFHIVELLALDQSLSLSVDGGNPKIITNLSKQSTLNF 1271

II : : II :l 1 MM:
Db 632 -- TTG --VNCET-- : KIDDCASNLCDNG--KCIDRIDGYECT-- 664

Qy 1272 DSPLYVGGMPGKSNVASLRQAPGQNGTSF HGCIRNLYINSELQ 1314

1 1 = : 1: 1:11: 1 1: II:
Db 665 CEPGYTGKL-CNINIHECDSNPCRNGGTCKDQINGFTCVCPDGYHDHMCL SEVK 717
Qy 1315 DFQKVPMQTGILPGCEPCHKKVCAHGTCQPSSQAGFTCECQEGWMGPLCDQRTND -- P 1370 

: I: I II I I: |:|: II III |: I 

Db 718 E CNSNPCIHGACHDGVN-GYKCDCEAGWSGSNCDINNNECESNP 760 

Qy 1371 CL GNKCV-HGTCLPINAFSYSCKCLEG 1396 

!: I I: llll: : III: 

Db 761 CMNGGTCKDMTGAYICTCKAGFSGPNCQTNINECSSNPCLNHGTCID-DVAGYKCNCMLP 819 

Qy 1397 HGGVLCDEEEDLFNPCQAIKCKH-GKCRLSGLGQPY-CECSSGYTGDSCDREIS 1448 

: I :| 1:11 ||: |:|: I : : III |: I :|: ::: 
Db 820 YTGAIC---EAVLAPCAGSPCKNGGRCKESEDFETFSCECPPGWQGQTCEIDMNECVNRP 876 

Qy 1449 CRGERIRDYYQKQQGYAACQTTKKVSRLECRGGCAGGQC CGPLRSKRRKYSFEC 1502 

II I II I : I: I I I II I 
Db 877 CRNG ATCQNTNGSYKCNCKPGYTGRNCEMDIDDCQP NPC 915 

Qy 1503 TDGSSFVDEV EKVVKC GCTRCVS 1525 
:| I I : I : :| II II: 

Db 916 HNGGSCSDGINMFFCNCPAGFRGPKCEEDINECASNPCKNGANCTDCVN 964 



RESULT 4 
NTC1_RAT 
ID NTC1_RAT STANDARD; PRT; 2531 AA. 
AC Q07008; 
DT 01-NOV-1995 (Rel. 32, Created) 
DT 15-JUL-1999 (Rel. 38, Last sequence update) 
DT 15-JUL-1999 (Rel. 38, Last annotation update) 
DE NEUROGENIC LOCUS NOTCH HOMOLOG PROTEIN 1 PRECURSOR. 
GN NOTCH1. 
OS Rattus norvegicus (Rat). 
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Rattus. 
RN [1] 
RP SEQUENCE FROM N.A. 
RC TISSUE=SCHWANN CELL; 
RX MEDLINE=92111383; PubMed=1764995; 
RA Weinmaster G., Roberts V.J., Lemke G.; 
RT "A homolog of Drosophila Notch expressed during 
RT development."; 
RL Development 113:199-205(1991). 
RN [2] 
RP REVISIONS TO 1652-1653. 
RA Weinmaster G.; 
RL Submitted (APR-1998) to the EMBL/GenBank/DDBJ databases. 
CC -!- FUNCTION: REQUIRED FOR THE CORRECT DIFFERENTIATION OF A NUMBER 
CC -!- SUBCELLULAR LOCATION: TYPE I MEMBRANE PROTEIN. 
CC -!- DEVELOPMENTAL STAGE: IN THE EMBRYO, HIGHEST LEVELS OCCUR BETWEEN 
CC DAYS 12 AND 14 AND DECREASE RAPIDLY TO MUCH LOWER LEVELS IN THE 
CC ADULT. 
CC -!- SIMILARITY: HIGH, WITH OTHER NOTCH-TYPE PROTEINS. 
CC -!- SIMILARITY: CONTAINS 36 EGF-LIKE DOMAINS. 
CC -!- SIMILARITY: CONTAINS 3 LIN/NOTCH REPEATS. 
CC -!- SIMILARITY: CONTAINS 6 ANK REPEATS. 
CC 
CC This SWISS-PROT entry is copyright. It is produced through a collaboration 
CC between the Swiss Institute of Bioinformatics and the EMBL outstation - 
CC the European Bioinformatics Institute. There are no restrictions on its 
CC use by non-profit institutions as long as its content is in no way 
CC modified and this statement is not removed. Usage by and for commercial 
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 
CC or send an email to license@isb-sib.ch). 



DR EMBL; X57405; CAA40667.1; -. 
DR HSSP; P00740; 1IXA. 
DR INTERPRO; IPR000152; -. 
DR INTERPRO; IPR000561; -. 
DR INTERPRO; IPR000800; -. 
DR INTERPRO; IPR001438; -. 
DR INTERPRO; IPR001881; -. 
DR INTERPRO; IPR002049; -. 
DR INTERPRO; IPR002110; -. 
DR PFAM; PF00008; EGF; 36. 
DR PFAM; PF00023; ank; 6. 
DR PFAM; PF00066; notch; 3. 
DR PRINTS; PR00010; EGFBLOOD. 
DR PRINTS; PR00011; EGFLAMININ. 
DR PROSITE; PS50088; ANK_REPEAT; 4. 
DR PROSITE; PS50297; ANK_REP_REGION; 1. 
DR PROSITE; PS00010; ASX_HYDROXYL; 22. 
DR PROSITE; PS00022; EGF_1; 35. 
DR PROSITE; PS01186; EGF_2; 26. 
DR PROSITE; PS01187; EGF_CA; 21. 
KW Differentiation; Neurogenesis; Repeat; ANK repeat; EGF-like domain; 
KW Transmembrane; Signal; Glycoprotein. 


FT SIGNAL 1 18 POTENTIAL.
FT CHAIN 19 2531 NEUROGENIC LOCUS NOTCH HOMOLOG PROTEIN 1.
FT DOMAIN 19 1723 EXTRACELLULAR (POTENTIAL).
FT TRANSMEM 1724 1746 POTENTIAL.
FT DOMAIN 1747 2531 CYTOPLASMIC (POTENTIAL).
FT DOMAIN 20 58 EGF-LIKE 1.
FT DOMAIN 59 99 EGF-LIKE 2.
FT DOMAIN 102 139 EGF-LIKE 3.
FT DOMAIN 140 176 EGF-LIKE 4.
FT DOMAIN 178 216 EGF-LIKE 5, CALCIUM-BINDING (POTENTIAL).
FT DOMAIN 218 255 EGF-LIKE 6.
FT DOMAIN 257 293 EGF-LIKE 7, CALCIUM-BINDING (POTENTIAL).
FT DOMAIN 295 333 EGF-LIKE 8, CALCIUM-BINDING (POTENTIAL).
FT DOMAIN 335 371 EGF-LIKE 9, CALCIUM-BINDING (POTENTIAL).
FT DOMAIN 372 410 EGF-LIKE 10.
FT DOMAIN 412 450 EGF-LIKE 11, CALCIUM-BINDING (POTENTIAL).
FT DOMAIN 452 488 EGF-LIKE 12, CALCIUM-BINDING (POTENTIAL).
FT DOMAIN 490 526 EGF-LIKE 13, CALCIUM-BINDING (POTENTIAL).


FT DOMAIN 528 564 EGF-LIKE 14, CALCIUM-BINDING (POTENTIAL).
FT DOMAIN 566 601 EGF-LIKE 15, CALCIUM-BINDING (POTENTIAL).
FT DOMAIN 603 639 EGF-LIKE 16, CALCIUM-BINDING (POTENTIAL).
FT DOMAIN 641 676 EGF-LIKE 17, CALCIUM-BINDING (POTENTIAL).
FT DOMAIN 678 714 EGF-LIKE 18, CALCIUM-BINDING (POTENTIAL).
FT DOMAIN 716 751 EGF-LIKE 19, CALCIUM-BINDING (POTENTIAL).
FT DOMAIN 753 789 EGF-LIKE 20, CALCIUM-BINDING (POTENTIAL).
FT DOMAIN 791 827 EGF-LIKE 21, CALCIUM-BINDING (POTENTIAL).
FT DOMAIN 829 867 EGF-LIKE 22.
FT DOMAIN 869 905 EGF-LIKE 23, CALCIUM-BINDING (POTENTIAL).
FT DOMAIN 907 943 EGF-LIKE 24.
FT DOMAIN 945 981 EGF-LIKE 25, CALCIUM-BINDING (POTENTIAL).
FT DOMAIN 983 1019 EGF-LIKE 26.
FT DOMAIN 1021 1057 EGF-LIKE 27, CALCIUM-BINDING (POTENTIAL).
FT DOMAIN 1059 1095 EGF-LIKE 28.
FT DOMAIN 1097 1143 EGF-LIKE 29.
FT DOMAIN 1145 1181 EGF-LIKE 30, CALCIUM-BINDING (POTENTIAL).
FT DOMAIN 1183 1219 EGF-LIKE 31, CALCIUM-BINDING (POTENTIAL).


FT 




1221 


1265 




FT DOMAIN 1267 1305 EGF-LIKE 33.
FT DOMAIN 1307 1346 EGF-LIKE 34.
FT DOMAIN 1348 1384 EGF-LIKE 35.
FT DOMAIN 1387 1426 EGF-LIKE 36.
FT DOMAIN 1449 1462 CYS-RICH.
FT DOMAIN 1865 2076 6 X ANK MOTIF REPEATS.
FT REPEAT 1865 1910 ANK MOTIF 1.
FT REPEAT 1912 1942 ANK MOTIF 2.
FT REPEAT 1944 1975 ANK MOTIF 3.
FT REPEAT 1978 2009 ANK MOTIF 4.
FT REPEAT 2011 2042 ANK MOTIF 5.
FT REPEAT 2044 2076 ANK MOTIF 6.
FT DISULFID 24 37 BY SIMILARITY.
FT DISULFID 31 46 BY SIMILARITY.
FT DISULFID 48 57 BY SIMILARITY.
FT DISULFID 63 74 BY SIMILARITY.
FT DISULFID 68 87 BY SIMILARITY.
FT DISULFID 89 98 BY SIMILARITY.
FT DISULFID 106 117 BY SIMILARITY.
FT DISULFID 111 127 BY SIMILARITY.
FT DISULFID 129 138 BY SIMILARITY.
FT DISULFID 144 155 BY SIMILARITY.
FT DISULFID 149 164 BY SIMILARITY.
FT DISULFID 166 175 BY SIMILARITY.
FT DISULFID 182 195 BY SIMILARITY.
FT DISULFID 189 204 BY SIMILARITY.
FT DISULFID 206 215 BY SIMILARITY.
FT DISULFID 222 233 BY SIMILARITY.
FT DISULFID 227 243 BY SIMILARITY.
FT DISULFID 245 254 BY SIMILARITY.
FT DISULFID 261 272 BY SIMILARITY.
FT DISULFID 266 281 BY SIMILARITY.
FT DISULFID 283 292 BY SIMILARITY.
FT DISULFID 299 312 BY SIMILARITY.
FT DISULFID 306 321 BY SIMILARITY.
FT DISULFID 323 332 BY SIMILARITY.
FT DISULFID 339 350 BY SIMILARITY.
FT DISULFID 344 359 BY SIMILARITY.
FT DISULFID 361 370 BY SIMILARITY.
FT DISULFID 376 387 BY SIMILARITY.
FT DISULFID 381 398 BY SIMILARITY.
FT DISULFID 400 409 BY SIMILARITY.
FT DISULFID 416 429 BY SIMILARITY.
FT DISULFID 423 438 BY SIMILARITY.
FT DISULFID 440 449 BY SIMILARITY.
FT DISULFID 456 467 BY SIMILARITY.
FT DISULFID 461 476 BY SIMILARITY.
FT DISULFID 478 487 BY SIMILARITY.
FT DISULFID 494 505 BY SIMILARITY.
FT DISULFID 499 514 BY SIMILARITY.
FT DISULFID 516 525 BY SIMILARITY.
FT DISULFID 532 543 BY SIMILARITY.
FT DISULFID 537 552 BY SIMILARITY.
FT DISULFID 554 563 BY SIMILARITY.
FT DISULFID 570 580 BY SIMILARITY.
FT DISULFID 575 589 BY SIMILARITY.
FT DISULFID 591 600 BY SIMILARITY.
FT DISULFID 607 618 BY SIMILARITY.
FT DISULFID 612 627 BY SIMILARITY.
FT DISULFID 629 638 BY SIMILARITY.
FT DISULFID 645 655 BY SIMILARITY.
FT DISULFID 650 664 BY SIMILARITY.
FT DISULFID 666 675 BY SIMILARITY.
FT DISULFID 682 693 BY SIMILARITY.
FT DISULFID 687 702 BY SIMILARITY.
FT DISULFID 704 713 BY SIMILARITY.
FT DISULFID 720 730 BY SIMILARITY.
FT DISULFID 725 739 BY SIMILARITY.
FT DISULFID 741 750 BY SIMILARITY.
FT DISULFID 757 768 BY SIMILARITY.
FT DISULFID 762 777 BY SIMILARITY.
FT DISULFID 779 788 BY SIMILARITY.
FT DISULFID 795 806 BY SIMILARITY.
FT DISULFID 800 815 BY SIMILARITY.
FT DISULFID 817 826 BY SIMILARITY.
FT DISULFID 833 844 BY SIMILARITY.
FT DISULFID 838 855 BY SIMILARITY.
FT DISULFID 857 866 BY SIMILARITY.
FT DISULFID 873 884 BY SIMILARITY.
FT DISULFID 878 893 BY SIMILARITY.
FT DISULFID 895 904 BY SIMILARITY.
FT DISULFID 911 922 BY SIMILARITY.
FT DISULFID 916 931 BY SIMILARITY.
FT DISULFID 933 942 BY SIMILARITY.
FT DISULFID 987 998 BY SIMILARITY.
FT DISULFID 992 1007 BY SIMILARITY.
FT DISULFID 1009 1018 BY SIMILARITY.
FT DISULFID 1025 1036 BY SIMILARITY.
FT DISULFID 1030 1045 BY SIMILARITY.
FT DISULFID 1047 1056 BY SIMILARITY.
FT DISULFID 1063 1074 BY SIMILARITY.
FT DISULFID 1068 1083 BY SIMILARITY.
FT DISULFID 1085 1094 BY SIMILARITY.
FT DISULFID 1133 1142 BY SIMILARITY.
FT DISULFID 1149 1160 BY SIMILARITY.
FT DISULFID 1154 1169 BY SIMILARITY.
FT DISULFID 1171 1180 BY SIMILARITY.
FT DISULFID 1187 1198 BY SIMILARITY.

Query Match 9.0%; Score 747.5; DB 1; Length 2531;
Best Local Similarity 23.4%; Pred. No. 8.8e-39;
Matches 229; Conservative 88; Mismatches 258; Indels 403; Gaps

Qy 881 RCAGPGEMADKLLLTTPSKK FTC QGPVDVNILAKCNPCLS 920

1! I Ml I : 1 ||: : II 1 II:
Db 56 RCQDPSP CLSTPCKNAGTCYWDHGGIVDYACSCPLGFSGPLCLTPLA--NACLA 108

Qy

III: III:- :M' | | ||
Db 109 NPCRNGGTCDLLTLTEYKCRCPPGWSGKSCQQADPCASNPCANGGQCLPFESSYICGCPP 168

Qy 944 GFKGQDCDVPIHACISNP- -CKHGGTCHLKEGE 974

II 1 1 :': 1 II hill III : 1
Db 169 GFHGPTCRQDVNECSQNPGLCRHGGTCHNEIGSYRCACRATHTGPHCELPYVPCSPSPCQ 228

Qy 975 EDGFWCICADGFEGENCEVNVDDCEDNDCENNSTCVDGINNYTCLCPPEYT 1025

: 1 Ml hill llhl hh 1111:1 1 I ||||:|
Db 229 NGGTCRPTGDTTHECACLPGFAGQNCEENVDDCPGNNCKNGGACVDGVNTYNCRCPPEWT 288

Qy 1026 GELCEEKLD 1034

MM.
Db 289 GQYCTEDVDECQLMPNACQNAGTCHNSHGGYNCVCVNGWTGEDCSDNIDDCASAACFQGA 348

Qy 1035 F 1035

Db 349 TCHDRVASFYCECPHGRTGLLCHLNDACISNPCNEGSNCDTNPVNGKAICTCPRGYTGPA 408

Qy 1036 CAQDL NPCQHDSKCILTPKGFKCDCTPGYVGEHCDIDFDDCQDNKCKNGAHCT 1088

hlh |||:| II; 1 hi 1 II 1 hll ::l hill
Db 409 CSQDVDECALGANPCEHAGRCLNTLGSFECQCLQGYTGPRCEIDVNECISNPCQNDATCL 468

Qy 1089 DAVNGYTCICPEGYSGLFCEFSPPMVLPRTSPCDNFDCQNGAQCIVRINEPICQCLPGYQ 1148

1 : : III. II h:ll : 1 1 : 1 : :|: :||| hll h
Db 469 DQIGEFQCICMPGYEGVYCEIN TDECASSPCLHNGRCVDKINEFLCQCPKGFS 521

Qy 1149 GEKC- - - -EKLVSVNFINKESYLQIPSAKVRPQTNITLQIATDEDSGILLYKGDKDHIAV 1204

II - 1 1 II: 1 : : I : : : 1 1 :
Db 522 GHLCQYDVDECASTPCKNGAKCLDGPNT -- YTCVCTEGYTGTHCEVDIDECDPDPCHI 577

Qy 1205 ELYRGRVRASYDTGSHPASAI YSVET - IND GNFHIVELLA 1243

1:11:: 1 : II lh I::: 1
Db 578 GLCKDGV-ATFTCLCQPGYTGHHCETNINECHSQPCRHGGTCQDRDNYYLCLCLKGTTGP 636

Qy 1244 LDQSLSLSVDGGNPKIITNLSKQSTLNFD SPLYVGGMPGKSNVASLRQA 1292

II 1 1 IN 1 1 1 1 1 1: :
Db 637 NCE I NLDDCASNPCDSG TCLDK IDGYECACEPGYTGSM-CNVNIDECAGS 685

Qy 1293 PGQNGTSFHGCIRNLYINSELQDFQKVPMQTGIL PGC EPCHKKV 1336

1 II : : II II h
Db 686 PCHNGGT CEDGIAGFTCRCPEGYHDPTCLSEVNECNSNP 724

Qy 1337 CAHGTCQPSSQAGFTCECQEGWMGPLCDQRTNDPCLGNKCVH-GTCLPINAFSYSCKCLE 1395

1 II h h hi IN II h 1 1 II: III : : 1 1 1 1
Db 725 CIHGACRDGLN-GYKCDCAPGWSGTNCDINNNE-CESNPCVNGGTCKDMTS-GYVCTCRE 781

Qy 1396 GHGGVLCDEE EDLFNPCQAIKCKH- 1419

Ml' 1 : II ||:
Db 782 GFSGPNCQTNINECASNPCLNQGTCIDDVAGYKCNCPLPYTGATCEVVLAPCATSPCKNS 841

Qy 1420 GKCRLSGLGQPY-CECSSGYTGDSCDREIS CRGERIRDYYQKQQGYAACQTTKK 1472

1 h 1 : : 1 I :|: | :|: :|: 1 1 hll I
Db 842 GVCKESEDYESFSCVCPTGWQGQTCEIDINECVKSPCR HG-ASCQNTNG 889


Qy 
Qy 1516 VKC GCTRCV 1524 

:| II II 

Db 947 NECATNPCQNGANCTDCV 964 



RESULT 5 
NTC1_MOUSE 

ID NTC1_MOUSE STANDARD; PRT; 2531 AA. 

AC Q01705; 

DT 01-NOV-1995 (Rel. 32, Created) 

DT 01-FEB-1996 (Rel. 33, Last sequence update) 

DT 15-JUL-1999 (Rel. 38, Last annotation update) 

DE NEUROGENIC LOCUS NOTCH HOMOLOG PROTEIN 1 PRECURSOR (MOTCH PROTEIN). 

GN NOTCH1 OR MOTCH. 

OS Mus musculus (Mouse). 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus. 
RN [1] 
RP SEQUENCE FROM N.A. 

RC TISSUE=EMBRYO; 

RX MEDLINE=93194170; PubMed=8449489; 

RA Franco del Amo F., Gendron-Maguire M., Swiatek P.J., Jenkins N.A., 

RA Copeland N.G., Gridley T.; 

RT "Cloning, analysis, and chromosomal localization of Notch-1, a mouse 

RT homolog of Drosophila Notch."; 

RL Genomics 15:259-264(1993). 

RN [2] 

RP SEQUENCE OF 1551-2170 FROM N.A. 

RC TISSUE=EMBRYO; 

RX MEDLINE=93048835; PubMed=1425352; 

RA Franco del Amo F., Smith D.E., Swiatek P.J., Gendron-Maguire M., 

RA Greenspan R.J., McMahon A.P., Gridley T.; 

RT "Expression pattern of Motch, a mouse homolog of Drosophila Notch, 

RT suggests an important role in early postimplantation mouse 

RT development."; 

RL Development 115:737-744(1992). 

CC -!- SUBCELLULAR LOCATION: TYPE I MEMBRANE PROTEIN. 

CC -!- DEVELOPMENTAL STAGE: EXPRESSED ALMOST UNIFORMLY IN EARLY EMBRYOS. 

CC -!- SIMILARITY: CONTAINS 36 EGF-LIKE DOMAINS. 

CC -!- SIMILARITY: CONTAINS 3 LIN/NOTCH REPEATS. 

CC -!- SIMILARITY: CONTAINS 6 ANK REPEATS. 

CC -!- SIMILARITY: HIGH, WITH OTHER NOTCH-TYPE PROTEINS. 

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation - 
CC the European Bioinformatics Institute. There are no restrictions on its 
CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 

CC or send an email to license@isb-sib.ch). 

CC 

DR EMBL; Z11886; CAA77941.1; -. 

DR HSSP; P00740; 1IXA. 

DR MGD; MGI:97363; NOTCH1. 

DR INTERPRO; IPR000152; -. 

DR INTERPRO; IPR000561; -. 

DR INTERPRO; IPR000800; -. 

DR INTERPRO; IPR001438; -. 

DR INTERPRO; IPR001881; -. 

DR INTERPRO; IPR002110; -. 

DR PFAM; PF00008; EGF; 35. 

DR PFAM; PF00023; ank; 6. 

DR PFAM; PF00066; notch; 3. 

DR PRINTS; PR00010; EGFBLOOD. 

DR PROSITE; PS50088; ANK_REPEAT; 2. 

DR PROSITE; PS50297; ANK_REP_REGION; 1. 

DR PROSITE; PS00010; ASX_HYDROXYL; 22. 

DR PROSITE; PS00022; EGF_1; 34. 

DR PROSITE; PS01186; EGF_2; 27. 

DR PROSITE; PS01187; EGF_CA; 21. 

KW Differentiation; Neurogenesis; Repeat; ANK repeat; EGF-like domain; 

KW Transmembrane; Signal; Glycoprotein. 
Query Match 9.0%; Score 745; DB 1; Length 2531; 

Best Local Similarity 24.4%; Pred. No. 1.3e-38; 

Matches 238; Conservative 103; Mismatches 309; Indels 326; Gaps 46; 
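
The percentages in the RESULT 5 header appear to follow from the counts printed beside them. The sketch below is an observation checked against the numbers in this report, not a documented definition of the search program's output. It assumes that "Best Local Similarity" is Matches divided by the sum of Matches, Conservative, Mismatches and Indels, and that "Query Match" is the hit score divided by the query's self-match score (8316, the score of the 100.0% hit in the summary table). The function names are hypothetical.

```python
def percent(part: float, total: float) -> float:
    """Return part/total as a percentage rounded to one decimal place."""
    return round(100.0 * part / total, 1)


def best_local_similarity(matches: int, conservative: int,
                          mismatches: int, indels: int) -> float:
    """Assumed definition: identical positions over all aligned columns."""
    aligned_columns = matches + conservative + mismatches + indels
    return percent(matches, aligned_columns)


def query_match(score: float, self_score: float) -> float:
    """Assumed definition: hit score relative to the query's self-match score."""
    return percent(score, self_score)


# RESULT 5 counts as printed in the header above; 8316 is the assumed self-match score.
print(best_local_similarity(238, 103, 309, 326))
print(query_match(745, 8316))
```

Under these assumptions the same two functions also reproduce the header figures printed for RESULT 2 (Score 818; Matches 262; Conservative 93; Mismatches 336; Indels 202).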

Qy 662 FNCNCYLAWLGEWLRKKRIVTGNPRCQKPYPLKEIPIQDVAIQDFTCDDGNDDNSCSPLS 721 

: I I I I I: II I I :l :l 

Db 202 YRCACCATHTG PHCELPY VPCSPSPCQNG- - -ATCRPTG 237 



Qy 722 RCPTECTCLDTW-- 

II II 



•-RCSNKGLKVLPKGI PRDVTELYL--D 758 

Mill: MM! 



Db 238 DTTHECACLPGFAGQNCEENVDDCPGNNCKNGGACV-DGVNTYNCRCPPEVTGQYCTED 295 

Qy 759 GNQFTLVPRELSNYKHLTLIDLSNNRISTLSNQSFSNMTQLLTLILSYNRLRCIPPRTFD 818 

:: 1:1 I II 
Db 296 VDECQLMPNACQN AGTCHN 314 

Qy 819 GLKSLRLLSLHGNDISWPEGAFNDLSALSHLAIGANPLYCDCNMQWLSDWKSEYKEPG 878 

: hi III: : 

Db 315 • THGGYN CVCVNGWTGEDCSENIDDCA 340 

Qy 879 IARCAGPGEMADKLLLTTPSKKFTCQGPVD-VNILAKC-NPCLSNPCKNDGTCNSDPVDF 936

II I:: I I: I :| : MM MM: 

Db 341 SAACFQGATCHDRV ASFYCECPHGRTGLLCHLKHACISNPCNEGSNCDTNPVNG 394 

Qy 937 YR-CTCPYGFKG— -QD CDVPIH 955 

I Mil I: I II h: :: 

Db 395 KRICTCPSGYTGPACSQDVDECDLGANRCEHAGKCLNTLGSFECQCLQGYTGPGCEIDVN 454 

Qy 956 ACISNPCKHGGTCHLKEGEEDGFWCICADGFEGENCEVNVDDCEDNDCENNSTCVDGINN 1015 

lllll:: ,|| : I I III |:|| ||:| |:| : | :| |:| |: 
Db 455 ECISNPCQNDATCLDQIGE- - -FQCICMPGYEGVYCEINTDECASSPCLHNGHCMDKIHE 511 

Qy 1016 YTCLC PPE YTG ELCEEKLDFCAQDLNPCQ HDS KC I LT PKGFKCDCTPG YVGEHCDIDFDD 1075 

: I II : I II: :l II IM M: I : I II II I ll-l I: 
Db 512 FQCQCPKGFNGHLCQYDVDECAS—TPCKNGAKCLDGPNTYTCVCTEGYTGTHCEVDIDE 569 

Qy 1076 CQDNKCKNGAHCTDAVNGYTCICPEGYSGLFCEFSPPMVLPRTSPCDNFDCQNGAQCIVR 1135 

I : I Ml I I :|hl Ml II : I : hi I I 

Db 570 CDPDPCHYGS-CKDGVATFTCLCQPGYTGHHCE T NI NECHSQPCRHGGTCQDR 621 

Qy 1136 INEPICQCLPGYQGEKCEKLVSVNFINKESYLQIPSAKVRPQTNITLQIATDEDSGILLY 1195 

I :| II f I II II : I IIM 
Db 622 DNSYLCLCLKGTTGPNCE INLDDCASNPC DSGTCLD 657 

Qy 1196 KGDKDHIAVEL-YRGRVRASYDTGSHPASAIYSVETINDG NFHIVELLAL 1244 

II III::: I :: I II :| 

Db 658 KIDGYECACEPGYTGSM-CNVNIDECAGSPCHNGGTCEDGIAGFTCRCPEGYH 709 

Qy 1245 DQSLSLSVDGGNPKIITNLSKQSTLNFDSPLYVGGMPGKSNVASLRQAPGQNGTSFHGCI 1304 

:| Ml: : | :| | | ||| :|| 
Db 710 ,-DP— TCLSEVNECN-SNPCIHGACRDGLNGYKCDCAPGWSGT 748 

Qy 1305 RNLYINSELQDFQKVPMQTGILPGCEPCHKKVCAH-GTCQPSSQAGFTCECQEGWMGPLC 1363 

III: I M III: : M I Mh II I 

Db 749 -HCDIHN— :-. NECESNPCVNGGTCKDMT * SGYVCTCREGFSGPNC 788 

Qy 1364 DQRTNDPCLG'NKCVB-GTCLPINAFSYSCRCLEGHGGVLCDEEEDLFNPCQAIKCKH-GK 1421 

I: I I h: III: : I I I M I I : II IIM 
Db 789 QTNINE -CASNPCLNQGTCID - DVAGYKCNCPLPYTGATC - ■ -EWLAPCATSPCKNSGV 843 

Qy 1422 CRLSGLGQPY-iCECSSGYTGDSCDREIS CRGERIRDYYQKQQGYAACQTTKKVS 1474 

IM ' : :'.| I :|: I :|: :|: II I Ml I 

Db 844 CKESEDYESFSCVCPTGWQGQTCEVDINECVKSPCR HG-ASCQNTNGSY 891 

Qy 1475 RLECRGGCAGGQC CGPLRSKRRKYSFECTDG- -SSFVDEV EKWK 1517 

I IM II II Ml MM I : : 

Db 892 RCLCQAGYTGRNCESDIDDCRPNPCHN- ■ -GGSCTDGINTAFCDCLPGFQGAFCEEDINE 948 



Qy 1518 C---------GCTRCV 1524

I . II II

Db 949 CASNPCQNGANCTDCV 964



RESULT 6 
NTC1_HUMAN

ID NTC1_HUMAN STANDARD; PRT; 2444 AA.
AC P46531;

DT 01-NOV-1995 (Rel. 32, Created)

DT 01-NOV-1995 (Rel. 32, Last sequence update)

DT 01-FEB-1996 (Rel. 33, Last annotation update)

DE NEUROGENIC LOCUS NOTCH PROTEIN HOMOLOG 1 PRECURSOR (TRANSLOCATION-
DE ASSOCIATED NOTCH PROTEIN TAN-1) (FRAGMENT).
GN NOTCH1 OR TAN1. 
OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
RX MEDLINE=91347367; PubMed=1831692;
RA Ellisen L.W., Bird J., West D.C., Soreng A.L., Reynolds T.C.,
RA Smith S.D., Sklar J.;
RT "TAN-1, the human homolog of the Drosophila notch gene, is broken by
RT chromosomal translocations in T lymphoblastic neoplasms.";
RL Cell 66:649-661(1991).
CC -!- FUNCTION: MAY BE IMPORTANT FOR NORMAL LYMPHOCYTE FUNCTION. IN
CC IN SOME T-CELL NEOPLASMS.
CC -!- SUBCELLULAR LOCATION: TYPE I MEMBRANE PROTEIN.
CC -!- TISSUE SPECIFICITY: IN FETAL TISSUES MOST ABUNDANT IN SPLEEN,
CC -!- SIMILARITY: HIGH, WITH OTHER NOTCH-TYPE PROTEINS.
CC -!- SIMILARITY: CONTAINS 36 EGF-LIKE DOMAINS.
DR INTERPRO; IPR000152; -.
DR INTERPRO; IPR000561; -.
DR INTERPRO; IPR000800; -.
DR INTERPRO; IPR001881; -.
DR INTERPRO; IPR002110; -.
DR PFAM; PF00008; EGF; 36.
DR PFAM; PF00023; ank; 6.
DR PFAM; PF00066; notch; 3.
DR PROSITE; PS50088; ANK_REPEAT; 4.
DR PROSITE; PS50297; ANK_REP_REGION; 1.
DR PROSITE; PS00010; ASX_HYDROXYL; 20.
DR PROSITE; PS00022; EGF_1; 34.
DR PROSITE; PS01186; EGF_2; 26.
DR PROSITE; PS01187; EGF_CA; 18.
KW Differentiation; Neurogenesis; Repeat; ANK repeat; EGF-like domain;
KW Transmembrane; Signal; Glycoprotein.
FT SIGNAL 1 18 POTENTIAL.
FT CHAIN 19 >2444 NEUROGENIC LOCUS NOTCH PROTEIN HOMOLOG 1.
FT DOMAIN 19 1736 EXTRACELLULAR (POTENTIAL).
FT TRANSMEM 1737 1757 POTENTIAL.
FT DOMAIN 1758 >2444 CYTOPLASMIC (POTENTIAL).
FT DOMAIN 20 58 EGF-LIKE 1.
FT DOMAIN 59 99 EGF-LIKE 2.
FT DOMAIN 102 139 EGF-LIKE 3.
FT DOMAIN 140 176 EGF-LIKE 4.
FT DOMAIN 178 216 EGF-LIKE 5, CALCIUM-BINDING (POTENTIAL).
FT DOMAIN 218 255 EGF-LIKE 6.
FT DOMAIN 257 293 EGF-LIKE 7, CALCIUM-BINDING (POTENTIAL).
FT DOMAIN 295 333 EGF-LIKE 8, CALCIUM-BINDING (POTENTIAL).
FT DOMAIN 335 371 EGF-LIKE 9, CALCIUM-BINDING (POTENTIAL).
FT DOMAIN 372 410 EGF-LIKE 10.
FT DOMAIN 412 450 EGF-LIKE 11, CALCIUM-BINDING (POTENTIAL).
FT DOMAIN 452 488 EGF-LIKE 12, CALCIUM-BINDING (POTENTIAL).
FT DOMAIN 490 526 EGF-LIKE 13, CALCIUM-BINDING (POTENTIAL).
FT DOMAIN 528 564 EGF-LIKE 14, CALCIUM-BINDING (POTENTIAL).
FT DOMAIN 566 601 EGF-LIKE 15, CALCIUM-BINDING (POTENTIAL).
FT DOMAIN 603 639 EGF-LIKE 16, CALCIUM-BINDING (POTENTIAL).
FT DOMAIN 641 676 EGF-LIKE 17, CALCIUM-BINDING (POTENTIAL).
FT DOMAIN 678 714 EGF-LIKE 18, CALCIUM-BINDING (POTENTIAL).
FT DOMAIN 1902 1905 POLY-GLU.
FT DOMAIN 2260 2263 POLY-GLY.
FT DOMAIN 2404 2407 POLY-GLN.
FT DOMAIN 2411 2418 POLY-PRO.
FT DISULFID 24 37 BY SIMILARITY.
FT DISULFID 31 46 BY SIMILARITY.
FT DISULFID 48 57 BY SIMILARITY.
FT DISULFID 63 74 BY SIMILARITY.
FT DISULFID 68 87 BY SIMILARITY.
FT DISULFID 89 98 BY SIMILARITY.
FT DISULFID 106 117 BY SIMILARITY.
FT DISULFID 111 127 BY SIMILARITY.
FT DISULFID 129 138 BY SIMILARITY.
FT DISULFID 144 155 BY SIMILARITY.
FT DISULFID 149 164 BY SIMILARITY.
FT DISULFID 166 175 BY SIMILARITY.
FT DISULFID 182 195 BY SIMILARITY.
FT DISULFID 189 204 BY SIMILARITY.
FT DISULFID 206 215 BY SIMILARITY.
FT DISULFID 222 233 BY SIMILARITY.
FT DISULFID 227 243 BY SIMILARITY.
FT DISULFID 245 254 BY SIMILARITY.
FT DISULFID 261 272 BY SIMILARITY.
FT DISULFID 266 281 BY SIMILARITY.
FT DISULFID 283 292 BY SIMILARITY.
FT DISULFID 299 312 BY SIMILARITY.
FT DISULFID 306 321 BY SIMILARITY.
FT DISULFID 323 332 BY SIMILARITY.
FT DISULFID 339 350 BY SIMILARITY.
FT DISULFID 344 359 BY SIMILARITY.
FT DISULFID 361 370 BY SIMILARITY.
FT DISULFID 376 387 BY SIMILARITY.
FT DISULFID 381 398 BY SIMILARITY.
FT DISULFID 400 409 BY SIMILARITY.
FT DISULFID 416 429 BY SIMILARITY.
FT DISULFID 423 438 BY SIMILARITY.
FT DISULFID 440 449 BY SIMILARITY.
FT DISULFID 456 467 BY SIMILARITY.
FT DISULFID 461 476 BY SIMILARITY.
FT DISULFID 478 487 BY SIMILARITY.



5 (POTENTIAL). 

; (POTENTIAL). 

! (POTENTIAL). 

; (POTENTIAL). 



! (POTENTIAL). 
! (POTENTIAL). 










FT DISULFID 494 505 BY SIMILARITY.
FT DISULFID 499 [illegible] BY SIMILARITY.
FT DISULFID 516 525 BY SIMILARITY.
FT DISULFID 532 [illegible] BY SIMILARITY.
FT DISULFID [illegible] [illegible] BY SIMILARITY.
FT DISULFID [illegible] 563 BY SIMILARITY.
FT DISULFID 570 580 BY SIMILARITY.
FT DISULFID [illegible] 589 BY SIMILARITY.
FT DISULFID 591 600 BY SIMILARITY.
FT DISULFID 607 618 BY SIMILARITY.
FT DISULFID 612 627 BY SIMILARITY.
FT DISULFID 629 638 BY SIMILARITY.
FT DISULFID 645 655 BY SIMILARITY.
FT DISULFID 650 664 BY SIMILARITY.
FT DISULFID 666 675 BY SIMILARITY.
FT DISULFID 682 693 BY SIMILARITY.
FT DISULFID 687 702 BY SIMILARITY.
FT DISULFID [illegible] [illegible] BY SIMILARITY.
FT DISULFID [illegible] 730 BY SIMILARITY.
FT DISULFID [illegible] 739 BY SIMILARITY.
FT DISULFID [illegible] 750 BY SIMILARITY.
FT DISULFID 757 [illegible] BY SIMILARITY.
FT DISULFID [illegible] 777 BY SIMILARITY.
FT DISULFID 779 788 BY SIMILARITY.
FT DISULFID [illegible] 806 BY SIMILARITY.
FT DISULFID 800 815 BY SIMILARITY.
FT DISULFID 817 [illegible] BY SIMILARITY.
FT DISULFID 833 844 BY SIMILARITY.
FT DISULFID 838 855 BY SIMILARITY.
FT DISULFID 857 867 BY SIMILARITY.
FT DISULFID 874 885 BY SIMILARITY.
FT DISULFID 879 894 BY SIMILARITY.
FT DISULFID 896 905 BY SIMILARITY.
FT DISULFID 912 923 BY SIMILARITY.
FT DISULFID 917 932 BY SIMILARITY.
FT DISULFID 934 943 BY SIMILARITY.
FT DISULFID 988 999 BY SIMILARITY.
FT DISULFID 993 1008 BY SIMILARITY.
FT DISULFID 1010 1019 BY SIMILARITY.
FT DISULFID 1026 1037 BY SIMILARITY.
FT DISULFID 1031 1046 BY SIMILARITY.
FT DISULFID 1048 1057 BY SIMILARITY.
FT DISULFID 1064 1075 BY SIMILARITY.
FT DISULFID 1069 1084 BY SIMILARITY.
FT DISULFID 1086 1095 BY SIMILARITY.
FT DISULFID 1102 1123 BY SIMILARITY.


Query Match 8.8%; Score 735.5; DB 1; Length 2444;
Best Local Similarity 25.1%; Pred. No. 4.7e-38;

Matches 240; Conservative 102; Mismatches 329; Indels 285; Gaps 49;
Qy 662 FNCNCYLAWLGEWLRKKRIVTGNPRCQKPYFLKEIPIQDVAIQDFTCDDGNDDNSCSPLS 721
Ml II |::|l I I :| :| I 

Db 202 YRCVC RATHTGPNCERPY VPCSPSPCQNG— GTCRPTG 237 

Qy 722 RCPTECTCLDTW RCSNKGLKVLPKGIPRDVTELYLDGNQFTLVPK 767 

II II I I I I II 
Db 238 DVTHECACLPGFTGQNCEENIDDCPGNNCKNGGACV DG 275 

Qy 768 ELSNYKHLTLIDLSNNRISTLSNQSFSNMTQLLTLILSYNRLRCIPPRTFDGLKSLRLLS 827 

: :|| I I : I : : 
Db 276 VNTTN— CPCPPEWTG— -QYCT 293 

Qy 828 LHGNDISVVPEGAFNDLSALSRLAIGMPLYCDCNMQWLSDWVKSEYKEPGIARCAGPGE 887 

" ::l I II III: : I I 
Db 294 EDVDECQLMPNACQN--GGTCHNTHGG--YNCVCVNGWTGEDCSENIDDCASAACFHGAT 349 

Qy 888 MADKLLLTTPSKKFTCQGPVD - VNI LAKCN- PCLSNPCKNDGTCNSDPVDFYR -CTCPYG 944 

I:: 11:1 :| I hllll |:::||: INI I 

Db 350 CHDRV ASFYCECPHGRTGLLCHLNDACISNPCNEGSNCDTNPVNGKAICTCPSG 403 

Qy 945 FKGQDCDVPIHACI--SNPCKHGGTCHLKEGEEDGFWCICADGFEGENCEVNVDDCEDND 1002 

: I I : I :lll:l II I I I I |: I lh:|::| I 



Db 404 YTGPACSQDTOECSLGANPCEHAGKCINTLG— SFECQCLQGYTGPRCEIDVKECVSNP 460 

Qy 1003 CENNSTCVDGINNYTCLCPPEYTGELCEERLDFCAQDLNPCQHDSKCILTPKGFKCDCTP 1062 

|:|::N:| I : |:| I I I II III :|| |: :|: |:|:| 
Db 461 CQNDATCLDQIGEFQCMCMPG YEGVHCEVNTDEC AS • ■ SPCLHNGRCLDK I NEFQCECPT 518 

Qy 1063 GYVGEHCDIDFDDCQDNKCKNGAHCTDAVNGYTCICPEGYSGLFCEFSPPMVLPRTSPCD 1122 

I: I I I 1:1 Mill I I I M!:l III: I I II 

Db 519 GFTGHLCQYDVDECASTPCKNGAKCLDGPNTYTCVCTEGYTGTHCEVDIDECDP--DPCH 576 



Qy 1123 NFDCQNGAQCIVRINEPICQCLPGYQGEKCEKLVSVNFINKE SYLQI 1169

|::| ' I I III I II ::| : : :||

Db 577 YGSCKDGVATFT CLCRPGYTGHHCE* -TNINECSSQPCRLRGTCQDPDNAYL- ■ 626



Qy 1170 PSAKVRPQTNl'TLQ I AIDE DSGILLYKGDKDHIAVELYRGRVRASYDTGSHPAS 1223 

:: I ; :| I: III I I I I I I III I 

Db 627 -CFCLKGTTGPNCEINLDDCASSPCDSGTCLDKIDGYECACE PGY-TGSMCNS 677 

Qy 1224 AI YSVETINDG NFHIVELLALDQSLSLSVDGGNPKI ITNLSK 1265 

I :': I II :| :l I l|: 

Db 678 NIDECAGNPCHNGGTCEDGINGFTCRCPEGYH DP — TCLSE 716 

Qy 1266 QSTLNFDSPLYVGGMPGKSNVASLRQAPGQNGTSFHGCIRNLYINSELQDFQKVPMQTGI 1325 

: I :| I I II :|| I ||: 
Db 717 VNECN-SNPCVHGACRDSLNGYKCDCDPGWSGT NCDINN 754 

Qy 1326 LPGCEPCHKCTCAH-GTCQPSSQAGFTCECQEGWMGPLCDQRTNDPCLGNKCVH-GTCLP 1383 

I 'I : III: : :l I hlh III h I I I:: III: 
Db 755 — NECESNPCVNGGTCKDMT-SGIVCTCREGFSGPNCQTNINE-CASNPCLNKGTCID 808 

Qy 1384 INAFSYSCKCLEGHGGVLCDEEEDLFNPCQAIKCKH-GOLSGLGQPY-CEC-SSGYTG 1440 

: I I II : II I : II I:: hi I : : I I ::| I 
Db 809 -DVAGYKCNCLLPYTGATC- -EWLAPCAPSPCRNGGECRQSEDYESFSCVCPTAGAKG 864 

Qy 1441 DSCDREIS CRGERIRDYYQKQQGYAACQTTKKVSRLECRGGCAGGQC C 1488 

:|: :|: II I hi I I |: I :| I 

Db 865 QTCEVDINECVLSPCR HG-ASCQNTHGXYRCHCQAGYSGRNCETDIDDC 912 

Qy 1489 GPLRSKRRKYSFECTDG ■ ■ SSFVDEV EKWKC GCTRCV 1524 

I ! MM ::| I : I : :| II II 

Db 913 RPNPCHN- ■ -GGSCTDGINTAFCDCLPGFRGTFCEEDINECASDPCRNGANCTDCV 965 



RESULT 7 
NOTC_BRARE

ID NOTC_BRARE STANDARD; PRT; 2437 AA.

AC P46530;

DT 01-NOV-1995 (Rel. 32, Created)

DT 01-NOV-1995 (Rel. 32, Last sequence update)

DT 15-JUL-1998 (Rel. 36, Last annotation update)

DE NEUROGENIC LOCUS NOTCH HOMOLOG PROTEIN PRECURSOR.

GN NOTCH. 

OS Brachydanio rerio (Zebrafish) (Zebra danio). 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Actinopterygii; Neopterygii; Teleostei; Euteleostei; Ostariophysi; 

OC Cypriniformes; Cyprinidae; Rasborinae; Danio.

RN [1] 

RP SEQUENCE FROM N.A. 

RC TISSUE=EMBRYO;

RX MEDLINE=94128602; PubMed=8297791;

RA Bierkamp C., Campos-Ortega J.A.;

RT "A zebrafish homologue of the Drosophila neurogenic gene Notch and

RT its pattern of transcription during early embryogenesis.";

RL Mech. Dev. 43:87-100(1993).

CC -!- FUNCTION: IMPLICATED IN CELL FATE SPECIFICATIONS DURING

CC EMBRYO DEVELOPMENT. MAY BE INVOLVED IN THE FORMATION OF THE 

CC NEURAL PLATE, NOTOCHORD AND BRAIN VESICLES. 

CC -!- SUBCELLULAR LOCATION: TYPE I MEMBRANE PROTEIN.

CC -!- DEVELOPMENTAL STAGE: EXPRESSED IN ALL CELLS IN PREGASTRULATION

CC STAGES. DURING GASTRULATION IS DIFFERENTIALLY EXPRESSED, 

CC ACCUMULATING PREDOMINANTLY IN THE PRECHORDAL MESODERM AND 

CC NOTOCHORD. AT THE END OF GASTRULATION, EXPRESSED ALONG THE 

CC ANTERIOR-POSTERIOR AXIS INCLUDING THE DEVELOPING NEURAL PLATE
CC AND DIFFERENTIATING MESODERM. ALSO PRESENT IN THE DEVELOPING
CC BRAIN AND HEAD REGIONS.
CC -!- SIMILARITY: HIGH, WITH OTHER NOTCH-TYPE PROTEINS.
CC -!- SIMILARITY: CONTAINS 36 EGF-LIKE DOMAINS.
CC -!- SIMILARITY: CONTAINS 3 LIN/NOTCH REPEATS.
CC -!- SIMILARITY: CONTAINS 6 ANK REPEATS.
CC
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC
DR EMBL; X6 088; CAA48831.1; -.
DR HSSP; P00740; 1IXA.
DR ZFIN; ZDB-GENE-990415-173; NOTCH.
DR INTERPRO; IPR000152; -.
DR INTERPRO; IPR000561; -.
DR INTERPRO; IPR000800; -.
DR INTERPRO; IPR001336; -.
DR INTERPRO; IPR001438; -.
DR INTERPRO; IPR001881; -.
DR INTERPRO; IPR002110; -.
DR PFAM; PF00008; EGF; 36.
DR PFAM; PF00023; ank; 6.
DR PFAM; PF00066; notch; 3.
DR PRINTS; PR00009; EGFTGF.
DR PRINTS; PR00010; EGFBLOOD.
DR PROSITE; PS50088; ANK_REPEAT; 4.
DR PROSITE; PS50297; ANK_REP_REGION; 1.
DR PROSITE; PS00010; ASX_HYDROXYL; 23.
DR PROSITE; PS00022; EGF_1; 34.
DR PROSITE; PS01186; EGF_2; 28.
DR PROSITE; PS01187; EGF_CA; 22.
KW Differentiation; Neurogenesis; Repeat; ANK repeat; EGF-like domain;
KW Transmembrane; Signal; Glycoprotein.
FT SIGNAL 1 20 POTENTIAL.
FT CHAIN 21 2437 NEUROGENIC LOCUS NOTCH HOMOLOG PROTEIN.
FT DOMAIN 21 1724 EXTRACELLULAR (POTENTIAL).
FT TRANSMEM 1725 1747 POTENTIAL.
FT DOMAIN 1748 2437 CYTOPLASMIC (POTENTIAL).
FT DOMAIN 21 57 EGF-LIKE 1.
FT DOMAIN 58 98 EGF-LIKE 2.
FT DOMAIN 101 138 EGF-LIKE 3.
FT DOMAIN 139 175 EGF-LIKE 4.
FT DOMAIN 177 215 EGF-LIKE 5, CALCIUM-BINDING (POTENTIAL).
FT DOMAIN 217 254 EGF-LIKE 6.
FT DOMAIN 256 292 EGF-LIKE 7, CALCIUM-BINDING (POTENTIAL).
FT DOMAIN 294 332 EGF-LIKE 8, CALCIUM-BINDING (POTENTIAL).
FT DOMAIN 334 370 EGF-LIKE 9, CALCIUM-BINDING (POTENTIAL).
FT DOMAIN 371 409 EGF-LIKE 10.
FT DOMAIN 411 449 EGF-LIKE 11, CALCIUM-BINDING (POTENTIAL).
FT DOMAIN 451 487 EGF-LIKE 12, CALCIUM-BINDING (POTENTIAL).
FT DOMAIN 489 524 EGF-LIKE 13, CALCIUM-BINDING (POTENTIAL).
FT DOMAIN 526 562 EGF-LIKE 14, CALCIUM-BINDING (POTENTIAL).
FT DOMAIN 564 599 EGF-LIKE 15, CALCIUM-BINDING (POTENTIAL).
FT DOMAIN 1181 1217 EGF-LIKE 31, CALCIUM-BINDING (POTENTIAL).
FT DOMAIN 1219 1263 EGF-LIKE 32, CALCIUM-BINDING (POTENTIAL).
FT DOMAIN 1265 1303 EGF-LIKE 33.
FT DOMAIN 1305 1344 EGF-LIKE 34.
FT DOMAIN 1346 1382 EGF-LIKE 35.
FT DOMAIN 1385 1423 EGF-LIKE 36.
FT DOMAIN 1446 1561 3 X LIN/NOTCH REPEATS.
FT REPEAT 1446 1486 LIN/NOTCH 1.
FT REPEAT 1487 1520 LIN/NOTCH 2.
FT REPEAT 1521 1561 LIN/NOTCH 3.
FT DOMAIN 1861 2074 6 X ANK MOTIF REPEATS.
FT REPEAT 1861 1891 ANK MOTIF 1.
FT REPEAT 1892 1940 ANK MOTIF 1.
FT REPEAT 1941 1974 ANK MOTIF 1.
FT REPEAT 1975 2007 ANK MOTIF 1.
FT REPEAT 2008 2040 ANK MOTIF 1.
FT REPEAT 2041 2074 ANK MOTIF 1.
FT DOMAIN 2265 2276 POLY-GLN (OPA-REPEAT).
FT DISULFID 25 35 BY SIMILARITY.
FT DISULFID 29 45 BY SIMILARITY.
FT DISULFID 47 56 BY SIMILARITY.
FT DISULFID 62 73 BY SIMILARITY.
FT DISULFID 67 86 BY SIMILARITY.
FT DISULFID 88 97 BY SIMILARITY.
FT DISULFID 105 116 BY SIMILARITY.
FT DISULFID 110 126 BY SIMILARITY.
FT DISULFID 128 137 BY SIMILARITY.
FT DISULFID 143 154 BY SIMILARITY.
FT DISULFID 148 163 BY SIMILARITY.
FT DISULFID 165 174 BY SIMILARITY.
FT DISULFID 181 194 BY SIMILARITY.
FT DISULFID 188 203 BY SIMILARITY.
FT DISULFID 205 214 BY SIMILARITY.
FT DISULFID 221 232 BY SIMILARITY.
FT DISULFID 226 242 BY SIMILARITY.
FT DISULFID 244 253 BY SIMILARITY.
FT DISULFID 260 271 BY SIMILARITY.
FT DISULFID 265 280 BY SIMILARITY.
FT DISULFID 282 291 BY SIMILARITY.
FT DISULFID 298 311 BY SIMILARITY.
FT DISULFID 305 320 BY SIMILARITY.
FT DISULFID 322 331 BY SIMILARITY.
FT DISULFID 338 349 BY SIMILARITY.
FT DISULFID 343 358 BY SIMILARITY.
FT DISULFID 360 369 BY SIMILARITY.
FT DISULFID 375 386 BY SIMILARITY.
FT DISULFID 380 397 BY SIMILARITY.
FT DISULFID 399 408 BY SIMILARITY.
FT DISULFID 415 428 BY SIMILARITY.
FT DISULFID 422 437 BY SIMILARITY.
FT DISULFID 439 448 BY SIMILARITY.
FT DISULFID 455 466 BY SIMILARITY.
FT DISULFID 460 475 BY SIMILARITY.
FT DISULFID 477 486 BY SIMILARITY.
FT DISULFID 493 503 BY SIMILARITY.
FT DISULFID 498 512 BY SIMILARITY.
FT DISULFID 514 523 BY SIMILARITY.
FT DISULFID 530 541 BY SIMILARITY.
Query Match 8.8%; Score 732.5; DB 1; Length 2437;

Best Local Similarity 21.6%; Pred. No. 7.3e-38;

Matches 323; Conservative 144; Mismatches 468; Indels 557; Gaps 66; 

Qy 431 LAQNPFICDCHLKW LADYLHTNPIETSGARCT 462 

I : II I I III Ml :| :|: 
Db 119 LTLDTFTCRCQPGWSGKTCQLADPCASNPC-ANGGQCSAFESHYICTCPPNFHGQTCRQD 177 

Qy 463 SPRRLANRRIGQIRSRRFRC--SGTEDYRSRLSGDCFADLACPERCRCEGTT 512 

III I :: I II I : :| I III II 
Db 178 VNECAVSPSPCRNGGTCINEVGSYLCRCPPEYTGPHCQRLYQPCL PSPCRSGGTC 232 

Qy 513 VDCSN--QKLNKIPEHIPQYTAELRLNNNEFTVLEATGIFKRLPQLRKINFSNNRITDIE 570 

II: : :| III::: II I : II I 
Db 233 VQTSDTTHTCSCLPGFTGQ-TCEHNVDDCTQHACENGG PCIDGINTYNCHCDRHW 286 

Qy 571 EGAFEGASGVNEILLTSNRLEN--VQHKMFKGLESL 604 

I : |:| |: I :| I I : 
Db 287 TGQY-CTEDVDECELSPNACQNGGTCHNTIGGFHCVCVNGWTGDDCSENIDDCASAACSH 345 

Qy 605 RTLMLRSNRITCVGNDSFIGLSSVRLLSLYDNQITTVAPGAF 646

:| :| hi lh : I I II 
Db 346 GATCHDRVASFPSECPHGRTGLLCHLDDACISNPCQKG-SNCDTNPVSGKAICTCPPGYT 404

Qy 647 DTL-HSLSTLNLLANP FNCNCYLAWLGEWLRKKRIVTGNPRCQKPY 691

: : :| III I I I : I III: 

Db 405 GSACNQDIDECSLGANPCEHGGRCLNTKGSFQCKCLQGYEG PRCEM-- 450 

Qy 692 FLKEIPIQDVAIQDFTCDDGNDDNSCSPLSRCPTECTCLDTV VRCS* 737 

III : I : INI : I I 

Db 451 DVNECKS-NPCQNDATCLDQIGGFHCICMPGYEGVFCQI 488 

Qy 738 NRGLRVLPRGIPRDVTELYLD GNQFTLVPKEL 769 

I III : :: :| hi I 

Db 489 NSDDCASQPCLNGKCIDKINSFHCECPKGFSGSLCQVDVDECASTPCRNGAKCTDGP— 545 

Qy 770 SNYKHLTLIDLSNNRISTLSNQSFS NMTQLLTLILSYNRLRCIPPRTFDGLRSL 823 

I: : II :: : : I I ||: I 
Db 546 NKYTCECTPGFSGIHCELDINECASSPCHYGVCR DGVASF 585 

Qy 824 RLLSLHGNDISWP EGAFNDLSALSHLAIGANPLYCDCNM 863 

II : I: I II: : I : I: I: 
Db 586 TCDCRPGYTGRLCETNINECLSQPCRNGGTCQDRENAY ICTC PKGTTGVNC E I NI 640 

Qy 864 QWLS — DWVKSEYR EPG IARCAGPGEMADRLLLTT P 897 

1:11 III : I I I : 



Db 641 DDCRRRPCDYGRCIDKINGYECVCEPGYSGSMCNINIDDCALNPCHNGGTCIDGV" 



Qy 898 SKKFTC • • •QGPVDVNILAKCNPCLSNPCRNDGTCNSDPVDFYRCTCPYGFKGQDCDVPI 954 

III I I I:: I I Mil : M I :: III I h |::lh I 
Db 696 -NSFTCLCPDGFRDATCLSQHNECSSNPCIH-GSC-LDQINSYRCVCEAGWMGRNCDINI 752 

Qy 955 HACISNPCRHGGTCHLREGEEDGFWCICADGFEGENCEVNVDDCEDNDCENNSTCVDGIN 1014

: Mill :|||| : |: I I II I l|::|:::| I I I :|:| : 
Db 753 NECLSNPCVNGGTC— RDMTSGYLCTCRAGFSGPNCQMNINECASNPCLNQGSCIDDVA 809 

Qy 1015 NYTCLCPPEYTGELCE ERLDF CAQDL- 1040 

: I I lllhll III lh 

Db 810 GFRCNCMLPYTGEVCENVLAPCSPRPCRNGGVCRESEDFQSFSCNCPAGWQGQTCEVDIN 869 

Qy 1041 — NPCQHDSRCILTPRGFRCDCTPGYVGEHCDIDFDDCQDN 1079 

III : . I 11:1 I II: I I: I III: I 
Db 870 ECVRNPCTNGGVCENLRGGFQCRCNPGFTGALCENDIDDCEPNPCSNGGVCQDRVNGFVC 929 

Qy 1080 KCKNGAHCTDAVNGYTCICPEGYSGLFCEFSPPMVLPRT 1118 

: Ml :||| II III II Ml: II : I 
Db 930 VCLAGFRGERCAEDIDECVSAPCRNGGNCTDCVNSYTCSCPAGFSGINCEINTP 983 

Qy 1119 SPCDNFDCQNGAQCIVRINEPICQCLPGYQGERCEKLVSVNFINRESYLQIPSARVRPQT 1178 

I I I! I: I: I Nil: II: : II 

Db 984 -DCTESSCFNGGTCVDGISSFSCVCLPGFTGNYCQH DVNECDSRPCQ 1029 

Qy 1179 NITLQIATDEDSGILLYRGDRDHIAVEL-YRGRVR A 1213 

I I II I I : II I 

Db 1030 N GGSCQDGYGTYRCTCPHGYTGLNCQSLVRWCDSSPCRNGGSCWQQGASFTCQCA 1084 

Qy 1214 SYDTGSHPASAI Y -SVET INDGNFHIVELLALDQSLSLS VDGGNPRIITNL 1263 

I II III ::: I: I I :|:: II II hi 

Db 1085 SGWTG :IYCDVPSVS CEVAARQQGVSVAVLCRHAGQCVDAGN — THL 1128 

Qy 1264 SRQSTLNFDSPLYVGGMPG— RSNVASLRQAPGQNGTS FHGCIR 1305 

: I I : I : I III : M 

Db 1129 CRCQ AGYTGSYCQEQVDECQPNPCQNGATCTDYLGGYSCECVPGYHG— 1175 

Qy 1306 NLYINSELQDFQRVPMQTGILPGCEPC HKRVCAHGT CQPSSQ- 1347 

: : |: : I I I I :| I II III 
Db 1176 -MNCSREINECLSQPCQNG GTC IDLVNTYKCSCPRGTQGVHCE IDIDDCSPSVDP 1229 

Qy 1348 : AGFTCECQEGWMGPLCDQRTNDPCLGNRCVHGTCLPINAFS - - 1388 

I: I I l::l h I: I: I I ::: 
Db 1230 LTGEPRCFNGGRCVDRVGGYGCVCPAGFVGERCEGDVNE CLSDPCDPSGSYNCV 1283 

Qy 1389 YSCRCLEGHGGVLCDEEEDLFNPCQAIRCRH -GRCRL -SGLGQPY -CECSSGYTG 1440 

: |:| I: I I I :ll I: lh II : I I hi Ihl 
Db 1284 QLINDFRCECRTGYTGKRC— ETVFNGCRDTPCKNGGTCAVASNTRHGYICRCQPGYSG 1340 

Qy 1441 DSCDREI - SCRGERI RDY YQRQQG YAA - -CQTT KKVSRLECR GGCAGGQCC 1488 

II: : II I h h : I I lh III 
Db 1341 SSCEYDSQSCGSLRCRNGATCVSGHLSPRCLCAPGFSGHECQTRMDSPCLVNPCYNGGTC 1400 

Qy 1489 GPLRSKR — • RRYSFECTDGSSFVDEVEKWRCGCTRC 1523 

I: ' III I II ::| :| 

Db 1401 QPISDAPFYRCSCPANFNGLLCHILDYSFSGGQGRDIAPPVEVEIRCEIAQC 1452 



RESULT 8
FBP1_STRPU
ID FBP1_STRPU STANDARD; PRT; 1064 AA.
AC P10079;
DT 01-MAR-1989 (Rel. 10, Created)
DT 01-FEB-1996 (Rel. 33, Last sequence update)
DT 30-MAY-2000 (Rel. 39, Last annotation update)
DE FIBROPELLIN I PRECURSOR (EPIDERMAL GROWTH FACTOR-RELATED PROTEIN 1)
DE (UEGF-1).
GN EGF1.
OS Strongylocentrotus purpuratus (Purple sea urchin).
OC Eukaryota; Metazoa; Echinodermata; Eleutherozoa; Echinozoa;
OC Echinoidea; Euechinoidea; Echinacea; Echinoida; Strongylocentrotidae;
OC Strongylocentrotus.
RN [1]
RP SEQUENCE FROM N.A.
RX MEDLINE=90112459; PubMed=2514273;
RA Delgadillo-Reynoso M.G., Rollo D.R., Hursh D.A., Raff R.A.;
RT "Structural analysis of the uEGF gene in the sea urchin
RT strongylocentrotus purpuratus reveals more similarity to vertebrate
RT than to invertebrate genes with EGF-like repeats.";
RL J. Mol. Evol. 29:314-327(1989).
RN [2]
RP SEQUENCE OF 279-476 AND 781-1064 FROM N.A.
RX MEDLINE=87319677; PubMed=3498216;
RA Hursh D.A., Andrews M.E., Raff R.A.;
RT "A sea urchin gene encodes a polypeptide homologous to epidermal
RT growth factor.";
RL Science 237:1487-1490(1987).
RN [3]
RP AVIDIN-LIKE DOMAIN.
RX MEDLINE=89196806; PubMed=2784773;
RA Hunt L.T., Barker W.C.;
RT "Avidin-like domain in an epidermal growth factor homolog from a sea
RT urchin.";
RL FASEB J. 3:1760-1764(1989).
RN [4]
RP CHARACTERIZATION.
RX MEDLINE=91285254; PubMed=2060714;
RA Bisgrove B.W., Andrews M.E., Raff R.A.;
RT "Fibropellins, products of an EGF repeat-containing gene, form a
RT unique extracellular matrix structure that surrounds the sea urchin
RT embryo.";
RL Dev. Biol. 146:89-99(1991).
CC -!- FUNCTION: FORM THE APICAL LAMINA, A COMPONENT OF THE EXTRACELLULAR
CC MATRIX.
CC -!- SUBCELLULAR LOCATION: EXTRACELLULAR. IN VESICLES IN THE CYTOPLASM
CC OF UNFERTILIZED EGGS, THEN TO THE BASE OF THE HYALIN LAYER
CC THROUGHOUT DEVELOPMENT AND FINALLY IN THE APICAL LAMINA IN LATE
CC EMBRYOS AND EARLY LARVAE.
CC -!- ALTERNATIVE PRODUCTS: 2 ISOFORMS; IA (SHOWN HERE) AND IB; ARE
CC PRODUCED BY ALTERNATIVE SPLICING. THE SMALL FORM (IB) LACKS 8 EGF
CC REPEATS.
CC -!- DEVELOPMENTAL STAGE: MODERATE LEVELS IN UNFERTILIZED EGGS AND
CC DURING EARLY CLEAVAGE, THEN RAPIDLY INCREASES IN ABUNDANCE BETWEEN
CC LATE MORULA AND MESENCHYME BLASTULA STAGES TO MAXIMAL LEVELS
CC MAINTAINED THROUGH SUBSEQUENT STAGES. EXPRESSED BOTH MATERNALLY
CC AND ZYGOTICALLY.
CC -!- SIMILARITY: CONTAINS 21 EGF-LIKE DOMAINS.
CC -!- SIMILARITY: CONTAINS 1 CUB DOMAIN.
CC -!- SIMILARITY: THE C-TERMINAL DOMAIN OF THIS PROTEIN IS SIMILAR
CC TO AVIDIN/STREPTAVIDIN.
CC
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC
DR EMBL; L08692; AAA62164.1; -.
DR EMBL; L08692; AAA62163.1; -.
DR EMBL; X17530; CAA35571.1; -.
DR EMBL; M17421; AAA30050.1; -.
DR EMBL; X17533; CAA35573.1; -.
DR PIR; A29316; A29316.
DR HSSP; P01132; 1EPH.
DR INTERPRO; IPR000088; -.
DR INTERPRO; IPR000152; -.
DR INTERPRO; IPR000561; -.
DR INTERPRO; IPR000859; -.
DR INTERPRO; IPR001438; -.
DR INTERPRO; IPR001881; -.
DR PFAM; PF01382; Avidin; 1.
DR PFAM; PF00431; CUB; 1.
DR PFAM; PF00008; EGF; 21.
DR PRINTS; PR00010; EGFBLOOD.
DR PROSITE; PS00010; ASX_HYDROXYL; 19.
DR PROSITE; PS00022; EGF_1; 19.
DR PROSITE; PS00577; AVIDIN; 1.
DR PROSITE; PS01180; CUB; 1.
DR PROSITE; PS01186; EGF_2; 19.
DR PROSITE; PS01187; EGF_CA; 19.
KW Biotin; Alternative splicing; EGF-like domain; Repeat; Signal;
KW Glycoprotein.
FT DISULFID 869 884 BY SIMILARITY.
FT DISULFID 886 895 BY SIMILARITY.
FT DISULFID 902 913 BY SIMILARITY.
FT DISULFID 907 922 BY SIMILARITY.
FT DISULFID 924 933 BY SIMILARITY.
FT CARBOHYD 30 30 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 136 136 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 851 851 N-LINKED (GLCNAC...) (POTENTIAL).
FT VARSPLIC 477 780 MISSING (IN ISOFORM IB).
FT CONFLICT 279 279 L -> S (IN REF. 2).
SQ SEQUENCE 1064 AA; 112072 MW; 2E569CA012ED6D09 CRC64;

Query Match 8.8%; Score 731; DB 1; Length 1064;

Best Local Similarity 28.2%; Pred. No. 3.1e-38;
Matches 196; Conservative 69; Mismatches 239; Indels 192; Gaps 27;

Qy 916 NPCLSNPCKNDGTCNSDPVDFYRCTCPYGFKGQDCDVPIHACISNPCKHGGTCHLKEGEE 975

< I I Ml III I I: : III III 1:1 I II ll::|| I :|
Db 254 NECASSPCLNGGIC-VDGVNMFECTCLAGFTGVRCEVNIDECASAPCQNGGIC--IDG-I 309

Qy 976 DGFWCICADGFEGENCEVNVDDCEDNDCENNSTCVDGINNYTCLCPPEYTGELCEEKLDF 1035

:|: I I II hill I hi II Nil :| I hi Mil I : :|
Db 310 NGYTCSCPLGFSGDNCENNDDECSSIPCLNGGTCVDLYNAYMCYCAPGWTGPTCADNIDE 369

Qy 1036 CAQDLNPCQHDSKCILTPKGFKCDCTPGYVGEHCDIDFDDCQDNRCKNGAHCTDAVNGYT 1095

II III: II h III III I lh I hi 1:11 I I llll
Db 370 CAS--APCQNGGVCIDGVNGYMCDCQPGYTGTHCETDIDECARPPCQNGGDCVDGVNGYV 427

Qy 1096 CICPEGYSGLFCEFSPPMVLPRTSPCDNFDCQNGAQCIVRINEPICQCLPGYQGEKCE-- 1153

III h llll I : Mill h :| :| I III II
Db 428 CICAPGFDGLNCE NNIDECASRPCQNGAVCVDGVNGFVCTCSAGYTGVLCETD 480

Qy 1154 --KLVSVNFINKESYLQIPSAKVRPQTNITLQIATDEDSGILL YKG DRDH 1201

: h :| : II :| : ::| I I

Db 481 INECASMPCLNG GVCTDLVNGYICTCAAGFEGTNCETDTDE 521

Qy 1202 IAVELYRGRVRASYDTGSHPASAIYSVETINDGNFHIVELLALDQSLSLSVDGGNPKIIT 1261

I II I: : II : I

Db 522 CA SFPC QNGATCTDQVNGYVCT 543

Qy 1262 NLSKQSTLNFDSPLYVGGMPGKSNVASLRQAPGQNGTSFHGCIRNLYINSELQDFQKVPM 1321

: III: :::: MM I :: : I

Db 544 CV PGYTGVL-CETDINECASFPCLNGGT CNDQVNGYVCVCA 583

Qy 1322 QTGILPGCE— -PCHKKVCAHGTCQPSSQAGFTCECQEGWMGPLCDQRTNDPCLGNKCV 1377

I : II I I :| 11111111:111:1:
Db 584 QDTSVSTCETDRDECASAPCLNGGACMDVYNGFVCTCLPGWEGTNCEINT-DECASSPCM 642

Qy 1378 HGTCLPINAFSYSCKCLEGHGGVLCDEEED 1407

:| II I II I h I I I

Db 643 NGGLCVDQVNSYVCFCLPGFTGIHCGTEIDECASSPCLNGGQCIDRVDSYECVCAAGYTA 702

Qy 1408 LFNPCQAIKCKHGKCRLSGLGQPYCECSSGYTGDSCDREI ■ SCRG ■ ■

DE NEUROGENIC LOCUS NOTCH 3 PROTEIN.
GN NOTCH3.
OS Mus musculus (Mouse).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=ICR X SWISS WEBSTER;
RX MEDLINE=95001556; PubMed=7918097;
RA Lardelli M., Dalstrand J., Lendahl U.;
RT "The novel Notch homologue mouse Notch 3 lacks specific epidermal



RT growth factor-repeats and is expressed in proliferating

RT neuroepithelium.";

RL Mech. Dev. 46:123-136(1994). 

CC -!- FUNCTION: NOTCH 1, 2 AND 3 PLAY A COMBINATIONAL ROLE DURING
CC VARIOUS CELL FATE DECISIONS AND MORPHOLOGICAL MOVEMENTS IN THE
CC DEVELOPING CNS AND PROBABLY OTHER REGIONS OF THE EMBRYO.

CC -!- TISSUE SPECIFICITY: PROLIFERATING NEUROEPITHELIUM.

CC -!- DEVELOPMENTAL STAGE: CNS DEVELOPMENT.

CC -!- SIMILARITY: CONTAINS 34 EGF-LIKE DOMAINS.

CC -!- SIMILARITY: CONTAINS 3 LIN/NOTCH REPEATS.

CC -!- SIMILARITY: CONTAINS 6 CDC10/SWI6 REPEATS.

CC

CC This SWISS-PROT entry is copyright. It is produced through a collaboration

CC between the Swiss Institute of Bioinformatics and the EMBL outstation - 

CC the European Bioinformatics Institute. There are no restrictions on its 

CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 

CC or send an email to license@isb-sib.ch). 

CC 

DR EMBL; X74760; CAA52776.1; -. 

DR HSSP; P00740; 1IXA. 

DR MGD; MGI:99460; NOTCH3.

DR INTERPRO; IPR000152; -. 

DR INTERPRO; IPR000561; -. 

DR INTERPRO; IPR000800; -. 

DR INTERPRO; IPR001438; -. 

DR INTERPRO; IPR001881; -. 

DR INTERPRO; IPR002110; -. 

DR PFAM; PF00008; EGF; 34. 

DR PFAM; PF00023; ank; 6.

DR PFAM; PF00066; notch; 3.

DR PRINTS; PR00010; EGFBLOOD.

DR PROSITE; PS50088; ANK_REPEAT; 4.

DR PROSITE; PS50297; ANK_REP_REGION; 1.

DR PROSITE; PS00010; ASX_HYDROXYL; 18.

DR PROSITE; PS00022; EGF_1; 33.

DR PROSITE; PS01186; EGF_2; 27.

DR PROSITE; PS01187; EGF_CA; 17.

KW Differentiation; Neurogenesis; Repeat; EGF-like domain; Transmembrane; 

KW Glycoprotein , 

FT   DOMAIN        1   1643       EXTRACELLULAR.

FT   TRANSMEM   1644   1664       POTENTIAL.
FT   DOMAIN     1665   2318       CYTOPLASMIC.


FT   DOMAIN       39   1374       34 X EGF-TYPE REPEATS.
FT   DOMAIN     1388   1503       3 X LIN/NOTCH REPEATS.
FT   DOMAIN     1784   1998       6 X CDC10/SWI6 REPEATS.
FT   DOMAIN     2242   2261       PEST.
FT   DOMAIN       39     78       EGF-LIKE 1.
FT   DOMAIN       79    119       EGF-LIKE 2.
FT   DOMAIN      120    157       EGF-LIKE 3.
FT   DOMAIN      159    196       EGF-LIKE 4, CALCIUM-BINDING (POTENTIAL).
FT   DOMAIN      198    235       EGF-LIKE 5.




FT   DOMAIN      237    273       EGF-LIKE 6, CALCIUM-BINDING (POTENTIAL).
FT   DOMAIN      275    313       EGF-LIKE 7.
FT   DOMAIN             351       EGF-LIKE 8, CALCIUM-BINDING (POTENTIAL).
FT   DOMAIN      352    390       EGF-LIKE 9.
FT   DOMAIN      392              EGF-LIKE 10, CALCIUM-BINDING (POTENTIAL).
FT   DOMAIN      432    468       EGF-LIKE 11, CALCIUM-BINDING (POTENTIAL).
FT   DOMAIN      470    506       EGF-LIKE 12, CALCIUM-BINDING (POTENTIAL).
FT   DOMAIN      508    544       EGF-LIKE 13, CALCIUM-BINDING (POTENTIAL).
FT   DOMAIN      546    581       EGF-LIKE 14, CALCIUM-BINDING (POTENTIAL).
FT   DOMAIN      583    619       EGF-LIKE 15, CALCIUM-BINDING (POTENTIAL).
FT   DOMAIN      621    656       EGF-LIKE 16, CALCIUM-BINDING (POTENTIAL).
FT   DOMAIN      658    694       EGF-LIKE 17, CALCIUM-BINDING (POTENTIAL).
FT   DOMAIN      696    731       EGF-LIKE 18.


FT   DOMAIN      735    771       EGF-LIKE 19.
FT   DOMAIN      772    809       EGF-LIKE 20.
FT   DOMAIN      811    848       EGF-LIKE 21, CALCIUM-BINDING (POTENTIAL).
FT   DOMAIN      850    886       EGF-LIKE 22, CALCIUM-BINDING (POTENTIAL).
FT   DOMAIN      888    923       EGF-LIKE 23, CALCIUM-BINDING (POTENTIAL).
FT   DOMAIN      925    961       EGF-LIKE 24.
FT   DOMAIN      963    999       EGF-LIKE 25.
FT   DOMAIN                       EGF-LIKE 26.
FT   DOMAIN     1037   1083       EGF-LIKE 27.
FT   DOMAIN     1085   1121       EGF-LIKE 28.
FT   DOMAIN     1123   1159       EGF-LIKE 29, CALCIUM-BINDING (POTENTIAL).
FT   DOMAIN     1161   1204       EGF-LIKE 30, CALCIUM-BINDING (POTENTIAL).
FT   DOMAIN     1206   1245       EGF-LIKE 31.
FT   DOMAIN     1247   1288       EGF-LIKE 32.
FT   DOMAIN     1290   1326       EGF-LIKE 33.
FT   DOMAIN     1336   1374       EGF-LIKE 34.


FT   REPEAT     1388   1428       LIN/NOTCH 1.
FT   REPEAT     1429   1467       LIN/NOTCH 2.
FT   REPEAT     1468   1503       LIN/NOTCH 3.
FT   REPEAT     1784   1816       CDC10/SWI6 1.
FT   REPEAT     1817   1865       CDC10/SWI6 2.
FT   REPEAT     1866   1898       CDC10/SWI6 3.
FT   REPEAT     1899   1932       CDC10/SWI6 4.
FT   REPEAT     1933   1965       CDC10/SWI6 5.
FT   REPEAT     1966   1998       CDC10/SWI6 6.


FT   DISULFID     43     55       BY SIMILARITY.
FT   DISULFID     49     66       BY SIMILARITY.
FT   DISULFID     68     77       BY SIMILARITY.
FT   DISULFID     83     94       BY SIMILARITY.
FT   DISULFID     88    107       BY SIMILARITY.
FT   DISULFID    109    118       BY SIMILARITY.
FT   DISULFID    124    135       BY SIMILARITY.
FT   DISULFID    129    145       BY SIMILARITY.
FT   DISULFID    147    156       BY SIMILARITY.




FT   DISULFID    163    175       BY SIMILARITY.
FT   DISULFID                     BY SIMILARITY.
FT   DISULFID    186    195       BY SIMILARITY.
FT   DISULFID    202    213       BY SIMILARITY.
FT   DISULFID                     BY SIMILARITY.
FT   DISULFID           234       BY SIMILARITY.
FT   DISULFID    241    252       BY SIMILARITY.
FT   DISULFID    246    261       BY SIMILARITY.
FT   DISULFID    263    272       BY SIMILARITY.


FT   DISULFID    279    292       BY SIMILARITY.
FT   DISULFID    286    301       BY SIMILARITY.
FT   DISULFID    303    312       BY SIMILARITY.
FT   DISULFID    319    330       BY SIMILARITY.
FT   DISULFID    324    339       BY SIMILARITY.
FT   DISULFID    341    350       BY SIMILARITY.
FT   DISULFID    356    367       BY SIMILARITY.
FT   DISULFID    361    378       BY SIMILARITY.
FT   DISULFID    380    389       BY SIMILARITY.
FT   DISULFID    396    409       BY SIMILARITY.
FT   DISULFID    403    418       BY SIMILARITY.
FT   DISULFID    420    429       BY SIMILARITY.
FT   DISULFID    436    447       BY SIMILARITY.
FT   DISULFID    441    456       BY SIMILARITY.
FT   DISULFID    458    467       BY SIMILARITY.
FT   DISULFID    474    485       BY SIMILARITY.
FT   DISULFID    479    494       BY SIMILARITY.
FT   DISULFID    496    505       BY SIMILARITY.
FT   DISULFID    512    523       BY SIMILARITY.
FT   DISULFID    517    532       BY SIMILARITY.
FT   DISULFID    534    543       BY SIMILARITY.


FT   DISULFID    550    560       BY SIMILARITY.
FT   DISULFID    555    569       BY SIMILARITY.
FT   DISULFID    571    580       BY SIMILARITY.
FT   DISULFID    587    598       BY SIMILARITY.
FT   DISULFID    592    607       BY SIMILARITY.
FT   DISULFID    609    618       BY SIMILARITY.
FT   DISULFID    625    635       BY SIMILARITY.
FT   DISULFID    630    644       BY SIMILARITY.
FT   DISULFID    646    655       BY SIMILARITY.
FT   DISULFID    662    673       BY SIMILARITY.
FT   DISULFID    667    682       BY SIMILARITY.
FT   DISULFID    684    693       BY SIMILARITY.
FT   DISULFID    700    710       BY SIMILARITY.
FT   DISULFID    705    719       BY SIMILARITY.
FT   DISULFID    721    730       BY SIMILARITY.
FT   DISULFID    739    750       BY SIMILARITY.
FT   DISULFID    744    759       BY SIMILARITY.
FT   DISULFID    761    770       BY SIMILARITY.
FT   DISULFID    776    787       BY SIMILARITY.
FT   DISULFID    781    797       BY SIMILARITY.
FT   DISULFID    799    808       BY SIMILARITY.


FT   DISULFID    815    827       BY SIMILARITY.
FT   DISULFID    821    836       BY SIMILARITY.
FT   DISULFID    838    847       BY SIMILARITY.
FT   DISULFID    854    865       BY SIMILARITY.
FT   DISULFID    859    874       BY SIMILARITY.
FT   DISULFID    876    885       BY SIMILARITY.
FT   DISULFID    892    902       BY SIMILARITY.
FT   DISULFID    897    911       BY SIMILARITY.
FT   DISULFID    913    922       BY SIMILARITY.
FT   DISULFID    929    940       BY SIMILARITY.
FT   DISULFID    934    949       BY SIMILARITY.
FT   DISULFID    951    960       BY SIMILARITY.
FT   DISULFID    967    978       BY SIMILARITY.
FT   DISULFID    972    987       BY SIMILARITY.
FT   DISULFID    989    998       BY SIMILARITY.
FT   DISULFID   1005   1016       BY SIMILARITY.
FT   DISULFID   1010   1023       BY SIMILARITY.
FT   DISULFID   1025   1034       BY SIMILARITY.
FT   DISULFID   1041   1062       BY SIMILARITY.
FT   DISULFID   1056   1071       BY SIMILARITY.
FT   DISULFID   1073   1082       BY SIMILARITY.
FT   DISULFID   1089   1100       BY SIMILARITY.
FT   DISULFID   1094   1109       BY SIMILARITY.
FT   DISULFID   1111   1120       BY SIMILARITY.
FT   DISULFID   1127   1138       BY SIMILARITY.
FT   DISULFID   1132   1147       BY SIMILARITY.
FT   DISULFID   1149   1158       BY SIMILARITY.
FT   DISULFID   1165   1183       BY SIMILARITY.
FT   DISULFID   1177   1192       BY SIMILARITY.
FT   DISULFID   1194   1203       BY SIMILARITY.
FT   DISULFID   1210   1223       BY SIMILARITY.
FT   DISULFID   1215   1233       BY SIMILARITY.
FT   DISULFID   1235   1244       BY SIMILARITY.



Query Match 8.7%; Score 722; DB 1; Length 2318;

Best Local Similarity 23.7%; Pred. No. 3.1e-37;

Matches 227; Conservative 86; Mismatches 288; Indels 358; Gaps 37;
Qy 859 CDCNMQWLSDWVKSEYKEP -G I ARCAG PGEMADKLLLTT PSKKFTC QGPVDVNI 911

I I |: : : : ::| II! I I Ml III 

Db 66 CLCLPGWVGE- ■ RCQLEDPCHSGPCAGRGVCQSSWAGT - -ARFSCRCLRGFQGP" —D 117 

Qy 912 LAKCNPCLSNP CKNDGTCNS 931 

:: -11:1 I I:: III : 

Db 118 CSQPDPCVSRPCVHGAPCSVGPDGRFACACPPGYQGQSCQSDIDECRSGTTCRHGGTCLN 177 

Qy 932 DPVDFYRCTCPYGFKGQDCDVPIHACISNPCKHGGTCHLKEGEEDGFWCICADGFEGENC 991

I Ml II I: I I: I: I :||::llll : : I I MM 
Db 178 TPGSF-RCQCPLGYTGLLCENPVVPCAPSPCRNGGTC--RQSSDVTYDCACLPGFEGQNC 234



Qy 992 EVNVDDCEDNDCENNSTCVDGINNYTCLCPPEYTGELCEEKLD 1034

Illllll : I I IIMM I I IIIMI: I I :| 
Db 235 EVNVDDCPGHRCLNGGTCVDGVNTYNCQCPPEWTGQFCIEDVDECQLQPNACHNGGICFN 294



Qy 1035 ■ 



-FCA" 

M 



Db 295 LLGGHSCVCVNGWTGESCSQNIDDCATAVCFHGATCHDRVASFYCACPMGKTGLLCHLDD 354 

Qy 1038 QDL NPCQHDSKCILTPK 1054
i II: 111:1 :|: I 
Db 355 ACVSNPCHEDAICDTNPVSGRAICTCPPGFTGGACDQDVDECSIGANPCEHLGRCVNTQG 414

Qy 1055 GFKCDCTPGY VGEHCDIDFDDC 1076

I I I II I :|::| hi 

Db 415 SFLCQCGRGYTGPRCETDVNECLSGPCRNQATCLDRIGQFTCICMAGFTGTYCEVDIDEC 474 

Qy 1077 QDNKCKNGAHCTDAVNGYTCICPEGYSGLFCEFSPPMVLPRTSPCDNFDCQNGAQCIVRI 1136

Mill II MM II |:|| I: I : |:|||:|: : 

Db 475 QSSPCVNGGVCKDRVNGFSCTCPSGFSGSMCQLD VDECASTPCRNGAKCVDQP 527 

Qy 1137 NEPICQCLPGYQGEKCEKLVS VNFINKESYLQIPSAKVRPQTNITLQI 1184

: 1:1 I:: IIM IM I I M : 
Db 528 DGYECRCAEGFEGTLCERNVDDCSPDPCHHGRCVDGIASFSCACAPG YTGIRCES 582 

Qy 1185 ATDEDSGILLYKGDKDHIAVELYRGRVRASY DTGSHPASAIYSVETIN" 1232 

II IIIMI I 1:1 : Ml 

Db 583 QVDECRSQPCRYGGKCLDLVDKYLCRCPPGTTGVHCEVNIDDCASNPCTFGVCRDGINRY 642 

Qy 1233 DGNFHIVELLALDQSL — SLSVDGGNPKIITNLSKQSTLNFDSPLYVGGM 1280 

111:1 III I IM 
Db 643 DCVCQPGFTGPLCNVEINECASSPCGEGGSCVDGEN GFHCLCPPGSL 689 

Qy 1281 PGKSNVAS LRQAPG QNGTSFHGCIRNLYINSELQDFQKVPMQ 1322 

II: III M I I :H :: : I I 

Db 690 PPLCLPANHPCAHKPCSHGVCHDAPGGFRCVCEPGWSGPRCSQSLAPDA — CESQPCQ 745 

Qy 1323 TG ILPG CE- - - PCHKKVCAH - GTCQPSSQAGPTCECQEGWMG 1360 

I II II II :| I I I: Mill 

Db 746 AGGTCTSDGIGFRCTCAPGFQGHQCEVLSPCTPSLCEHGGHCESDPDRLTVCSCPPGWQG 805

Qy 1361 PLCDQRTNDPCLG-NKC-VHGTC--LPINAFSYSCKCLEGHGGVLCDEEEDLFNPCQAIK 1416 

II I I II M Ml III M I IM II:: I I 

Db 806 PRCQQDV-DECAGASPCGPHGTCTNLPGN---FRCICHRGYTGPFCDQDID---DCDPKP 858

Qy 1417 CKHGKCRLSGLGQPYCECSSGYTGDSCDREISCRGERIRDYYQKQQGYAACQTTKKVSRL 1476

till 1:1 II IM I I:: : II 

Db 859 CLHGGSCQDGVGSFSCSCLDGFAGPRCARDVD ECLSSPCGPGTCTDHVASFTC 911 



Qy 1477 ECRGGCAGGQC CGPLRSKRRKYSFECTDGSSFVDEVEKWKCGC TRC 1523

I I I I II Ml : II M II 

Db 912 ACPPGYGGFHCEIDLPDCSP SSCFNGGTCVDGVSS • FSCLCRPGYTGTHC 960 



RESULT 10 


FT   SIGNAL        1     20       POTENTIAL.
NTC4_MOUSE
FT   CHAIN        21   1964       NEUROGENIC LOCUS NOTCH HOMOLOG PROTEIN 4.
ID NTC4_MOUSE STANDARD; PRT; 1964 AA.
FT   DOMAIN       21   1443       EXTRACELLULAR (POTENTIAL).
AC P31695; Q62389;
FT   TRANSMEM   1444   1464       POTENTIAL.
DT 01-JUL-1993 (Rel. 26, Created)
FT   DOMAIN     1465   1964       CYTOPLASMIC (POTENTIAL).
DT 01-NOV-1997 (Rel. 35, Last sequence update)
FT   DOMAIN       21     60       EGF-LIKE 1.
DT 30-MAY-2000 (Rel. 39, Last annotation update)
FT   DOMAIN       61    112       EGF-LIKE 2.
DE NEUROGENIC LOCUS NOTCH HOMOLOG PROTEIN 4 PRECURSOR (TRANSFORMING
FT   DOMAIN      115    152       EGF-LIKE 3.



DE PROTEIN INT-3). 

GN NOTCH4 OR INT3 OR INT-3.

OS Mus musculus (Mouse).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus. 

RN [1] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=92194507; PubMed=1312643;

RA Robbins J., Blondel B.J., Gallahan D., Callahan R.; 

RT "Mouse mammary tumor gene int-3: a member of the notch gene family 

RT transforms mammary epithelial cells.";

RL J. Virol. 66:2594-2599(1992). 

RN [2] 

RP REVISIONS, SEQUENCE FROM N.A. 

RX MEDLINE=97294599; PubMed=9150355;

RA Gallahan D., Callahan R.;

RT "The mouse mammary tumor associated gene INT3 is a unique member of

RT the NOTCH gene family (NOTCH4).";

RL Oncogene 14:1883-1890(1997). 

RN [3] 

RP SEQUENCE FROM N.A. 

RC TISSUE=LUNG, AND TESTIS;

RX MEDLINE=96281668; PubMed=8681805;

RA Uyttendaele H., Marazzi G., Wu G., Yan Q., Sassoon D., Kitajewski J.;

RT "Notch4/int-3, a mammary proto-oncogene, is an endothelial

RT cell-specific mammalian Notch gene."; 

RL Development 122:2251-2259(1996). 

CC -!- SUBCELLULAR LOCATION: TYPE I MEMBRANE PROTEIN.

CC -!- DISEASE: ACTIVATED INT-3 TRANSFORMS MAMMARY EPITHELIAL CELLS.

CC -!- SIMILARITY: CONTAINS 29 EGF-LIKE DOMAINS.

CC -!- SIMILARITY: CONTAINS 3 LIN/NOTCH REPEATS.

CC -!- SIMILARITY: CONTAINS 6 CDC10/SWI6 REPEATS.

CC -!- SIMILARITY: CONTAINS 6 ANK REPEATS.

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation - 

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).

CC

DR EMBL; M80456; AAB38377.1; -.

DR EMBL; U43691; AAC52630.1; -.

DR PIR; A38072; TVMVT3. 

DR HSSP; P00740; 1IXA. 

DR MGD; MGI:107471; NOTCH4.

DR INTERPRO; IPR000152; -. 

DR INTERPRO; IPR000561; -. 

DR INTERPRO; IPR000800; -. 

DR INTERPRO; IPR001438; -. 

DR INTERPRO; IPR001881; -. 

DR INTERPRO; IPR002110; -. 

DR PFAM; PF00008; EGF; 27. 

DR PFAM; PF00023; ank; 6.

DR PFAM; PF00066; notch; 2.

DR PRINTS; PR00010; EGFBLOOD.

DR PROSITE; PS50088; ANK_REPEAT; 5.

DR PROSITE; PS50297; ANK_REP_REGION; 1.

DR PROSITE; PS00010; ASX_HYDROXYL; 11.

DR PROSITE; PS00022; EGF_1; 28.

DR PROSITE; PS01186; EGF_2; 21.

DR PROSITE; PS01187; EGF_CA; 9.

KW Differentiation; Neurogenesis; Repeat; EGF-like domain; Transmembrane;
KW Glycoprotein; Proto-oncogene; ANK repeat; Signal.
1950


BY SIMILARITY. 




FT   DISULFID    333    342       BY SIMILARITY.
FT   DISULFID   1957   1968       BY SIMILARITY.
FT   DISULFID    352    363       BY SIMILARITY.
FT   DISULFID   1962   1977       BY SIMILARITY.
FT   DISULFID    357    374       BY SIMILARITY.
FT   DISULFID   1979   1988       BY SIMILARITY.
FT   DISULFID    376    385       BY SIMILARITY.
FT   DISULFID   1995   2008       BY SIMILARITY.
FT   DISULFID                     BY SIMILARITY.
FT   DISULFID   2002   2017       BY SIMILARITY.
FT   DISULFID    707              BY SIMILARITY.
FT   DISULFID   2019   2028       BY SIMILARITY.
FT   DISULFID    414              BY SIMILARITY.
FT   CARBOHYD     37     37       N-LINKED (GLCNAC...) (POTENTIAL).
FT   DISULFID    431    442       BY SIMILARITY.
FT   CARBOHYD     96     96       N-LINKED (GLCNAC...) (POTENTIAL).
FT   DISULFID    436    451       BY SIMILARITY.
FT   CARBOHYD    198    198       N-LINKED (GLCNAC...) (POTENTIAL).
FT   DISULFID    453    462       BY SIMILARITY.
FT   CARBOHYD    238    238       N-LINKED (GLCNAC...) (POTENTIAL).
FT   DISULFID                     BY SIMILARITY.
FT   CARBOHYD                     N-LINKED (GLCNAC...) (POTENTIAL).
FT   DISULFID    473              BY SIMILARITY.
FT   CARBOHYD    336    336       N-LINKED (GLCNAC...) (POTENTIAL).
FT   DISULFID    490    499       BY SIMILARITY.
FT   CARBOHYD    400    400       N-LINKED (GLCNAC...) (POTENTIAL).


FT   DISULFID    505    515       BY SIMILARITY.
FT   CARBOHYD    550    550       N-LINKED (GLCNAC...) (POTENTIAL).
FT   DISULFID    509    520       BY SIMILARITY.
FT   CARBOHYD    565    565       N-LINKED (GLCNAC...) (POTENTIAL).
FT   DISULFID    522    531       BY SIMILARITY.
FT   CARBOHYD    736    736       N-LINKED (GLCNAC...) (POTENTIAL).
FT   DISULFID    549    562       BY SIMILARITY.
FT   CARBOHYD    746    746       N-LINKED (GLCNAC...) (POTENTIAL).
FT   DISULFID    556    569       BY SIMILARITY.
FT   CARBOHYD    860    860       N-LINKED (GLCNAC...) (POTENTIAL).
FT   DISULFID    571    580       BY SIMILARITY.
FT   CARBOHYD    884    884       N-LINKED (GLCNAC...) (POTENTIAL).
FT   DISULFID    586    597       BY SIMILARITY.
FT   CARBOHYD    976    976       N-LINKED (GLCNAC...) (POTENTIAL).
FT   DISULFID    591    602       BY SIMILARITY.
FT   CARBOHYD   1102   1102       N-LINKED (GLCNAC...) (POTENTIAL).
FT   DISULFID    604    610       BY SIMILARITY.
FT   CARBOHYD   1114   1114       N-LINKED (GLCNAC...) (POTENTIAL).
FT   DISULFID    613    624       BY SIMILARITY.
FT   CARBOHYD   1138   1138       N-LINKED (GLCNAC...) (POTENTIAL).
FT   DISULFID    618    634       BY SIMILARITY.
FT   CARBOHYD   1192   1192       N-LINKED (GLCNAC...) (POTENTIAL).
FT   DISULFID    636    645       BY SIMILARITY.
FT   CARBOHYD   1245   1245       N-LINKED (GLCNAC...) (POTENTIAL).
FT   DISULFID    652    664       BY SIMILARITY.












FT   DISULFID    659    673       BY SIMILARITY.
Query Match 7.6%; Score 634.5; DB 1; Length 2139;
FT   DISULFID    675    684       BY SIMILARITY.
Best Local Similarity 24.3%; Pred. No. 8.4e-32;
FT   DISULFID    691    702       BY SIMILARITY.
Matches 226; Conservative 91; Mismatches 273; Indels 341; Gaps
FT   DISULFID    696    711       BY SIMILARITY.










FT   DISULFID    713    722       BY SIMILARITY.
Qy 703 IQDFTCD -- -DGNDDNSCSPLSRCP TECTCLD 731
FT   DISULFID    729    740       BY SIMILARITY.
1 1: II ' MM II 1 III
FT   DISULFID    734    749       BY SIMILARITY.
Db 407 INGFSCDCSGTGYTGAFCQTNVDECDKNPCLNGGRCLHTYGWYTCQCLDGWGGEICDRPM 466
FT   DISULFID    751    760       BY SIMILARITY.
FT   DISULFID    767    778       BY SIMILARITY.
Qy 732 --TVVRCSNKGLKVLPKGIPRDVTELYLDGNQFTLVPKELSNYKHLTLIDLSNNRISTLS 789
FT   DISULFID    772    787       BY SIMILARITY.
:| 1 1 1 1 1 Mill: III::
FT   DISULFID    789    799       BY SIMILARITY.
Db 467 TCQTQQCFNGG TCLDKPI- GFQ-CLCPPEYTG-ELCQIAPSCAQQCPID 512
FT   DISULFID    806    817       BY SIMILARITY.












FT   DISULFID    811    826       BY SIMILARITY.
Qy 790 NQSFSNMTQLLTLILSYNRLRCIPPRTFDGLKSLRLLSLHGNDISVVPEGAFNDLSALSH 849
FT   DISULFID    828    837       BY SIMILARITY.
II III :ll :: 1 II h
FT   DISULFID    844    855       BY SIMILARITY.
Db 513 SECVGGKCVCXPGSSGYN-- -CQTSTGDGASALALTPINCN ATNG'KCLNG 559
FT   DISULFID    849    890       BY SIMILARITY.
FT   DISULFID    892    901       BY SIMILARITY.
Qy 850 LAIGANPLYCDCNMQWLSDWVKSEYKEPGIARCAGPGEMADKLLLTTPSKKFTCQGP--- 906
FT   DISULFID    908    919       BY SIMILARITY.
1 :K : : 1 II II: II 1
FT   DISULFID    913    928       BY SIMILARITY.
Db 560 GTCSMNGTHCYCAVGYSGD- RC -- EKAEN CSPLNCQEPMVC 597
FT   DISULFID    930    939       BY SIMILARITY.












FT   DISULFID    946    957       BY SIMILARITY.
Qy 907 VDVNILAK-- CNPCLSNPCKNDG TCNSDPVD 935
FT   DISULFID    952    966       BY SIMILARITY.
1 1 Ml:! 1:1 1 II :! II
FT   DISULFID    968    977       BY SIMILARITY.
Db 598 VQNQCLCPENKVCNQCATQPCQNGGECVDLPNGDYECKCTRGWTGRTCGND-VDECTLHP 656
FT   DISULFID    984    995       BY SIMILARITY.
FT   DISULFID    989   1009       BY SIMILARITY.
Qy 936 - FYRCTCPYGFKGQDCDVPIHACISNPCKHGGTCHLKEGEEDGFWCIC 982
FT   DISULFID   1011   1020       BY SIMILARITY.
1:1 1 II 1 II : 1:1 II :| llll : : hi
FT   DISULFID   1211   1222       BY SIMILARITY.
Db 657 KICGNGICKNEKGSYKCYCTPGFTGVHCDSDVDECLSFPCLNGATCHNK---INAYECVC 713
FT   DISULFID   1216   1231       BY SIMILARITY.












FT   DISULFID   1233   1242       BY SIMILARITY.
Qy 983 ADGFEGENCEVNVDDCEDNDCENNSTCVDGINNYTCLCPPEYTGELCEEKLDFCAQDLNP 1042
FT   DISULFID   1485   1496       BY SIMILARITY.
1:1 llllll :|:| 1 1 1 III:! 111:11 M 1 :h :ll 1 1
FT   DISULFID   1490   1505       BY SIMILARITY.
Db 714 QPGYEGENCEVDIDECGSNPCSNGSTCIDRINNFTCNCIPGMRGRICDIDIDDCVGD--P 771
FT   DISULFID   1507   1516       BY SIMILARITY.
FT   DISULFID   1763   1774       BY SIMILARITY.
Qy 1043 CQHDSKCILTPKGFKCDCT-PGYVGEHCDIDFDDCQDNKCKNGAHCTDAVNGYTCICPEG 1101
FT   DISULFID   1768   1783       BY SIMILARITY.
1 : :|| Ihlll: II Ihl::: hi II III 1 II III 1
FT   DISULFID   1785   1794       BY SIMILARITY.
Db 772 CLNGGQCIDQLGGFRCDCSGTGYEGENCELNIDECLSNPCTNGAKCLDRVKDYFCDCHNG 831
FT   DISULFID   1801   1812       BY SIMILARITY.












FT   DISULFID   1806   1821       BY SIMILARITY.
Qy 1102 YSGLFCE-- FSPP M 1113
FT   DISULFID   1823   1832       BY SIMILARITY.
- i 1 II II 1 :
FT   DISULFID   1839   1850       BY SIMILARITY.
Db 832 YKGKNCEQDINECESNPCQYNGNCLERSNITLYQMSRITDLPKVFSQPFSFENASGYECV 891
FT   DISULFID   1844   1859       BY SIMILARITY.
FT   DISULFID   1861   1870       BY SIMILARITY.
Qy 1114 VLP RTSP 1120
FT   DISULFID   1878   1889       BY SIMILARITY.
:| 1 :|
FT   DISULFID   1883   1903       BY SIMILARITY.
Db 892 CVPGIIGKNCSININECDSNPCSKHGNCNDGIGTYTCECEPGFEGTHCEINIDECDRYNP 951
FT   DISULFID   1905   1914       BY SIMILARITY.
Qy 1121 C DNFDCQ NGAQCI -VRINEP -- - ICQC 1143


CC between the Swiss Institute of Bioinformatics and the EMBL outstation -


1 I " ! I 1 1 1 • 1 1 1 1 I 
i i • ' 1 1 ii i > in ii 


CC the European Bioinformatics Institute. There are no restrictions on its


Db 952 CQRGTCYDQIDDYDCDCDANYGGKNCSVLLKGCDQNPCLNGGACLPYLINEVTHLYTCTC 1011


CC use by non-profit institutions as long as its content is in no way






CC modified and this statement is not removed. Usage by and for commercial


Qy 




CC entities requires a license agreement (See http://www.isb-sib.ch/announce/




I'll-IMI I'll -II -I II 1 -hi 
l • ll • 1 1 1 1 I'll 'll 'ill l • l • l 


CC or send an email to license@isb-sib.ch).


Db 




CC 
























DR EMBL; X56811; CAA40148.1; -.




Qy 1194 LY -- KGDKDH -- IAVELYRGRVRASYDTGSHPASAIYSVETINDGNFHIVELLALDQSL 1248


DR EMBL; M35759; AAA28938.1; -.






1 . i . .ii ii 1 1 1 ' 1 1 ' 1 < • I 

1 > 1 • 'II II 1 1 1 • 1 1 - 1 ■ ' 1 


DR PIR; A36666; A36666.




Db 


1 fififl aPCT1'fiPIfNirDVCVTr,PTTMr!DTM..HCCTT,OTlIPrtWTfiC)rT.HnCHira7V-irviTMT , C- 1 1 1 d 
1UUU nfulluijMDrVdllLEjLlHljIUjPlb njDbLflWIVijyVJ; luOAbftllJOnnillvV fvnlHld 1110 




DR PIR; S16878; S16878.








DR HSSP; P00743; 1WHE.




flv 

WJ 




DR FLYBASE; FBgn0004197; Ser.






1 ■ • • 1 II 1 1 • 1 1 ■ 1 1 1 • 1 1 "II 
1 ■ ■ • 1 ■ II 1 1 • 1 1 • 1 1 1 • 1 1 "II 




DR INTERPRO; IPR000152; -.




Db 


1117 Hf.VT,<!ANnPnATPPVR<!YPTAHN«!np(!irpP, f rYI^RTTP!JI,K<lYT,RHT,THnP 1APVG 1 1 79 


DR INTERPRO; IPR000561; -.










DR INTERPRO; IPR001438; -.




Qy 




DR INTERPRO; IPR001774; -.




I-- :| : I I : : II III I I I 


DR INTERPRO; IPR001881; -.




Db 1173 CMQDIMVNGKWIFPDEQDMISYTKLENVQSGCPRTEQCKPSPCHSNVECTDLWH 1227


DR PFAM; PF01414; DSL; 1.








DR PFAM; PF00008; EGF; 11.






1345 SSQAGFTCECQEGWMGPLCDQRTNDPCLGNK 1375 


DR PRINTS; PR00010; EGFBLOOD.






III : 1 I- 


DR PROSITE; PS00010; ASX_HYDROXYL; 7.




1228 — TFACHCPRPFFGHTCQHNMTAATFGHE 1254 


DR PROSITE; PS00022; EGF_1; 14.








DR PROSITE; PS01186; EGF_2; 8.








DR PROSITE; PS01187; EGF_CA; 5.




RESULT 12 




KW Differentiation; Repeat; EGF-like domain; Transmembrane;
SERR_DROME
KW Glycoprotein; Signal.
ID SERR_DROME STANDARD; PRT; 1408 AA.
FT   SIGNAL        1     83       POTENTIAL.
AC P18168;
FT   CHAIN        84   1408       SERRATE PROTEIN.
DT 01-NOV-1990 (Rel. 16, Created)
FT   DOMAIN       84   1223       EXTRACELLULAR (POTENTIAL).
DT 01-JUL-1993 (Rel. 26, Last sequence update)
FT   TRANSMEM   1224   1249       POTENTIAL.
DT 15-JUL-1999 (Rel. 38, Last annotation update)
FT   DOMAIN     1250   1408       CYTOPLASMIC (POTENTIAL).
DE SERRATE PROTEIN PRECURSOR (BEADED PROTEIN).
FT   DOMAIN      284    317       EGF-LIKE 1.
GN SER OR BD.
FT   DOMAIN      315    349       EGF-LIKE 2.
OS Drosophila melanogaster (Fruit fly).
FT   DOMAIN      351    389       EGF-LIKE 3.


OC Eukaryota; Metazoa; Arthropoda; Tracheata; Hexapoda; Insecta;
FT   DOMAIN      391    489       EGF-LIKE 4.
OC Pterygota; Neoptera; Endopterygota; Diptera; Brachycera; Muscomorpha;
FT   DOMAIN      407    476       SER-RICH (INSERT).
OC Ephydroidea; Drosophilidae; Drosophila.
FT   DOMAIN      491    527       EGF-LIKE 5, CALCIUM-BINDING (POTENTIAL).
RN [1]
FT   DOMAIN      529    609       EGF-LIKE 6, CALCIUM-BINDING (POTENTIAL).
RP SEQUENCE FROM N.A.
FT   DOMAIN      611    646       EGF-LIKE 7, CALCIUM-BINDING (POTENTIAL).
RC STRAIN=OREGON-R;
FT   DOMAIN      648    684       EGF-LIKE 8, CALCIUM-BINDING (POTENTIAL).
RX MEDLINE=91347903; PubMed=1840519;
FT   DOMAIN      686    721       EGF-LIKE 9.
RA Thomas U., Speicher S.A., Knust E.;
FT   DOMAIN      723    797       EGF-LIKE 10.
RT "The Drosophila gene Serrate encodes an EGF-like transmembrane
FT   DOMAIN      737    769       THR-RICH (INSERT).


RT protein with a complex expression pattern in embryos and wing
FT   DOMAIN      799    835       EGF-LIKE 11, CALCIUM-BINDING (POTENTIAL).
RT discs.";
FT   DOMAIN      837    877       EGF-LIKE 12.
RL Development 111:749-761(1991).
FT   DOMAIN      879    915       EGF-LIKE 13.
RN [2]
FT   DOMAIN      917    953       EGF-LIKE 14, CALCIUM-BINDING (POTENTIAL).
RP SEQUENCE FROM N.A.
FT   DISULFID    288    299       BY SIMILARITY.
RX MEDLINE=91099666; PubMed=2125287;
FT   DISULFID    292    305       BY SIMILARITY.
RA Fleming R.J., Scottgale T.N., Diederich R.J., Artavanis-Tsakonas S.;
RT "The gene Serrate encodes a putative EGF-like transmembrane protein
FT   DISULFID    307    316       BY SIMILARITY.
FT   DISULFID    319    330       BY SIMILARITY.
RT essential for proper ectodermal development in Drosophila
FT   DISULFID    325    337       BY SIMILARITY.
RT melanogaster.";
FT   DISULFID    339    348       BY SIMILARITY.
RL Genes Dev. 4:2188-2201(1990).
FT   DISULFID    355    367       BY SIMILARITY.


CC -!- FUNCTION: ESSENTIAL FOR PROPER ECTODERMAL DEVELOPMENT. SERRATE
FT   DISULFID    361    377       BY SIMILARITY.
CC MAY REPRESENT AN ELEMENT IN A NETWORK OF INTERACTING MOLECULES
FT   DISULFID    379    388       BY SIMILARITY.
CC OPERATING AT THE CELL SURFACE DURING THE DIFFERENTIATION OF
FT   DISULFID    395    406       BY SIMILARITY.
CC CERTAIN TISSUES.
FT   DISULFID    400    477       BY SIMILARITY.
CC -!- SUBCELLULAR LOCATION: TYPE I MEMBRANE PROTEIN.
FT   DISULFID    479    488       BY SIMILARITY.
CC -!- TISSUE SPECIFICITY: APPEARS TO BE RESTRICTED EXCLUSIVELY TO
FT   DISULFID    495    506       BY SIMILARITY.
CC CELLS OF ECTODERMAL ORIGIN.
FT   DISULFID    500    515       BY SIMILARITY.
CC -!- MISCELLANEOUS: SEPARATION OF NEUROBLASTS FROM THE ECTODERM INTO
FT   DISULFID    517    526       BY SIMILARITY.
CC THE INNER PART OF EMBRYO IS ONE OF THE FIRST STEPS OF CNS
FT   DISULFID    533    588       BY SIMILARITY.


CC DEVELOPMENT IN INSECTS. THIS PROCESS IS UNDER CONTROL OF THE
FT   DISULFID    582    597       BY SIMILARITY.
CC NEUROGENIC GENES.
FT   DISULFID    599    608       BY SIMILARITY.
CC -!- MISCELLANEOUS: NOTCH AND SERRATE MAY INTERACT AT THE PROTEIN
FT   DISULFID    615    625       BY SIMILARITY.
CC LEVEL. IT IS CONCEIVABLE THAT THE SERRATE AND DELTA PROTEINS MAY
FT   DISULFID    619    634       BY SIMILARITY.
CC COMPETE FOR BINDING WITH THE NOTCH PROTEIN.
FT   DISULFID    636    645       BY SIMILARITY.
CC -!- SIMILARITY: CONTAINS 14 EGF-LIKE DOMAINS.
FT   DISULFID    652    663       BY SIMILARITY.
CC -!- SIMILARITY: BELONGS TO THE DELTA/SERRATE/JAGGED FAMILY.
FT   DISULFID    657    672       BY SIMILARITY.
CC
FT   DISULFID    674    683       BY SIMILARITY.
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
FT   DISULFID    690    700       BY SIMILARITY.
FT   DISULFID    695    709       BY SIMILARITY.




Qy 1275 LYVGGMPGKSNVASLRQAPGQNGTSFHGCIRNLYINSELQDFQKVPM-QTGILPGCE--- 1330
FT   DISULFID    711    720       BY SIMILARITY.
1:1 III 1 :: II ; 1 1 II
FT   DISULFID    803    814       BY SIMILARITY.
Db
FT   DISULFID    808    823       BY SIMILARITY.
FT   DISULFID    825    834       BY SIMILARITY.
Qy 1331 -PCHKKVCAHGTCQPSSQAGFTCECQEGWMGPLCDQRTNDPCLGNKCVH-GTCL---PIN 1385
FT   DISULFID    841    852       BY SIMILARITY.
1 1 • 1 1 1 1 1 1 1 1 1 1 • 1 1 • 1 • 1 1 1 • 1
FT   DISULFID    846    865       BY SIMILARITY.
Db 801 NECSPNPCRNGGICLDGDGDFTCECMSGWTGKRCSERATG-CYAGQCQNGGTCMPGAPDK 859
FT   DISULFID    867    876       BY SIMILARITY.
FT   DISULFID    883    894       BY SIMILARITY.
Qy 1386 AFSYSCKCLEGHGGVLCDEEEDLFNPCQAIKCKHGKCRLSGLGQPYCECSSGYTGDSCDR 1445
FT   DISULFID    888    903       BY SIMILARITY.
1 |:| | |: | | [ |; | ;| || | | |; |;;| |
FT   DISULFID    905    914       BY SIMILARITY.
Db 860 ALQPHCRCAPGWTGLFCAEAID---QCRGQPCHNGGTCESGAGWFRCVCAQGFSGPDC-- 914
FT   DISULFID    921    932       BY SIMILARITY.
FT   DISULFID    926    941       BY SIMILARITY.
Qy 1446 EISCRGERIRDYYQKQQGYAACQTTKKVSRLECRGGCA GGQCC 1488
FT   DISULFID    943    952       BY SIMILARITY.






II : 1 hll II 1 


FT 


CARBOHYD 


152 


152 


N-LINKED (GLCNAC. . 


.) (POTENTIAL) . 


Db 


915 RI" NVNECSPQPCQGGATCIDGIGGYSC 941 


FT 


CARBOHYD 


196 


196 


N-LINKED (GLCNAC, . 


.) (POTENTIAL). 






FT 


CARBOHYD 


247 


247 


N-LINKED (GLCNAC. . 


.) (POTENTIAL). 






FT


CARBOHYD 


331 


331 


N-LINKED (GLCNAC. . 


.) (POTENTIAL). 


RESULT 13


1 


CARBOHYD 


412 


412 


N-LINKED (GLCNAC, . 


.) (POTENTIAL), 


LI12_CAEEL


f 


CARBOHYD 


452 


452 


N-LINKED (GLCNAC. . 


.) (POTENTIAL). 


ID 


LI12_CAEEL STANDARD; PRT; 1429 AA. 


FT 


CARBOHYD 


558 


558 


N-LINKED (GLCNAC. . 


.) (POTENTIAL). 


AC 


P14585; 


FT 


CARBOHYD 


739 


739 


N-LINKED (GLCNAC. . 


.) (POTENTIAL), 


DT 


01-JAN-1990 (Rel. 13, Created)


FT 


CARBOHYD 


965 


965 


N-LINKED (GLCNAC. . 


.) (POTENTIAL). 


DT 


01-JAN-1990 (Rel. 13, Last sequence update) 


FT 


CARBOHYD 


977 


977 


N-LINKED (GLCNAC. . 


.) (POTENTIAL). 


DT 


01-OCT-2000 (Rel. 40, Last annotation update) 


FT 


CARBOHYD 


1004 


1004 


N-LINKED (GLCNAC, , 


.) (POTENTIAL). 


DE 


LIN-12 PROTEIN PRECURSOR. 


FT 


CARBOHYD 


1030 


1030 


N-LINKED (GLCNAC. . 


.) (POTENTIAL). 


GN 


LIN-12 OR R107.8. 


FT 


CARBOHYD 


1150 


1150 


N-LINKED (GLCNAC, , 


.',) (POTENTIAL). 


OS 


Caenorhabditis elegans. 


FT 


CONFLICT 


14 


17 


MISSING (IN REF. 2).


OC 


Eukaryota; Metazoa; Nematoda; Chromadorea; Rhabditida; Rhabditoidea; 


FT 


CONFLICT 


27 


27 


P -> A (IN REF. 2).




OC 


Rhabditidae; Peloderinae; Caenorhabditis.


FT 


CONFLICT 


1352 


1352


T -> S (IN REF. 2). 




RN 


[1] 


SQ 


SEQUENCE 


1408 AA; 150660 MW; 569DA4270A9C7840 


RP 


SEQUENCE FROM N.A.



Query Match 7.5%; Score 625.5; DB 1; Length 1408;

Best Local Similarity 24.2%; Pred. No. Ue-31;

Matches 186; Conservative 76; Mismatches 257; Indels 249; Gaps 30; 

Qy 853 GANPLYCDCNMQWLSDWVKSEYKEPGIARC - - AGPGEMADKLLLTTPSKKFTCQGPV- - ■ 907 

I :|:: h II I I : :: :: ] I :| I 

Db 291 GCDPVHGKCD RPGECECRPGWRGPLCNECMVYPGCKHGSCNGSAWKC 337 

Qy 908 "DVN— ILAKCNPCLS NPCKNDGTCNSDPVDFYRCTCPYGFKGQDCDVPIHAC 957 

II III: I: III: III : I Mill I |: |:: I I 
Db 338 VCDTNWGGIL--CDQDLNFCGTHEPCKHGGTCENTAPDKYRCTCAEGLSGEQCEIVEHPC 395 



f 



Qy 958 ISNPCKHGGTCHLK EGE 974

: ll::MII II ■ II: ! 

Db 396 ATRPCRNGGTCTLKTSNRTQAQVYRTSHGRSNMGRPVRRSSSMRSLDHLRPEGQALNGSE 455



Qy 975 EDGFWCICADGFEGENCEVNVDDCEDNDCENNSTCVDGINNYTC 1018

I I II I: I ll:|:|:| lh Ihl I : I 
Db 456 SSGLVSLGSLQLQQQLAPDFTCDCAAGWTGPTCEINIDECAGGPCEHGGTCIDLIGGFRC 515

Qy 1019 LCPPEYTGELCEEKLDFC 1036 

Mil: h:|: :: I 

Db 516 ECPPEWHGDVCQVDVNECEAPHSAGIAANALLTTTATAIIGSNLSSTALLAALTSAVAST 575 

Qy 1037 AQDLNPCQHDSKCILTPKGFKCDCTPGYVGEHCDIDFDDCQDNKCKNGAHCTDAVNGYTC 1096 

: : II : :| I I I I |: II : III :|:lll I Ml I I 
Db 576 SLAIGPCINAKECRNQPGSFACICKEGWGGVTCAENLDDCV-GQCRNGATCIDLVNDYRC 634 

Qy 1097 ICPEGYSGLFCEFSPPMVLPRTSPCDNFDCQNGAQCIVRINEPICQCLPGYQGEKCEKLV 1156 

I h:l II I 1:1 :|: : : II II I ||: 

Db 635 ACASGFTGRDCETD IDECATSPCRNGGECVDMVGKFNCICPLGYSGSLCEEA- 686 

Qy 1157 SVNFINKESYLQIPSAKVRPQTNITLQIATDEDSGILLYKGDKDHIAVELYRGRVRASYD 1216 

II: I I : :| I : II 
Db 687 KENCTPSPCL EGHCLNTPEGYYCHCPPD RA-- 716 

Qy 1217 TGSH--PASAIYSVETINDGNFHIVELLALDQSLSLSVDGGNPKIITNLSKQSTLNFDSP 1274

II : I 1:1 I II I : :| I 

Db 717 -GKHCEQLRPLCSQPPCNEGCFANVSL ATSATTTTTTTTTATTTRKMAKP 765 



RC STRAIN=BRISTOL N2;

RX MEDLINE=88334747; PubMed=3419531;

RA Yochem J., Weston K., Greenwald I.;

RT "The Caenorhabditis elegans lin-12 gene encodes a transmembrane

RT protein with overall similarity to Drosophila Notch.";

RL Nature 335:547-550(1988).

RN [2]

RP SEQUENCE FROM N.A.

RC STRAIN=BRISTOL N2;

RX MEDLINE=94150718; PubMed=7906398;

RA Wilson R., Ainscough R., Anderson K., Baynes C., Berks M.,

RA Bonfield J., Burton J., Connell M., Copsey T., Cooper J., Coulson A.,

RA Craxton M., Dear S., Du Z., Durbin R., Favello A., Fraser A.,

RA Fulton L., Gardner A., Green P., Hawkins T., Hillier L., Jier M.,

RA Johnston L., Jones M., Kershaw J., Kirsten J., Laisster N.,

RA Latreille P., Lightning J., Lloyd C., Mortimore B., O'Callaghan M.,

RA Parsons J., Percy C., Rifken L., Roopra A., Saunders D., Shownkeen R.,

RA Sims M., Smaldon N., Smith A., Smith M., Sonnhammer E., Staden R.,

RA Sulston J., Thierry-Mieg J., Thomas K., Vaudin M., Vaughan K.,

RA Waterson R., Watson A., Weinstock L., Wilkinson-Sproat J.,

RA Wohldman P.;

RT "2.2 Mb of contiguous nucleotide sequence from chromosome III of C.

RT elegans.";

RL Nature 368:32-38(1994).

CC -!- FUNCTION: LIN-12 IS INVOLVED IN SEVERAL CELL FATE DECISIONS THAT
CC REQUIRE CELL-CELL INTERACTIONS. IT IS POSSIBLE THAT LIN-12
CC ENCODES A MEMBRANE-BOUND RECEPTOR FOR A SIGNAL THAT ENABLES
CC EXPRESSION OF THE VENTRAL UTERINE PRECURSOR CELL FATE.

CC -!- SUBCELLULAR LOCATION: TYPE I MEMBRANE PROTEIN.

CC -!- SIMILARITY: HIGH, TO C. ELEGANS GLP-1.

CC -!- SIMILARITY: CONTAINS 13 EGF-LIKE DOMAINS.

CC -!- SIMILARITY: CONTAINS 3 LIN/NOTCH REPEATS.

CC -!- SIMILARITY: CONTAINS 6 ANK REPEATS.

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).

CC
DR


EMBL; M12069; AAA70191.1; -. 




FT 


DISULFID 


375' 


90 BY SIMILARITY. 




DR 


EMBL; Z14092; CAA78474.1; -. 




FT 


DISULFID 


392; 


01 BY SIMILARITY. 




DR 


PIR; S06434; S06434. 






DISULFID 


408' 


19 BY SIMILARITY. 




DR 


HSSP; P00740; 1IXA. 




FT 


DISULFID 


413 : 


29 BY SIMILARITY. 




DR 


WORMPEP; 


R107.8; CE00274. 






DISULFID 


431. 


40 BY SIMILARITY. 




DR 


INTERPRO;


IPR000152; -. 




FT 


DISULFID 


507: 


18 BY SIMILARITY. 




DR 


INTERPRO;


IPR000561; -. 




FT 


DISULFID 


512 v 


29 BY SIMILARITY. 




DR 


INTERPRO;


IPR000800; -.




FT 


DISULFID 


531: 


40 BY SIMILARITY, 




DR 


INTERPRO;


IPR001881; -. 




FT 


DISULFID 


547' 


58 BY SIMILARITY. 




DR 


INTERPRO;


IPR002110; -.




FT 


DISULFID 


552: 


67 BY SIMILARITY. 




DR 


PFAM; PF00008; EGF; 13.




FT 


DISULFID 


569 : 


78 BY SIMILARITY. 




DR 


PFAM; PF00023; ank; 4. 




FT 


DISULFID 


586; 


97 BY SIMILARITY. 




DR 


PFAM; PF00066; notch; 3. 




FT 


DISULFID 


591' 


07 BY SIMILARITY. 




DR 


PROSITE; 


PS50088; ANK_REPEAT; 3.


FT 


DISULFID 


609; 


18 BY SIMILARITY, 




DR 


PROSITE; PS50297; ANK_REP_REGION; 1.


FT 


CARBOHYD 


41' 


41 N-LINKED (GLCNAC. 


. .) (POTENTIAL). 


DR 


PROSITE; PS00010; ASX_HYDROXYL; 3. 




CARBOHYD 


165' 


65 N-LINKED (GLCNAC. 


. ,) (POTENTIAL), 


DR 


PROSITE; PS00022; 


EGF_1; 12. 




FT 




194. 


94 N-LINKED (GLCNAC. 


. .) (POTENTIAL). 


DR 


PROSITE; PS01186; EGF_2; 11.




FT 


CARBOHYD 


378' 


78 N-LINKED (GLCNAC. 


. .) (POTENTIAL). 


DR 


PROSITE; PS01187; 


EGF_CA; 2.




FT 


CARBOHYD 


515' 


15 N-LINKED (GLCNAC. 


. .) (POTENTIAL). 


KW 


Differentiation; 


Repeat; ANK 


repeat; EGF-like domain; Transmembrane; 


FT 


CARBOHYD 


623.' 


23 N-LINKED (GLCNAC. 


. .) (POTENTIAL). 


KW 


Glycoprotein; Signal. 




FT 


CARBOHYD 


751. 


51 N-LINKED (GLCNAC. 


. .) (POTENTIAL). 


FT


SIGNAL 


1 


15 


POTENTIAL, 


FT 


CARBOHYD 


754 


54 N-LINKED (GLCNAC. 


. .) (POTENTIAL). 




CHAIN 


16 


1429 


LIN-12 PROTEIN.


FT 


CARBOHYD 


900' 


00 N-LINKED (GLCNAC. 


. .) POTENTIAL). 


FT


DOMAIN 


16 


908 


EXTRACELLULAR (POTENTIAL). 


SQ


SEQUENCE 


1429 AA;


157115 MW; 255EDD7A62C025DB CRC64; 


FT 


TRANSMEM 


909 


931 


POTENTIAL. 














DOMAIN 


932 


1429 


CYTOPLASMIC (POTENTIAL). 












FT 


DOMAIN 


24 


618 


13 X EGF-TYPE REPEATS. 


Query Match 




7.3%; Score 606; DB 1; 


Length 1429; 


FT 


DOMAIN 


631 


750 


3 X LIN/NOTCH REPEATS . 


Best Local Similarity


23.6%; Pred. No. 3e-30; 


FT 


DOMAIN 


1046 


1266 


6 X ANK MOTIF REPEATS. 


Matches 216; 


Conservative 110; Mismatches 289; Indels 302; Gaps 44; 


FT 


DOMAIN 


20 


61 


EGF-LIKE 1. 












FT 


DOMAIN 


114 


150 


EGF-LIKE 2. 


Qy 


648 TLHSLSTLNLL— 


- - - - ANPFN CNCYLAWLGEWLRKKRIVTGNPRCQKPYF 692 




DOMAIN 


152 


190 


EGF-LIKE 3, CALCIUM-BINDING (POTENTIAL). 




:|| 


1 1 1: 


III 1 1 : II: 


:: h =1 1 


FT 


DOMAIN 


201 


246 


EGF-LIKE 4. 


Db 


18 SLHIGSCLGLICGRNGHCHAGPVNGTQTSYWCRCDEGFGGEYCEQQCDVSKCGADERCVF 77 


FT 


DOMAIN 


250 


285 


EGF-LIKE 5. 












FT 


DOMAIN 


287 


323 


EGF-LIKE 6. 


Qy 


693 LKEIPIQDVAIQDFTCD DGNDDNSCSPLSRCPTECTCLDTWRCSN 738 


FT 


DOMAIN 


323 ^ 363 


EGF-LIKE 7. 


1: 


:: :\ 


II 1 II 


1 h 1 :l 1 1 


FT 


DOMAIN 


365 


402 


EGF-LIKE 8, CALCIUM-BINDING (POTENTIAL). 


Db 


78 DKDYRMETCVCKD- 


CDINGNSLLKPSCPSGYGGDD 


--CKTQGWCYPSV-CMN 125 


FT 


DOMAIN 


404 


441 


EGF-LIKE 9. 












FT 


DOMAIN 


449 


492 


EGF-LIKE 10. 


Qy 


739 KGLKVLPKGIPRDVTELYLDGNQFTLVPKELSNYKHLTLIDLSNNRISTLSNQSFSNMTQ 798 


FT 


DOMAIN 


503 


541 


EGF-LIKE 11. i 




1 








FT 


DOMAIN 


543 


579 


EGF-LIKE 12. 


Db 


126 GG— 






127 




DOMAIN 


582 


619 


EGF-LIKE 13. 












FT 


REPEAT 


635 


669 


LIN/NOTCH 1. 


Qy 


799 LLTLILSYNRLRCIPPRTFDGLKSLRLLSLHGNDISVVPEGAFKDLSALSHLAIGANPLY 858


FT 


REPEAT 


670 


710 


LIN/NOTCH 2. 


1 


• lh! 


1 II 1 1 II: 


1 : :: 1 


FT 


REPEAT 


711 


750 


LIN/NOTCH 3. 


Db 


128 --QCIGAGNRAKCACP---DGFKGER-CELDVNECEENKNACGNRSTCMNTL GTYI 177 


FT 


REPEAT 


1046 


1078 


ANK MOTIF 1. 












FT 


REPEAT 


1079 


1119 


ANK MOTIF 2. 


Qy 


859 CDCNMQWLSDWVKSEYKEPGIARCAGPGEMADKLLLTTPSKKFTCQGPVDVNILAKCNPC 918 


FT 


REPEAT 


1120 


1152 


ANK MOTIF 3. 




1 1 


:| ' 


1 II : II 


1 : 1 1 


FT 


REPEAT 


1153 


1188 


ANK MOTIF 4. 


Db 


178 CVCPQGFLP— - 


PDCLKPGNTS T VEFKQPVC — FLEIS ADHPDG 216 


FT


REPEAT 


1189 


1232 


ANK MOTIF 5. 














REPEAT 


1233 


1266 


ANK MOTIF 6. 


Qy 


919 LSNPCKNDGTCNSDPVDFYRCTCPYGFKGQDCDV- - PIHACISNPCKHGGTCHLKEGEED 976 


1 


DISULFID 


24 


35 


BY SIMILARITY.


1 hi 1 J: 


:| II I: 1 I:: :| 


MM III 1 


FT 


DISULFID 


29 


49 


BY SIMILARITY.


Db 


217 RSMYCQNGGFCDKAS - - - SKCQCPPG YHGSTCELLEKEDSCASNPCSH - GVCISFSG - - - 269 


FT 


DISULFID


51 


60 


BY SIMILARITY. 












FT 


DISULFID


118 


129 


BY SIMILARITY. 


Qy 


977 GFWCICADGFEGENCEVNVDDCEDNDCENNSTCVDGINNYTCLCPPEYTGELCEEKLDFC 1036 


FT 


DISULFID


123 


138 


BY SIMILARITY. 


II II 


1 lh 1 


: hi :| II 1 |::hhl 


Mil II 1 Ihl 1 


FT 


DISULFID


140 


149 


BY SIMILARITY. 


Db 


270 GFQCICDDGYSGSYCQEGKDNCVNNKCEAGSKCINGVNSYFCDCPPERTGPYC-EKMD-C 327 


FT 


DISULFID 


156 


169 


BY SIMILARITY. 












FT 


DISULFID 


163 


178 


BY SIMILARITY. 


Qy 


1037 AQDLNPCQHDSKCI 


- -LTPKGFKCDCTPGYVGEHCDIDFDDC-QDNKCKNGAHCTDAVN 1092 


FT 


DISULFID 


180 


189 


BY SIMILARITY. 






11:11 


1: 1 1:1 1 III 1 1:1: 


1 :l 1 1 1 : 


FT 


DISULFID 


205 


227 


BY SIMILARITY. 


Db 


328 SAIPDICNHGT-CIDSPLSEKAFECQCEPGYEGILCEQDKNECLSENMCLNNGTCVNLPG 386 




DISULFID 


221 


234 


BY SIMILARITY. 












FT 


DISULFID 


236 


245 


BY SIMILARITY. 


Qy 


1093 GYTCICPEGYSGLFCEFSPPMVLPRTSPCDNFDCQNGAQCI-VRINEPICQCLPGYQGEK 1151 


FT 


DISULFID 


254 


264 


BY SIMILARITY. 




: 1 


1 h-l : 


: 1: : 1 :l |:| |: 


: hill h h: 


FT 


DISULFID 


259 


273 


BY SIMILARITY. 


Db 


387 SFRCDCARGFGGKWCD--EPL NMCQDFHCENDGTCMHTSDHSPVCQCKNGFIGKR 439 


FT 


DISULFID 


275 


284 


BY SIMILARITY. 












FT 


DISULFID 


291 


302 


BY SIMILARITY. 


Qy 


1152 CEKLVSVNFINKESYLQIPSAKVRPQTNITLQIATDEDSGILLYKGDKDHIAVELYRGRV 1211


FT 


DISULFID 


296 


311 


BY SIMILARITY. 




III 


: 1 . 


II :: hi II 


:| 1 


FT 


DISULFID 


313 


322 


BY SIMILARITY. 


Db 


440 CEKECPIGF-- — 


GGVR--CDLRLEI GICSRQGGK 468 


FT 


DISULFID 


327 


339 


BY SIMILARITY. 












FT 


DISULFID 


334 


351 


BY SIMILARITY. 


Qy 


1212 RASYDTGSHPASAIYSVETINDGNFHIVELLALDQSLSLSVDGGNPKIITNLSKQSTLNF 1271 


FT 


DISULFID 


353 


362 


BY SIMILARITY. 








:| 1 :: 


FT 


DISULFID 


369 


381 


BY SIMILARITY. 


Db 


469 




CFNGG--KCLSGFC V 481 



Qy 1272 DSPLYVGGMPGKSNVASLRQAPGQNGTSFHGCIRNLYINSELQDFQKVPMQTGILPGCEP 1331

1:1 | : : |:: | : |: ;| 

Db 482 CPPDFTG NQCEVNRKNGKSSLSENLCL SDP 511 

Qy 1332 CHKKVCAHGTC-QPSSQAGFTCECQEGWMGPLCDQRTNDPCLGNKCVHGTCLPINAFSYS 1390 

I : II : I: I |::|: I :| :| I II I I :| : hi 

Db 512 CM----NATCIDVDAHIGYACICKQGFEGDIC-ERHKDLCLENPCSNGGVCHQHRESFS 566 

Qy 1391 CKCLEGHGGVLCDEEEDLFNPCQAIKCKHGKCRLSGL GQPYCECSSGYTGDSC 1443 

I I I I I:: : :| |: |: . |: II |::| I 

Db 567 CDCPPGFYGNGCEQE KMFRCLKSTCQNGGVCINEEEKGRK-CECSYGFSGARC 618 

Qy 1444 DREISCRGERI RDY YQKQQGY AACQTT KKVSRL - - -ECRGGC AGGQCCGPLRS 1493 

: :|; I :| : : |: I I I I II I I | 

Db 619 EEKINLTGFTEKDSLLR — SVCEKRKCSERANDGNCDADCNYAACKFDGGDCSG - ■ ■ \ 670 



Qy 1494 KRRKYSFECTDGSSFVD 1510

II :l :| I: I
Db 671 KREPFS-KCRYGNMCAD 686



RESULT 14 

FBP3_STRPU

ID FBP3_STRPU STANDARD; PRT; 570 AA.

AC P49013;

DT 01-FEB-1996 (Rel. 33, Created)

DT 01-FEB-1996 (Rel. 33, Last sequence update)

DT 01-NOV-1997 (Rel. 35, Last annotation update)

DE FIBROPELLIN C PRECURSOR (EPIDERMAL GROWTH FACTOR-RELATED PROTEIN 3)

DE (EGF III) (FIBROPELLIN III).

GN EGF3.

OS Strongylocentrotus purpuratus (Purple sea urchin).

OC Eukaryota; Metazoa; Echinodermata; Eleutherozoa; Echinozoa;

OC Echinoidea; Euechinoidea; Echinacea; Echinoida; Strongylocentrotidae;

OC Strongylocentrotus.

RN [1]

RP SEQUENCE FROM N.A.

RC TISSUE=GASTRULA;

RX MEDLINE=93273088; PubMed=8500658;

RA Bisgrove B.W., Raff R.A.;

RT "The SpEGF III gene encodes a member of the fibropellins: EGF repeat-

RT containing proteins that form the apical lamina of the sea urchin

RT embryo.";

RL Dev. Biol. 157:526-538(1993). 

CC -!- FUNCTION: FORM THE APICAL LAMINA, A COMPONENT OF THE EXTRACELLULAR

CC MATRIX.
CC -!- SUBCELLULAR LOCATION: EXTRACELLULAR.

CC -!- DEVELOPMENTAL STAGE: LOW LEVELS IN UNFERTILIZED EGGS AND DURING

CC EARLY CLEAVAGE, THEN RAPIDLY INCREASES IN ABUNDANCE BETWEEN LATE

CC MORULA AND MESENCHYME BLASTULA STAGES TO MAXIMAL LEVELS MAINTAINED

CC THROUGH SUBSEQUENT STAGES.

CC -!- MISCELLANEOUS: EXPRESSED BOTH MATERNALLY AND ZYGOTICALLY.

CC -!- SIMILARITY: CONTAINS 8 EGF-LIKE DOMAINS.

CC -!- SIMILARITY: CONTAINS 1 CUB DOMAIN.

CC -!- SIMILARITY: THE C-TERMINAL DOMAIN OF THIS PROTEIN IS SIMILAR

CC TO AVIDIN/STREPTAVIDIN.

CC --

CC This SWISS-PROT entry is copyright. It is produced through a collaboration

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).

CC

DR EMBL; L07045; AAA30045.1; -.
DR HSSP; P00740; 1IXA.
DR INTERPRO; IPR000088;
DR INTERPRO; IPR000152;
DR INTERPRO; IPR000W1;
DR INTERPRO; IPR000859;
DR INTERPRO; IPR001438;
DR INTERPRO; IPR001881; -.
DR PFAM; PF01382; Avidin; 1.
DR PFAM; PF00431; CUB; 1.
DR PFAM; PF00008; EGF; 8.
DR PRINTS; PR00010; EGFBLOOD.
DR PROSITE; PS00010; ASX_HYDROXYL; 8.
DR PROSITE; PS00022; EGF_1; 8.
DR PROSITE; PS00577; AVIDIN; 1.
DR PROSITE; PS01180; CUB; 1.
DR PROSITE; PS01186; EGF_2; 7.
DR PROSITE; PS01187; EGF_CA; 6.
KW Biotin; EGF-like domain; Repeat; Signal; Glycoprotein.



FT   SIGNAL        1     17       POTENTIAL.
FT   CHAIN        18    570       FIBROPELLIN C.
FT   DOMAIN       18     55       EGF-LIKE 1.
FT   DOMAIN       62    175       CUB.
FT   DOMAIN      176    212       EGF-LIKE 2, CALCIUM-BINDING (POTENTIAL).
FT   DOMAIN      214    250       EGF-LIKE 3, CALCIUM-BINDING (POTENTIAL).
FT   DOMAIN      252    288       EGF-LIKE 4, CALCIUM-BINDING (POTENTIAL).
FT   DOMAIN      290    326       EGF-LIKE 5, CALCIUM-BINDING (POTENTIAL).
FT   DOMAIN      328    364       EGF-LIKE 6, CALCIUM-BINDING (POTENTIAL).
FT   DOMAIN      366    402       EGF-LIKE 7.
FT   DOMAIN      404    440       EGF-LIKE 8, CALCIUM-BINDING (POTENTIAL).
FT   DOMAIN      442    570       AVIDIN-LIKE.
FT   DISULFID     23     34       BY SIMILARITY.
FT   DISULFID     28     43       BY SIMILARITY.
FT   DISULFID     45     54       BY SIMILARITY.
FT   DISULFID    180    191       BY SIMILARITY.
FT   DISULFID    185    200       BY SIMILARITY.
FT   DISULFID    202    211       BY SIMILARITY.
FT   DISULFID    218    229       BY SIMILARITY.
FT   DISULFID    223    238       BY SIMILARITY.
FT   DISULFID    240    249       BY SIMILARITY.
FT   DISULFID    256    267       BY SIMILARITY.
FT   DISULFID    261    276       BY SIMILARITY.
FT   DISULFID    278    287       BY SIMILARITY.
FT   DISULFID    294    305       BY SIMILARITY.
FT   DISULFID    299    314       BY SIMILARITY.
FT   DISULFID    316    325       BY SIMILARITY.
FT   DISULFID    332    343       BY SIMILARITY.
FT   DISULFID    337    352       BY SIMILARITY.
FT   DISULFID    354    363       BY SIMILARITY.
FT   DISULFID    370    381       BY SIMILARITY.
FT   DISULFID    375    390       BY SIMILARITY.
FT   DISULFID    392    401       BY SIMILARITY.
FT   DISULFID    408    419       BY SIMILARITY.
FT   DISULFID    413    428       BY SIMILARITY.
FT   DISULFID    430    439       BY SIMILARITY.
FT   CARBOHYD     30     30       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    136    136       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    357    357       N-LINKED (GLCNAC...) (POTENTIAL).



SQ SEQUENCE 570 AA; 61116 MW; BE665E3E1C05E6EE CRC64; 



Query Match ; 6.6%; Score 549.5; DB 1; Length 570; 

Best Local Similarity 39.7%; Pred. No. 3.2e-27; 

Matches 96; Conservative 33; Mismatches 98; Indels 15; Gaps 5; 

Qy 918 CLSNPCKNDGTCNSDPVDFYRCTCPYGFKGQDCDVPIHACISNPCKHGGTCHLKEGEEDG 977 

I III I II I I: hi II I :!: I M ||::|| I : :| 
Db 180 CTPNPCLNGATC-VDQVNDYQCICAPGFTGDNCETDIDECASAPCRNGGAC--VDQVNG 235 

Qy 978 FWCICADGFEGENCEVNVDDCEDNDCENNSTCVDGINNYTCLCPPEYTGELCEEKLDFCA 1037 

: I I II I III h::| I I MINI : I I I III III :: II 
Db 236 YTCNCIPGFNGVNCENNINECASIPCLNGGICVDGINQFACTCLPGYTGILCETDINECA 295 

Qy 1038 QDLNPCQHDSKCILTPKGFKCDCTPGYVGEHCDIDFDDCQDNKCKNGAHCTDAVNGYTCI 1097 

' ' : II I : II ' : III h I :|: : ::| MM II hll I 
Db 296 S-SPCQNGGSCTDAVNRYTCDCRAGFTGSNCETNINECASSPCLNGGSCLDGVDGYVCQ 353 

Qy 1098 CPEGYSGLFCEFSPPMVLPRTSPCDNFDCQNGAQCIVRINEPICQCLPGYQGEKCEKLVS 1157 

I hi II I I : Mil I : :hlllll I II : 

Db 354 CLPNYTGTHCSIS LDACASLPCQNGGVCTNVGGDYVCECLPG YTG INCE - - ID 404 
Qy 1158 VN 1159
:|

Db 405 IN 406
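
The percentages printed on each result's summary lines can be reproduced from the counts printed next to them. The following is a minimal Python sketch, not GenCore's own code: it assumes that Query Match is the hit score divided by the perfect score (8316 for this query) and that Best Local Similarity is the number of matches divided by all aligned columns (matches, conservative substitutions, mismatches and indels). Both assumptions agree with the figures printed for the results in this listing; the function names are hypothetical.

```python
def query_match(score: float, perfect_score: float) -> float:
    """Percent of the perfect (self-alignment) score reached by a hit.

    Assumption: Query Match = 100 * score / perfect score.
    """
    return round(100.0 * score / perfect_score, 1)


def best_local_similarity(matches: int, conservative: int,
                          mismatches: int, indels: int) -> float:
    """Percent of aligned columns that are exact matches.

    Assumption: the denominator is the sum of all four printed counts.
    """
    columns = matches + conservative + mismatches + indels
    return round(100.0 * matches / columns, 1)


if __name__ == "__main__":
    # Counts taken from the FBP3_STRPU result (Score 549.5, Matches 96, ...).
    print(query_match(549.5, 8316))
    print(best_local_similarity(96, 33, 98, 15))
```

Recomputing these values is a quick way to catch digits that were damaged in scanning, since a percentage that disagrees with its own counts points to a misread character.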



RESULT 15
GLP1_CAEEL
ID GLP1_CAEEL STANDARD; PRT; 1295 AA.
AC P13508;
DT 01-JAN-1990 (Rel. 13, Created)
DT 01-JAN-1990 (Rel. 13, Last sequence update)
DT 01-OCT-2000 (Rel. 40, Last annotation update)
DE GLP-1 PROTEIN PRECURSOR.
GN GLP-1 OR EMB-33 OR F02A9.6.
OS Caenorhabditis elegans.
OC Eukaryota; Metazoa; Nematoda; Chromadorea; Rhabditida; Rhabditoidea;
OC Rhabditidae; Peloderinae; Caenorhabditis.
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=BRISTOL N2;
RX MEDLINE=89336787; PubMed=2758466;
RA Yochem J., Greenwald I.;
RT "glp-1 and lin-12, genes implicated in distinct cell-cell
RT interactions in C. elegans, encode similar transmembrane proteins.";
RL Cell 58:553-563(1989).
RN [2]
RP SEQUENCE FROM N.A.
RC STRAIN=BRISTOL N2;
RX MEDLINE=94150718; PubMed=7906398;
RA Wilson R., Ainscough R., Anderson K., Baynes C., Berks M.,
RA Bonfield J., Burton J., Connell M., Copsey T., Cooper J., Coulson A.,
RA Craxton M., Dear S., Du Z., Durbin R., Favello A., Fraser A.,
RA Fulton L., Gardner A., Green P., Hawkins T., Hillier L., Jier M.,
RA Johnston L., Jones M., Kershaw J., Kirsten J., Laisster N.,
RA Latreille P., Lightning J., Lloyd C., Mortimore B., O'Callaghan M.,
RA Parsons J., Percy C., Rifken L., Roopra A., Saunders D., Shownkeen R.,
RA Sims M., Smaldon N., Smith A., Smith M., Sonnhammer E., Staden R.,
RA Sulston J., Thierry-Mieg J., Thomas K., Vaudin M., Vaughan K.,
RA Waterson R., Watson A., Weinstock L., Wilkinson-Sproat J.,
RA Wohldman P.;
RT "2.2 Mb of contiguous nucleotide sequence from chromosome III of C.
RT elegans.";
RL Nature 368:32-38(1994).
RN [3]
RP DELETION OF 1174-1295.
RX MEDLINE=91351288; PubMed=1881436;
RA Mango S.E., Maine E.M., Kimble J.;
RT "Carboxy-terminal truncation activates glp-1 protein to specify
RT vulval fates in Caenorhabditis elegans.";
RL Nature 352:811-815(1991).
RN [4]
RP CHARACTERIZATION OF FUNCTION OF THE ANK-REPEATS.
RX MEDLINE=93354444; PubMed=8350921;
RA Roehl H., Kimble J.;
RT "Control of cell fate in C. elegans by a GLP-1 peptide consisting
RT primarily of ankyrin repeats.";
RL Nature 364:632-635(1993).
RN [5]
RP FUNCTION.
RX MEDLINE=94208066; PubMed=8156602;
RA Mello C.C., Draper B.W., Priess J.R.;
RT "The maternal genes apx-1 and glp-1 and establishment of
RT dorsal-ventral polarity in the early C. elegans embryo.";
RL Cell 77:95-106(1994).
CC -!- FUNCTION: INVOLVED IN THE SPECIFICATION OF THE CELL FATES OF THE
CC BLASTOMERES, ABA AND ABP. PROPER SIGNALING BY GLP-1 INDUCES ABA
CC DESCENDANTS TO PRODUCE ANTERIOR PHARYNGEAL CELLS, AND ABP
CC DESCENDANTS TO ADOPT A DIFFERENT FATE. CONTRIBUTES TO THE
CC ESTABLISHMENT OF THE DORSAL-VENTRAL AXIS IN EARLY EMBRYOS.
CC -!- SUBCELLULAR LOCATION: TYPE I MEMBRANE PROTEIN.
CC -!- DEVELOPMENTAL STAGE: ACTS ON ABP DEVELOPMENT DURING 4-CELL AND
CC 12-CELL STAGES, AND ON ABA DEVELOPMENT DURING 12-CELL AND 28-CELL
CC STAGES.
CC -!- SIMILARITY: HIGH, TO C. ELEGANS LIN-12.
CC -!- SIMILARITY: CONTAINS 10 EGF-LIKE DOMAINS.
CC -!- SIMILARITY: CONTAINS 3 LIN/NOTCH REPEATS.
CC -!- SIMILARITY: CONTAINS 6 ANK REPEATS.
CC
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC
DR EMBL; M25580; AAA28058.1; -.
DR EMBL; Z19555; CAA79620.1; -.
DR EMBL; Z29116; CAA79620.1; JOINED.
DR EMBL; Z29116; CAA82373.1; -.
DR EMBL; Z19555; CAA82373.1; JOINED.
DR PIR; A32901; A32901.
DR HSSP; P00740; 1IXA.
DR WORMPEP; F02A9.6; CE00237.
DR INTERPRO; IPR000152; -.
DR INTERPRO; IPR000561; -.
DR INTERPRO; IPR000800; -.
DR INTERPRO; IPR001881; -.
DR INTERPRO; IPR002110; -.
DR PFAM; PF00008; EGF; 10.
DR PFAM; PF00023; ank; 4.
DR PFAM; PF00066; notch; 3.
DR PROSITE; PS50088; ANK_REPEAT; 3.
DR PROSITE; PS50297; ANK_REP_REGION; 1.
DR PROSITE; PS00010; ASX_HYDROXYL; 2.



DR PROSITE; PS00022; EGF_1; 10.
DR PROSITE; PS01186; EGF_2; 8.
DR PROSITE; PS01187; EGF_CA; 1.
KW Differentiation; Repeat; ANK repeat; EGF-like domain; Tra
KW Glycoprotein; Signal.




FT   SIGNAL        1     15       POTENTIAL.
FT   CHAIN        16   1295       GLP-1 PROTEIN.
FT   DOMAIN       16    764       EXTRACELLULAR (POTENTIAL).
FT   TRANSMEM    765    786       POTENTIAL.
FT   DOMAIN      787   1295       CYTOPLASMIC (POTENTIAL).
FT   DOMAIN      493    607       3 X LIN/NOTCH REPEATS.
FT   DOMAIN      988   1133       6 X ANK MOTIF REPEATS.
FT   DOMAIN       19     58       EGF-LIKE 1.
FT   DOMAIN      117    152       EGF-LIKE 2.
FT   DOMAIN      154    190       EGF-LIKE 3.
FT   DOMAIN      190    230       EGF-LIKE 4.
FT   DOMAIN      232    269       EGF-LIKE 5, CALCIUM-BINDING
FT   DOMAIN      271    308       EGF-LIKE 6.
FT   DOMAIN      316    359       EGF-LIKE 7.
FT   DOMAIN      369    406       EGF-LIKE 8.
FT   DOMAIN      407    443       EGF-LIKE 9.
FT   DOMAIN      446    479       EGF-LIKE 10.
FT   REPEAT      493    527       LIN/NOTCH 1.
FT   REPEAT      528    568       LIN/NOTCH 2.
FT   REPEAT      569    608       LIN/NOTCH 3.
FT   REPEAT      915    946       ANK MOTIF 1.
FT   REPEAT      947    987       ANK MOTIF 2.
FT   REPEAT      988   1019       ANK MOTIF 3.
FT   REPEAT     1020   1056       ANK MOTIF 4.
FT   REPEAT     1057   1098       ANK MOTIF 5.
FT   REPEAT     1099   1133       ANK MOTIF 6.
FT   DISULFID     23     35       BY SIMILARITY.
FT   DISULFID     29     46       BY SIMILARITY.
FT   DISULFID     48     57       BY SIMILARITY.
FT   DISULFID    121    131       BY SIMILARITY.
FT   DISULFID    126    140       BY SIMILARITY.
FT   DISULFID    142    151       BY SIMILARITY.
FT   DISULFID    158    169       BY SIMILARITY.
FT   DISULFID    163    178       BY SIMILARITY.
FT   DISULFID    180    189       BY SIMILARITY.



FT   DISULFID    194    206       BY SIMILARITY.
FT   DISULFID    201    218       BY SIMILARITY.
FT   DISULFID    220    229       BY SIMILARITY.
FT   DISULFID    236    248       BY SIMILARITY.
FT   DISULFID                     BY SIMILARITY.
FT   DISULFID    259    268       BY SIMILARITY.
FT   DISULFID    275    286       BY SIMILARITY.
FT   DISULFID    280    296       BY SIMILARITY.
FT   DISULFID    298    307       BY SIMILARITY.
FT   DISULFID    373    384       BY SIMILARITY.
FT   DISULFID    378    394       BY SIMILARITY.
FT   DISULFID    396    405       BY SIMILARITY.
FT   DISULFID    411    422       BY SIMILARITY.
FT   DISULFID    416    431       BY SIMILARITY.
FT   DISULFID    433    442       BY SIMILARITY.
FT   DISULFID    450    461       BY SIMILARITY.
FT   DISULFID    455    467       BY SIMILARITY.
FT   DISULFID    469    478       BY SIMILARITY.
FT   CARBOHYD    244    244       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    245    245       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    333    333       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    381    381       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    609    609       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    675    675       N-LINKED (GLCNAC...) (POTENTIAL).
SQ   SEQUENCE   1295 AA;  144078 MW;  422AAD0A2DEEF3B4 CRC64;



Db 521 FDGGDCSGGQ- -RPF- -SKCQYPARCAD 544 



Search completed: January 22, 2001, 12:24:17 
Job time: 958 sec



Query Match 6.5%; Score 543; DB 1; Length 1295; 

Best Local Similarity 25.6%; Pred. No. 2.4e-26; 

Matches 161; Conservative 75; Mismatches 202; Indels 190; Gaps 33; 

Qy 914 KC NP-CLSNPCKNDGTCNSDPVDFYRCTCPYGFKGQDCDVP — IHACISNP 961 

II II I: I::: : I : : I |: I II I :: I |:| 
Db 70 KCIYDVYGENPTCICQDCEDE — TPPTERTQKGCEEGYGGPDCKTPLFSGVNPCDSDP 125 
I 

Qy 962 CKHGGTCHLKEGEEDGFWCICADGFEGENCEVNVDDCEDNDCENNSTCVDGINNYTCLCP 1021 

I : I I: P II III :|: I II :| I |:| ||||: : II I I 
Db 126 C-NNGLCYPFYG- • -GFQCICNNGYGGSYCEEGIDHCAQNECAEGSTCVNSVYNYYCDCP 181 

Qy 1022 PEYTGELCEEKLDFCAQDLNPCQHDSKCILT— PKGFKCDCTPGYVGEHCDIDFDDCQ- 1077 

:l II II I I I :M I I: I II II h I ::| 
Db 182 IGKSGRYCER--TECALMGNICNH-GRCIPNRDEDKNFRCVCDSGYEGEFCNKDKNECLI 238 

Qy 1078 DNKCKNGAHCTDAVNGYTCICPEGYSGLFCEFSPPMVLPRTSPCDNFDCQNGAQCIVRIN 1137 
: I I : I : :|| I ||:| :|| : | | :: ||| | j 

Db 239 EETCVNNSTCFNLHGDFTCTCKPGYAGKYCEEAIDM CKDYVCQNDGYCAHDSN 291
Qy 1138 E-PICQCLPGYQGEKCEKLVSVNFINKESYLQIPSAKVRPQTNITLQIATDEDSGILLYK 1196

: III I I: ::l :: I :: I 
Db 292 QMPICYCEQGFTGQRCE IECPSGFGGIHCDLPLQ 325 

Qy 1197 GDKDHIAVELYRGRVRASYDTGSHPASAIYSVETINDG-— NFHIVELLALDQSLSLSV 1252 

III: III 1:1 
Db 326 RPHCSRSNGT CYNDGRClNGFCVCE 350 

Qy 1253 DGGNPKIITNLSKQSTLNFDSPLYVGGMPGKSNVASLRQAPGQNGTSFHGCIRNLYINSE 1312 

I 1:1 II 
Db 351 PDYIGDR CEIN 361 

Qy 1313 LQDFQKVPMQTGILPGCEPCHKKVCA-HGTCQPSSQAGFTCECQEGWMGPLCDQRTNDPC 1371

:||: I ': I I : II :|::| I |: I |:| I 

Db 362 RKDFK FPDIQSCKYNPCVNNATCIDLKNSGYSCHCPLGFYGLNCEQHL--LC 411 

Qy 1372 LGNKCVH-GTCLPINAFSYSCKCLEGHGGVLCDEEEDLFNPCQAIKCKH-GKCRLSGLGQ 1429 

I : III :| Mill::: I l|: I |: :| 
Db 412 TPTTCANGGTCEGVNGV- IRCNCPNGFSGDYCEIKDRQL- -CSRHPCKNGGVCKNTG- - - 465 

Qy 1430 PYCECSSG YTGDSCDRE ISCRGER - - • IRDYYQKQQGYAACQ - - TTKKVSRLECR 1479 

Nil MM :|: : : III :::: I : : |( 
Db 466 -YCECQYGYTGPTCEEVLVIEKSKETVIRDLCEQRK — CMDLASNGICNPECEEECN 520 

Qy 1480 - • -GGCAGGQCCGPLRSKRRKYSFECTD 1504 I 

I hill I : :| I I 
GenCore version 4.5
Copyright (c) 1993 - 2000 Compugen Ltd.



OM protein - protein search, using sw model

Run on: January 22, 2001, 12:19:59 ; Search time 559.8 

(without alignments) 
319.251 Million cell updates/sec 

Title:          US-09-540-245A-2
Perfect score:  8316
Sequence:       1 MRGVGWQMLSLSLGLVLAIL SSFVDEVEKWKCGCTRCVS 1525

Scoring table:  BLOSUM62
                Gapop 10.0 , Gapext 0.5

Searched:       374700 seqs, 117207915 residues

Total number of hits satisfying chosen parameters:

Minimum DB seq length: 0
Maximum DB seq length: 2000000000 

Post-processing: Minimum Match 0%
Maximum Match 100% 
Listing first 45 summaries 



Database : 



SPTREMBL_15:*

sp_archea:*

sp_bacteria:*

sp_fungi:*

sp_human:*

sp_invertebrate:*

sp_mammal:*

sp_mhc:*

sp_organelle:*

sp_phage:*
sp_plant:*
sp_rodent:*
sp_virus:*
sp_vertebrate:*
sp_unclassified:*



Pred. No. is the number of results predicted by chance to have a
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
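
The throughput figure in the run header can be related to the query length and the database size. The sketch below is a hedged illustration in Python, assuming that one "cell update" is one query-residue by database-residue comparison, so that total cell updates equal query length times database residues. That definition is an assumption (it is the usual one for Smith-Waterman benchmarks), and the function names are hypothetical. Because the search time printed in this copy may be incomplete, the sketch derives the time implied by the printed rate rather than the other way round.

```python
def total_cell_updates(query_length: int, db_residues: int) -> int:
    """Number of dynamic-programming cells filled for one query.

    Assumption: every query residue is compared with every database residue.
    """
    return query_length * db_residues


def implied_search_seconds(query_length: int, db_residues: int,
                           million_updates_per_sec: float) -> float:
    """Search time implied by a printed 'Million cell updates/sec' rate."""
    cells = total_cell_updates(query_length, db_residues)
    return cells / (million_updates_per_sec * 1e6)


if __name__ == "__main__":
    # Header values from this run: 1525-residue query,
    # 117207915 database residues, 319.251 million cell updates/sec.
    seconds = implied_search_seconds(1525, 117207915, 319.251)
    print(f"{seconds:.2f} seconds")
```

Under this assumption the implied time lies within a fraction of a second of the search time printed in the header, which supports reading the rate as query length times database residues per second of search time.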

SUMMARIES 



RT regulator of sensory axon elongation and branching.";
RL Cell 96:771-784(1999).
RN [2]
RP SEQUENCE FROM N.A.
RC TISSUE=BRAIN;
RX MEDLINE=99200391; PubMed=10102268;
RA Brose K., Bland K.S., Wang K.H., Arnott D., Henzel W., Goodman C.S.,
RA Tessier-Lavigne M., Kidd T.;
RT "Slit proteins bind Robo receptors and have an evolutionarily
RT conserved role in repulsive axon guidance.";
RL Cell 96:795-806(1999).
DR EMBL; AF133270; AAD25539.1; -.
DR HSSP; P00743; 1CCF.
DR INTERPRO; IPR000152; -.
DR INTERPRO; IPR000359; -.
DR INTERPRO; IPR000372; -.
DR INTERPRO; IPR000483; -.
DR INTERPRO; IPR000561; -.
DR INTERPRO; IPR000742; -.
DR INTERPRO; IPR001611; -.
DR INTERPRO; IPR001791; -.
DR INTERPRO; IPR001881; -.

Result          Query
 No.     Score  Match  Length  DB  ID        Description
   1     8316  100.0    1525   4  Q9Y5Q7    Q9y5q7 homo sapien
   2     8265   99.4    1529   4  O94813    O94813 homo sapien
   3     8257   99.3    1521   4  O95710    O95710 homo sapien
   4     8095   97.3    1521  11  Q9R1B9    Q9r1b9 mus musculu
   5     5728   68.9    1523  11  O88280    O88280 rattus norv
   6   5714.5   68.7    1523   4  O75094    O75094 homo sapien
   7     5703   68.6    1523  11  Q9WVB4    Q9wvb4 mus musculu
   8     5597   67.3    1534   4  O75093    O75093 homo sapien
   9   5578.5   67.1    1531  11  O88279    O88279 rattus norv
  10   5538.5   66.6    1531  11  Q9WVB5    Q9wvb5 mus musculu
  11     5530   66.5    1025  11  Q9Z166    Q9z166 mus musculu
  12     5497   66.1    1530  11  Q9WUG5    Q9wug5 rattus norv
  13     3934   47.3     796  11  Q9WVC1    Q9wvc1 rattus norv
  14     3588   43.1    1504   5  Q9V7F9    Q9v7f9 drosophila
  15     3586   43.1    1504   5  Q9XYV4    Q9xyv4 drosophila
  16     3517   42.3    1480   5  Q9V7P8    Q9v7f8 drosophila
  17     3071   36.9     850   4  O95804    O95804 homo sapien
  18   2703.5   32.5    1440   5  Q20204    Q20204 caenorhabdi
  19   2282.5   27.4     664   4  Q9UIL7    Q9uil7 homo sapien



20 


1318 


15.8 


333 


4 


Q9UFH5 


Q9ufh5 homo sapien 


21 


1115.5 


13.4 


530 


5 


Q24526 


Q24526 drosophila 


22 


818.5 


9.8 


2704 


5 


097458 


097458 drosophila 


23 


818 


9,8 


2634 


5 


Q9W4T8 


Q9w4t8 drosophila 


24 


790 


9. -5 


2653 


5 


Q25253 


Q25253 lucilia cup 


25 


768 


9,2 


1218 


11 


Q9QXX0 


Q9qxx0 mus musculu 


26 


768 


9.2 , 


1219 


11 


Q63722 


Q63722 rattus norv 


27 


766 


9.2 


1218 


4 


Q15816 


Q15816 homo sapien 


28 


766 


9.2 


1218 


4 


014902 


014902 homo sapien 


29 


766 


9.2 


1227 


4 


P78504 


P78504 homo sapien 


30 


761 


9.2 


1218 


4 


015122 


015122 homo sapien 


31 


757 


9.1 


2352 


5 


061240 


O61240 halocynthia 


32 


748.5 


9.0 


2447 


13 


013149 


013149 fugu rubrip 




736.5 


8. .9 


1193 


13 


An/101 n 


Q90819 gallus gall 


34 


736 


8.9 


1203 


11 


Q06008 


Q06008 mus musculu 








2470 


11 


035516 


035516 mus musculu 


36 


732 • 


n'n 

. 8.8 


2471 


11 


Q9QW30 


Q9qw30 rattus sp. 


37 


731 


' 8. B 


2531 


5 


016004 


016004 lytechinus 


38 


719.5 


8.7 


2281 


4 


Q9UPL3 


Q9upl3 homo sapien 


39 


719.5 


'8,7 


2321 


4 


Q9Y6L8 


Q9y618 homo sapien 


40 


719.5 


8,7 


2321 


4 


Q9UM47 


Q9um47 homo sapien 


41 


713.5 


8,6 


1212 


13 


042347 


042347 gallus gall 


42 


711.5 


8.6 


2319 


11 


Q9R172 


Q9rl72 rattus norv 


43 


697 


8.4 


1254 


13 


Q9YHU2 


Q9yhu2 brachydanio 


44 


691.5 


8. '3 


1964 


11 


035442 


035442 mus musculu 


45 


682.5 


8,2 


1238 


4 


Q9Y219 


Q9y219 homo sapien 



RESULT 1 
Q9Y5Q7 

ID Q9Y5Q7 PRELIMINARY; PRT; 1525 AA. 
AC Q9Y5Q7; 

DT 01-NOV-1999 (TrSMBLrel. 12, Created) 
DT 01-NOV-1999 (IrEMBLrel. 12, Last sequence update) 
01-OCT-2000 (IrEMBLrel. 15, Last annotation update) 



SLIT2. 
SLIL2. 
Homo sapiens (Human). 

Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 
Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 
NCBI.TaxID-9606; 
[1] 

SEQUENCE FROM N.A. 
TISSUE-BRAIN; . 

MEDLINE-99200389; PubMed-10102266; 

Wang K.H., Brose K., Arnott D., Kidd T., Goodman C.S., Henzel w., 
Tessier-Lavigne M. ; 

"Biochemical purification of a mammalian slit protein as a positive 
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DR INTERPRO; IPR002272; -. 

DR PFAM; PF00008; EGF; 9.
DR PFAM; PF00054; laminin_G; 1.
DR PFAM; PF00560; LRR; 18.
DR PFAM; PF01462; LRRNT; 4.
DR PFAM; PF01463; LRRCT; 4.
DR PRINTS; PR01143; FSHRECEPTOR.
DR PROSITE; PS00010; ASX_HYDROXYL; UNKNOWNJ.
DR PROSITE; PS00022; EGF_1; UNKNOWNJ.
DR PROSITE; PS01185; CTCK_1; UNKNOWN_1.
DR PROSITE; PS01186; EGF_2; 7.
DR PROSITE; PS01187; EGF_CA; 2.
DR PROSITE; PS01225; CTCK_2; 1.
KW Glycoprotein; EGF-like domain.
SQ SEQUENCE 1525 AA; 169394 MW; 8A81CDE34EF06A73 CRC64;

Query Match 100.0%; Score 8316; DB 4; Length 1525;
Best Local Similarity 100.0%; Pred. No. 0;
Matches 1525; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy 1 MRGVGWQMLSLSLGLVLAILNKVAPQACPAQCSCSGSTVDCHGLALRSVPRNIPRNTERL 60
Db 1 MRGVGWQMLSLSLGLVLAILNKVAPQACPAQCSCSGSTVDCHGLALRSVPRNIPRNTERL 60

Qy 61 DLNGNNITRITKTDFAGLRHLRVLQLMENKISTIERGAFQDLKELERLRLNRNHLQLFPE 120
Db 61 DLNGNNITRITKTDFAGLRHLRVLQLMENKISTIERGAFQDLKELERLRLNRNHLQLFPE 120

Qy 121 LLFLGTAKLYRLDLSENQIQAIPRKAFRGAVDIKNLQLDYNQISCIEDGAFRALRDLEVL 180
Db 121 LLFLGTAKLYRLDLSENQIQAIPRKAFRGAVDIKNLQLDYNQISCIEDGAFRALRDLEVL 180

Qy 181 TLNNNNITRLSVASFNHMPKLRTFRLHSNNLYCDCHLAWLSDWLRKRPRVGLYTQCMGPS 240
Db 181 TLNNNNITRLSVASFNHMPKLRTFRLHSNNLYCDCHLAWLSDWLRKRPRVGLYTQCMGPS 240

Qy 241 HLRGHNVAEVQKREFVCSDEEEGHQSFMAPSCSVLHCPAACTCSNNIVDCRGKGLTEIPT 300
Db 241 HLRGHNVAEVQKREFVCSDEEEGHQSFMAPSCSVLHCPAACTCSNNIVDCRGKGLTEIPT 300

Qy 301 NLPETITEIRLEQNTIKVIPPGAFSPYKKLRRIDLSNNQISELAPDAFQGLRSLNSLVLY 360
Db 301 NLPETITEIRLEQNTIKVIPPGAFSPYKKLRRIDLSNNQISELAPDAFQGLRSLNSLVLY 360

Qy 361 GNKITELPKSLFEGLFSLQLLLLNANKINCLRVDAFQDLHNLNLLSLYDNKLQTIAKGTF 420
Db 361 GNKITELPKSLFEGLFSLQLLLLNANKINCLRVDAFQDLHNLNLLSLYDNKLQTIAKGTF 420

Qy 421 SPLRAIQTMHLAQNPFICDCHLKWLADYLHTNPIETSGARCTSPRRLANKRIGQIKSKKF 480
Db 421 SPLRAIQTMHLAQNPFICDCHLKWLADYLHTNPIETSGARCTSPRRLANKRIGQIKSKKF 480

Qy 481 RCSGTEDYRSKLSGDCFADLACPEKCRCEGTTVDCSNQKLNKIPEHIPQYTAELRLNNNE 540
Db 481 RCSGTEDYRSKLSGDCFADLACPEKCRCEGTTVDCSNQKLNKIPEHIPQYTAELRLNNNE 540

Qy 541 FTVLEATGIFKKLPQLRKINFSNNKITDIEEGAFEGASGVNEILLTSNRLENVQHKMFKG 600
Db 541 FTVLEATGIFKKLPQLRKINFSNNKITDIEEGAFEGASGVNEILLTSNRLENVQHKMFKG 600

Qy 601 LESLKTLMLRSNRITCVGNDSFIGLSSVRLLSLYDNQITTVAPGAFDTLHSLSTLNLLAN 660
Db 601 LESLKTLMLRSNRITCVGNDSFIGLSSVRLLSLYDNQITTVAPGAFDTLHSLSTLNLLAN 660

Qy 661 PFNCNCYLAWLGEWLRKKRIVTGNPRCQKPYFLKEIPIQDVAIQDFTCDDGNDDNSCSPL 720
Db 661 PFNCNCYLAWLGEWLRKKRIVTGNPRCQKPYFLKEIPIQDVAIQDFTCDDGNDDNSCSPL 720

Qy 721 SRCPTECTCLDTVVRCSNKGLKVLPKGIPRDVTELYLDGNQFTLVPKELSNYKHLTLIDL 780
Db 721 SRCPTECTCLDTVVRCSNKGLKVLPKGIPRDVTELYLDGNQFTLVPKELSNYKHLTLIDL 780

Qy 781 SNNRISTLSNQSFSNMTQLLTLILSYNRLRCIPPRTFDGLKSLRLLSLHGNDISVVPEGA 840
Db 781 SNNRISTLSNQSFSNMTQLLTLILSYNRLRCIPPRTFDGLKSLRLLSLHGNDISVVPEGA 840

Qy 841 FNDLSALSHLAIGANPLYCDCNMQWLSDWVKSEYKEPGIARCAGPGEMADKLLLTTPSKK 900
Db 841 FNDLSALSHLAIGANPLYCDCNMQWLSDWVKSEYKEPGIARCAGPGEMADKLLLTTPSKK 900

Qy 901 FTCQGPVDVNILAKCNPCLSNPCKNDGTCNSDPVDFYRCTCPYGFKGQDCDVPIHACISN 960
Db 901 FTCQGPVDVNILAKCNPCLSNPCKNDGTCNSDPVDFYRCTCPYGFKGQDCDVPIHACISN 960

Qy 961 PCKHGGTCHLKEGEEDGFWCICADGFEGENCEVNVDDCEDNDCENNSTCVDGINNYTCLC 1020
Db 961 PCKHGGTCHLKEGEEDGFWCICADGFEGENCEVNVDDCEDNDCENNSTCVDGINNYTCLC 1020

Qy 1021 PPEYTGELCEEKLDFCAQDLNPCQHDSKCILTPKGFKCDCTPGYVGEHCDIDFDDCQDNK 1080
Db 1021 PPEYTGELCEEKLDFCAQDLNPCQHDSKCILTPKGFKCDCTPGYVGEHCDIDFDDCQDNK 1080

Qy 1081 CKNGAHCTDAVNGYTCICPEGYSGLFCEFSPPMVLPRTSPCDNFDCQNGAQCIVRINEPI 1140
Db 1081 CKNGAHCTDAVNGYTCICPEGYSGLFCEFSPPMVLPRTSPCDNFDCQNGAQCIVRINEPI 1140

Qy 1141 CQCLPGYQGEKCEKLVSVNFINKESYLQIPSAKVRPQTNITLQIATDEDSGILLYKGDKD 1200
Db 1141 CQCLPGYQGEKCEKLVSVNFINKESYLQIPSAKVRPQTNITLQIATDEDSGILLYKGDKD 1200

Qy 1201 HIAVELYRGRVRASYDTGSHPASAIYSVETINDGNFHIVELLALDQSLSLSVDGGNPKII 1260
Db 1201 HIAVELYRGRVRASYDTGSHPASAIYSVETINDGNFHIVELLALDQSLSLSVDGGNPKII 1260

Qy 1261 TNLSKQSTLNFDSPLYVGGMPGKSNVASLRQAPGQNGTSFHGCIRNLYINSELQDFQKVP 1320
Db 1261 TNLSKQSTLNFDSPLYVGGMPGKSNVASLRQAPGQNGTSFHGCIRNLYINSELQDFQKVP 1320

Qy 1321 MQTGILPGCEPCHKKVCAHGTCQPSSQAGFTCECQEGWMGPLCDQRTNDPCLGNKCVHGT 1380
Db 1321 MQTGILPGCEPCHKKVCAHGTCQPSSQAGFTCECQEGWMGPLCDQRTNDPCLGNKCVHGT 1380

Qy 1381 CLPINAFSYSCKCLEGHGGVLCDEEEDLFNPCQAIKCKHGKCRLSGLGQPYCECSSGYTG 1440
Db 1381 CLPINAFSYSCKCLEGHGGVLCDEEEDLFNPCQAIKCKHGKCRLSGLGQPYCECSSGYTG 1440

Qy 1441 DSCDREISCRGERIRDYYQKQQGYAACQTTKKVSRLECRGGCAGGQCCGPLRSKRRKYSF 1500
Db 1441 DSCDREISCRGERIRDYYQKQQGYAACQTTKKVSRLECRGGCAGGQCCGPLRSKRRKYSF 1500

Qy 1501 ECTDGSSFVDEVEKVVKCGCTRCVS 1525
Db 1501 ECTDGSSFVDEVEKVVKCGCTRCVS 1525

RESULT 2 
O94813

ID O94813 PRELIMINARY; PRT; 1529 AA.
AC O94813;
DT 01-MAY-1999 (TrEMBLrel. 10, Created)
DT 01-MAY-1999 (TrEMBLrel. 10, Last sequence update)
DT 01-OCT-2000 (TrEMBLrel. 15, Last annotation update)
DE SLIT-2 PROTEIN.
GN SLIT-2.
OS Homo sapiens (Human).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
OX NCBI_TaxID=9606;
RN [1]
RP SEQUENCE FROM N.A.
RX MEDLINE=99033071; PubMed=9813312;
RA Itoh A., Miyabayashi T., Ohno M., Sakano S.;
RT "Cloning and expressions of three mammalian homologues of Drosophila
RT slit suggest possible roles for Slit in the formation and maintenance
RT of the nervous system.";
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RL Brain Res. Mol. Brain Res. 62:175-186(1998). 

DR EMBL; AB017168; BAA35185.1; -.
DR HSSP; P00743; 1CCF.
DR INTERPRO; IPR000152; -.
DR INTERPRO; IPR000359; -.
DR INTERPRO; IPR000372; -.
DR INTERPRO; IPR000483; -.
DR INTERPRO; IPR000561; -.
DR INTERPRO; IPR000742; -.
DR INTERPRO; IPR001611; -.
DR INTERPRO; IPR001791; -.
DR INTERPRO; IPR001881; -.
DR INTERPRO; IPR002272; -.
DR PFAM; PF00008; EGF; 9.
DR PFAM; PF00054; laminin_G; 1.
DR PFAM; PF00560; LRR; 18.
DR PFAM; PF01462; LRRNT; 4.
DR PFAM; PF01463; LRRCT; 4.
DR PRINTS; PR01143; FSHRECEPTOR.
DR PROSITE; PS00010; ASX_HYDROXYL; UNKNOWN_2.
DR PROSITE; PS00022; EGF_1; UNKNOWN J.
DR PROSITE; PS01185; CTCK_1; UNKNOWN_1.
DR PROSITE; PS01186; EGF_2; 7.
DR PROSITE; PS01187; EGF_CA; 2.
DR PROSITE; PS01225; CTCK_2; 1.
KW Glycoprotein; EGF-like domain.
SQ SEQUENCE 1529 AA; 169868 MW; 5D19CC5E7FD461BA CRC64;

Query Match 99.4%; Score 8265; DB 4; Length 1529;
Best Local Similarity 99.2%; Pred. No. 0;
Matches 1520; Conservative 1; Mismatches 0; Indels 12; Gaps 2;
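
The percentages in these statistics lines can be reproduced from the other reported figures. The sketch below assumes that Query Match is the result's score as a percentage of the query's self-alignment score (8316, the score of Result 1), and that Best Local Similarity is the number of matches divided by the total of matches, conservative substitutions, mismatches and indels. Both definitions are inferred from the numbers in this report, not taken from the program's documentation, and the function names are made up for the example.

```python
def query_match_percent(score, query_self_score):
    """Score as a percentage of the query's score against itself (assumed definition)."""
    return 100.0 * score / query_self_score


def best_local_similarity_percent(matches, conservative, mismatches, indels):
    """Matches as a percentage of all alignment columns (assumed definition)."""
    columns = matches + conservative + mismatches + indels
    return 100.0 * matches / columns


if __name__ == "__main__":
    # Figures reported for Result 2.
    print(round(query_match_percent(8265, 8316), 1))                 # 99.4
    print(round(best_local_similarity_percent(1520, 1, 0, 12), 1))   # 99.2
```

The same two formulas also give 99.3 and 99.3 for Result 3 (score 8257; 1515, 3, 3, 4) and 68.9 and 66.9 for Result 5 (score 5728; 1015, 223, 265, 14), matching the reported values.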



Qy 1 MRGVGWQMLSLSLGLVLAILNKVAPQACPAQCSCSGSTVDCHGLALRSVPRNIPRNTERL 60
Db 1 MRGVGWQMLSLSLGLVLAILNKVAPQACPAQCSCSGSTVDCHGLALRSVPRNIPRNTERL 60

Qy 61 DLNGNNITRITKTDFAGLRHLRVLQLMENKISTIERGAFQDLKELERLRLNRNHLQLFPE 120
Db 61 DLNGNNITRITKTDFAGLRHLRVLQLMENKISTIERGAFQDLKELERLRLNRNHLQLFPE 120

Qy 121 LLFLGTAKLYRLDLSENQIQAIPRKAFRGAVDIKNLQLDYNQISCIEDGAFRALRDLEVL 180
Db 121 LLFLGTAKLYRLDLSENQIQAIPRKAFRGAVDIKNLQLDYNQISCIEDGAFRALRDLEVL 180

Qy 181 TLNNNNITRLSVASFNHMPKLRTFRLHSNNLYCDCHLAWLSDWLRKRPRVGLYTQCMGPS 240
Db 181 TLNNNNITRLSVASFNHMPKLRTFRLHSNNLYCDCHLAWLSDWLRQRPRVGLYTQCMGPS 240

Qy 241 HLRGHNVAEVQKREFVCSDEEEGHQSFMAPSCSVLHCPAACTCSNNIVDCRGKGLTEIPT 300
Db 241 HLRGHNVAEVQKREFVCS----GHQSFMAPSCSVLHCPAACTCSNNIVDCRGKGLTEIPT 296

Qy 301 NLPETITEIRLEQNTIKVIPPGAFSPYKKLRRIDLSNNQISELAPDAFQGLRSLNSLVLY 360
Db 297 NLPETITEIRLEQNTIKVIPPGAFSPYKKLRRIDLSNNQISELAPDAFQGLRSLNSLVLY 356

Qy 361 GNKITELPKSLFEGLFSLQLLLLNANKINCLRVDAFQDLHNLNLLSLYDNKLQTIAKGTF 420
Db 357 GNKITELPKSLFEGLFSLQLLLLNANKINCLRVDAFQDLHNLNLLSLYDNKLQTIAKGTF 416

Qy 421 SPLRAIQTMHLAQNPFICDCHLKWLADYLHTNPIETSGARCTSPRRLANKRIGQIKSKKF 480
Db 417 SPLRAIQTMHLAQNPFICDCHLKWLADYLHTNPIETSGARCTSPRRLANKRIGQIKSKKF 476

Qy 481 RCS--------GTEDYRSKLSGDCFADLACPEKCRCEGTTVDCSNQKLNKIPEHIPQYTA 532
Db 477 RCSAKEQYFIPGTEDYRSKLSGDCFADLACPEKCRCEGTTVDCSNQKLNKIPEHIPQYTA 536

Qy 533 ELRLNNNEFTVLEATGIFKKLPQLRKINFSNNKITDIEEGAFEGASGVNEILLTSNRLEN 592
Db 537 ELRLNNNEFTVLEATGIFKKLPQLRKINFSNNKITDIEEGAFEGASGVNEILLTSNRLEN 596

Qy 593 VQHKMFKGLESLKTLMLRSNRITCVGNDSFIGLSSVRLLSLYDNQITTVAPGAFDTLHSL 652
Db 597 VQHKMFKGLESLKTLMLRSNRITCVGNDSFIGLSSVRLLSLYDNQITTVAPGAFDTLHSL 656

Qy 653 STLNLLANPFNCNCYLAWLGEWLRKKRIVTGNPRCQKPYFLKEIPIQDVAIQDFTCDDGN 712
Db 657 STLNLLANPFNCNCYLAWLGEWLRKKRIVTGNPRCQKPYFLKEIPIQDVAIQDFTCDDGN 716

Qy 713 DDNSCSPLSRCPTECTCLDTVVRCSNKGLKVLPKGIPRDVTELYLDGNQFTLVPKELSNY 772
Db 717 DDNSCSPLSRCPTECTCLDTVVRCSNKGLKVLPKGIPRDVTELYLDGNQFTLVPKELSNY 776

Qy 773 KHLTLIDLSNNRISTLSNQSFSNMTQLLTLILSYNRLRCIPPRTFDGLKSLRLLSLHGND 832
Db 777 KHLTLIDLSNNRISTLSNQSFSNMTQLLTLILSYNRLRCIPPRTFDGLKSLRLLSLHGND 836

Qy 833 ISVVPEGAFNDLSALSHLAIGANPLYCDCNMQWLSDWVKSEYKEPGIARCAGPGEMADKL 892
Db 837 ISVVPEGAFNDLSALSHLAIGANPLYCDCNMQWLSDWVKSEYKEPGIARCAGPGEMADKL 896

Qy 893 LLTTPSKKFTCQGPVDVNILAKCNPCLSNPCKNDGTCNSDPVDFYRCTCPYGFKGQDCDV 952
Db 897 LLTTPSKKFTCQGPVDVNILAKCNPCLSNPCKNDGTCNSDPVDFYRCTCPYGFKGQDCDV 956

Qy 953 PIHACISNPCKHGGTCHLKEGEEDGFWCICADGFEGENCEVNVDDCEDNDCENNSTCVDG 1012
Db 957 PIHACISNPCKHGGTCHLKEGEEDGFWCICADGFEGENCEVNVDDCEDNDCENNSTCVDG 1016

Qy 1013 INNYTCLCPPEYTGELCEEKLDFCAQDLNPCQHDSKCILTPKGFKCDCTPGYVGEHCDID 1072
Db 1017 INNYTCLCPPEYTGELCEEKLDFCAQDLNPCQHDSKCILTPKGFKCDCTPGYVGEHCDID 1076

Qy 1073 FDDCQDNKCKNGAHCTDAVNGYTCICPEGYSGLFCEFSPPMVLPRTSPCDNFDCQNGAQC 1132
Db 1077 FDDCQDNKCKNGAHCTDAVNGYTCICPEGYSGLFCEFSPPMVLPRTSPCDNFDCQNGAQC 1136

Qy 1133 IVRINEPICQCLPGYQGEKCEKLVSVNFINKESYLQIPSAKVRPQTNITLQIATDEDSGI 1192
Db 1137 IVRINEPICQCLPGYQGEKCEKLVSVNFINKESYLQIPSAKVRPQTNITLQIATDEDSGI 1196

Qy 1193 LLYKGDKDHIAVELYRGRVRASYDTGSHPASAIYSVETINDGNFHIVELLALDQSLSLSV 1252
Db 1197 LLYKGDKDHIAVELYRGRVRASYDTGSHPASAIYSVETINDGNFHIVELLALDQSLSLSV 1256

Qy 1253 DGGNPKIITNLSKQSTLNFDSPLYVGGMPGKSNVASLRQAPGQNGTSFHGCIRNLYINSE 1312
Db 1257 DGGNPKIITNLSKQSTLNFDSPLYVGGMPGKSNVASLRQAPGQNGTSFHGCIRNLYINSE 1316

Qy 1313 LQDFQKVPMQTGILPGCEPCHKKVCAHGTCQPSSQAGFTCECQEGWMGPLCDQRTNDPCL 1372
Db 1317 LQDFQKVPMQTGILPGCEPCHKKVCAHGTCQPSSQAGFTCECQEGWMGPLCDQRTNDPCL 1376

Qy 1373 GNKCVHGTCLPINAFSYSCKCLEGHGGVLCDEEEDLFNPCQAIKCKHGKCRLSGLGQPYC 1432
Db 1377 GNKCVHGTCLPINAFSYSCKCLEGHGGVLCDEEEDLFNPCQAIKCKHGKCRLSGLGQPYC 1436

Qy 1433 ECSSGYTGDSCDREISCRGERIRDYYQKQQGYAACQTTKKVSRLECRGGCAGGQCCGPLR 1492
Db 1437 ECSSGYTGDSCDREISCRGERIRDYYQKQQGYAACQTTKKVSRLECRGGCAGGQCCGPLR 1496

Qy 1493 SKRRKYSFECTDGSSFVDEVEKVVKCGCTRCVS 1525
Db 1497 SKRRKYSFECTDGSSFVDEVEKVVKCGCTRCVS 1529



RESULT 3 
O95710

ID O95710 PRELIMINARY; PRT; 1521 AA.
AC O95710;

DT 01-MAY-1999 (TrEMBLrel. 10, Created) 
DT 01-MAY-1999 (TrEMBLrel. 10, Last sequence update) 
DT 01-OCT-2000 (TrEMBLrel. 15, Last annotation update) 
DE NEUROGENIC EXTRACELLULAR SLIT PROTEIN SLIT2. 
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Homo sapiens (Human). 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
OX NCBI_TaxID=9606;
RN [1]
RP SEQUENCE FROM N.A.
RC TISSUE=KIDNEY, BRAIN;
RX MEDLINE=99279238; PubMed=10349621;
RA Holmes G.P., Negus K., Burridge L., Raman S., Algar E., Yamada T.,
RA Little M.H.;
RT "Distinct but overlapping expression patterns of two vertebrate slit
RT homologs implies functional roles in CNS development and
RT organogenesis.";
RL Mech. Dev. 79:57-72(1998).
DR EMBL; AF055585; AAD04309.1; -.
DR HSSP; P00743; 1CCF.
DR INTERPRO; IPR000152; -.
DR INTERPRO; IPR000359; -.
DR INTERPRO; IPR000372; -.
DR INTERPRO; IPR000483; -.
DR INTERPRO; IPR000561; -.
DR INTERPRO; IPR000742; -.
DR INTERPRO; IPR001611; -.
DR INTERPRO; IPR001791; -.
DR INTERPRO; IPR001881; -.
DR INTERPRO; IPR002272; -.
DR PFAM; PF00008; EGF; 9.
DR PFAM; PF00054; laminin_G; 1.
DR PFAM; PF00560; LRR; 18.
DR PFAM; PF01462; LRRNT; 4.
DR PFAM; PF01463; LRRCT; 4.
DR PRINTS; PR01143; FSHRECEPTOR.
DR PROSITE; PS00010; ASX_HYDROXYL; UNKNOWNJ.
DR PROSITE; PS00022; EGF_1; UNKNOWNj.
DR PROSITE; PS01185; CTCK_1; UNKNOWN_1.
DR PROSITE; PS01186; EGF_2; 7.
DR PROSITE; PS01187; EGF_CA; 2.
DR PROSITE; PS01225; CTCK_2; 1.
KW Glycoprotein; EGF-like domain.
SQ SEQUENCE 1521 AA; 168947 MW; C05A0DF7D78C48C9 CRC64;

Query Match 99.3%; Score 8257; DB 4; Length 1521;
Best Local Similarity 99.3%; Pred. No. 0;
Matches 1515; Conservative 3; Mismatches 3; Indels 4; Gaps 1;

Qy 1 MRGVGWQMLSLSLGLVLAILNKVAPQACPAQCSCSGSTVDCHGLALRSVPRNIPRNTERL 60
Db 1 MRGVGWQMLSLSLGLVLAILNKVAPQACPAQCSCSGSTVDCHGLALRSVPRNIPRNTERL 60

Qy 61 DLNGNNITRITKTDFAGLRHLRVLQLMENKISTIERGAFQDLKELERLRLNRNHLQLFPE 120
Db 61 DLNGNNITRITKTDFAGLRHLRVLQLMENKISTIERGAFQDLKELERLRLNRNHLQLFPE 120

Qy 121 LLFLGTAKLYRLDLSENQIQAIPRKAFRGAVDIKNLQLDYNQISCIEDGAFRALRDLEVL 180
Db 121 LLFLGTAKLYRLDLSENQIQAIPRKAFRGAVDIKNLQLDYNQISCIEDGAFRALRDLEVL 180

Qy 181 TLNNNNITRLSVASFNHMPKLRTFRLHSNNLYCDCHLAWLSDWLRKRPRVGLYTQCMGPS 240
Db 181 TLNNNNITRLSVASFNHMPKLRTFRLHSNNLYCDCHLAWLSDWLRQRPRVGLYTQCMGPS 240

Qy 241 HLRGHNVAEVQKREFVCSDEEEGHQSFMAPSCSVLHCPAACTCSNNIVDCRGKGLTEIPT 300
Db 241 HLRGHNVAEVQKREFVCS----GHQSFMAPSCSVLHCPAACTCSNNIVDCRGKGLTEIPT 296

Qy 301 NLPETITEIRLEQNTIKVIPPGAFSPYKKLRRIDLSNNQISELAPDAFQGLRSLNSLVLY 360
Db 297 NLPETITEIRLEQNTIKVIPPGAFSPYKKLRRIDLSNNQISELAPDAFQGLRSLNSLVLY 356

Qy 361 GNKITELPKSLFEGLFSLQLLLLNANKINCLRVDAFQDLHNLNLLSLYDNKLQTIAKGTF 420
Db 357 GNKITELPKSLFEGLFSLQLLLLNANKINCLRVDAFQDLHNLNLLSLYDNKLQTIAKGTF 416

Qy 421 SPLRAIQTMHLAQNPFICDCHLKWLADYLHTNPIETSGARCTSPRRLANKRIGQIKSKKF 480
Db 417 SPLRAIQTMHLAQNPFICDCHLKWLADYLHTNPIETSGARCTSPRRLANKRIGQIKSKKF 476

Qy 481 RCSGTEDYRSKLSGDCFADLACPEKCRCEGTTVDCSNQKLNKIPEHIPQYTAELRLNNNE 540
Db 477 RCSGTEDYRSKLSGDCFADLACPEKCRCEGTTVDCSNQKLNKIPEHIPQYTAELRLNNNE 536

Qy 541 FTVLEATGIFKKLPQLRKINFSNNKITDIEEGAFEGASGVNEILLTSNRLENVQHKMFKG 600
Db 537 FTVLEATGIFKKLPQLRKINFSNNKITDIEEGAFEGASGVNEILLTSNRLENVQHKMFKG 596

Qy 601 LESLKTLMLRSNRITCVGNDSFIGLSSVRLLSLYDNQITTVAPGAFDTLHSLSTLNLLAN 660
Db 597 LERPQNLMLRSNRITCVGNDSFIGLSSVRMLSLYDNQITTVAPGAFDTLHSLSTLNLLAN 656

Qy 661 PFNCNCYLAWLGEWLRKKRIVTGNPRCQKPYFLKEIPIQDVAIQDFTCDDGNDDNSCSPL 720
Db 657 PFNCNCYLAWLGEWLRKKRIVTGNPRCQKPYFLKEIPIQDVAIQDFTCDDGNDDNSCSPL 716

Qy 721 SRCPTECTCLDTVVRCSNKGLKVLPKGIPRDVTELYLDGNQFTLVPKELSNYKHLTLIDL 780
Db 717 SRCPTECTCLDTVVRCSNKGLKVLPKGIPRDVTELYLDGNQFTLVPKELSNYKHLTLIDL 776

Qy 781 SNNRISTLSNQSFSNMTQLLTLILSYNRLRCIPPRTFDGLKSLRLLSLHGNDISVVPEGA 840
Db 777 SNNRISTLSNQSFSNMTQLLTLILSYNRLRCIPPRTFDGLKSLRLLSLHGNDISVVPEGA 836

Qy 841 FNDLSALSHLAIGANPLYCDCNMQWLSDWVKSEYKEPGIARCAGPGEMADKLLLTTPSKK 900
Db 837 FNDLSALSHLAIGANPLYCDCNMQWLSDWVKSEYKEPGIARCAGPGEMADKLLLTTPSKK 896

Qy 901 FTCQGPVDVNILAKCNPCLSNPCKNDGTCNSDPVDFYRCTCPYGFKGQDCDVPIHACISN 960
Db 897 FTCQGPVDVNILAKCNPCLSNPCKNDGTCNSDPVDFYRCTCPYGFKGQDCDVPIHACISN 956

Qy 961 PCKHGGTCHLKEGEEDGFWCICADGFEGENCEVNVDDCEDNDCENNSTCVDGINNYTCLC 1020
Db 957 PCKHGGTCHLKEGEEDGFWCICADGFEGENCEVNVDDCEDNDCENNSTCVDGINNYTCLC 1016

Qy 1021 PPEYTGELCEEKLDFCAQDLNPCQHDSKCILTPKGFKCDCTPGYVGEHCDIDFDDCQDNK 1080
Db 1017 PPEYTGELCEEKLDFCAQDLNPCQHDSKCILTPKGFKCDCTPGYVGEHCDIDFDDCQDNK 1076

Qy 1081 CKNGAHCTDAVNGYTCICPEGYSGLFCEFSPPMVLPRTSPCDNFDCQNGAQCIVRINEPI 1140
Db 1077 CKNGAHCTDAVNGYTCICPEGYSGLFCEFSPPMVLPRTSPCDNFDCQNGAQCIVRINEPI 1136

Qy 1141 CQCLPGYQGEKCEKLVSVNFINKESYLQIPSAKVRPQTNITLQIATDEDSGILLYKGDKD 1200
Db 1137 CQCLPGYQGEKCEKLVSVNFINKESYLQIPSAKVRPQTNITLQIATDEDSGILLYKGDKD 1196

Qy 1201 HIAVELYRGRVRASYDTGSHPASAIYSVETINDGNFHIVELLALDQSLSLSVDGGNPKII 1260
Db 1197 HIAVELYRGRVRASYDTGSHPASAIYSVETINDGNFHIVELLALDQSLSLSVDGGNPKII 1256

Qy 1261 TNLSKQSTLNFDSPLYVGGMPGKSNVASLRQAPGQNGTSFHGCIRNLYINSELQDFQKVP 1320
Db 1257 TNLSKQSTLNFDSPLYVGGMPGKSNVASLRQAPGQNGTSFHGCIRNLYINSELQDFQKVP 1316

Qy 1321 MQTGILPGCEPCHKKVCAHGTCQPSSQAGFTCECQEGWMGPLCDQRTNDPCLGNKCVHGT 1380
Db 1317 MQTGILPGCEPCHKKVCAHGTCQPSSQAGFTCECQEGWMGPLCDQRTNDPCLGNKCVHGT 1376

Qy 1381 CLPINAFSYSCKCLEGHGGVLCDEEEDLFNPCQAIKCKHGKCRLSGLGQPYCECSSGYTG 1440
Db 1377 CLPINAFSYSCKCLEGHGGVLCDEEEDLFNPCQAIKCKHGKCRLSGLGQPYCECSSGYTG 1436

Qy 1441 DSCDREISCRGERIRDYYQKQQGYAACQTTKKVSRLECRGGCAGGQCCGPLRSKRRKYSF 1500
Db 1437 DSCDREISCRGERIRDYYQKQQGYAACQTTKKVSRLECRGGCAGGQCCGPLRSKRRKYSF 1496




Mon Jan 22 13:04:55 2001 us-09-540-245a-2.rspt Page 5 



Qy 1501 ECTDGSSFVDEVEKVVKCGCTRCVS 1525
Db 1497 ECTDGSSFVDEVEKVVKCGCTRCVS 1521



RESULT 4
Q9R1B9

ID Q9R1B9 PRELIMINARY; PRT; 1521 AA.
AC Q9R1B9;
DT 01-MAY-2000 (TrEMBLrel. 13, Created)
DT 01-MAY-2000 (TrEMBLrel. 13, Last sequence update)
DT 01-OCT-2000 (TrEMBLrel. 15, Last annotation update)
DE SLIT2.
GN SLIT2.
OS Mus musculus (Mouse).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
OX NCBI_TaxID=10090;
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=SWISS WEBSTER/ICR;
RA Yuan W., Zhou L., Chen J.-H., Wu J.Y., Rao Y., Ornitz D.M.;
RT "The mouse SLIT family: Secreted ligands for Robo expressed in
RT patterns that suggest a role in morphogenesis and axon guidance.";
RL Dev. Biol. 0:0-0(1999).
DR EMBL; AF144628; AAD44759.1; -.
DR HSSP; P00743; 1CCF.
DR MGD; MGI:1315205; Slit2.
DR INTERPRO; IPR000152; -.
DR INTERPRO; IPR000359; -.
DR INTERPRO; IPR000372; -.
DR INTERPRO; IPR000483; -.
DR INTERPRO; IPR000561; -.
DR INTERPRO; IPR000742; -.
DR INTERPRO; IPR001611; -.
DR INTERPRO; IPR001791; -.
DR INTERPRO; IPR001881; -.
DR INTERPRO; IPR002272; -.
DR PFAM; PF00008; EGF; 9.
DR PFAM; PF00054; laminin_G; 1.
DR PFAM; PF00560; LRR; 18.
DR PFAM; PF01462; LRRNT; 4.
DR PFAM; PF01463; LRRCT; 4.
DR PRINTS; PR00019; LEURICHRPT.
DR PRINTS; PR01143; FSHRECEPTOR.
DR PROSITE; PS00010; ASX_HYDROXYL; UNKNOWNJ.
DR PROSITE; PS00022; EGF_1; UNKNOWN_9.
DR PROSITE; PS01185; CTCK_1; UNKNOWN_1.
DR PROSITE; PS01186; EGF_2; 7.
DR PROSITE; PS01187; EGF_CA; 2.
SQ SEQUENCE 1521 AA; 168769 MW; 97DCA361578978E4 CRC64;

Query Match 97.3%; Score 8095; DB 11; Length 1521;
Best Local Similarity 96.5%; Pred. No. 0;

Matches [illegible]; Conservative 34; Mismatches 16; Indels 4;

[Alignment rows Qy 1-1320 / Db 1-1316: sequence text not legible in this copy.]

Qy 1321 MQTGILPGCEPCHKKVCAHGTCQPSSQAGFTCECQEGWMGPLCDQRTNDPCLGNKCVHGT 1380
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iimiiiiiimimi! iiiiihiiiiihiiiiiiimiimimiiim 

Db 1317 MQTGILPGCEPCHKKVCAHGMCQPSSQSGFTCBCEEGWMGPLCDQRTNDPCLGNKCVHGT 1376 

Qy 1381 CLPINAFSYSCKCLEGHGGVLCDEEEDLFNPCQAIKCKHGKCRLSGLGQPYCECSSGYTG 1440
Db 1377 CLPINAFSYSCKCLEGHGGVLCDEEEDLFNPCQMIKCKHGKCRLSGVGQPYCECNSGFTG 1436

Qy 1441 DSCDREISCRGERIRDYYQKQQGYAACQTTKKVSRLECRGGCAGGQCCGPLRSKRRKYSF 1500
Db 1437 DSCDREISCRGERIRDYYQKQQGYAACQTTKKVSRLECRGGCAGGQCCGPLRSKRRKYSF 1496

Qy 1501 ECTDGSSFVDEVEKVVKCGCTRCVS 1525
Db 1497 ECTDGSSFVDEVERVVRCGCARCAS 1521



RESULT 5
O88280

ID O88280 PRELIMINARY; PRT; 1523 AA.
AC O88280;
DT 01-NOV-1998 (TrEMBLrel. 08, Created)
DT 01-NOV-1998 (TrEMBLrel. 08, Last sequence update)
DT 01-OCT-2000 (TrEMBLrel. 15, Last annotation update)
DE MEGF5.
GN MEGF5.
OS Rattus norvegicus (Rat).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Rattus.
OX NCBI_TaxID=10116;
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=SPRAGUE-DAWLEY; TISSUE=BRAIN;
RX MEDLINE=98360089; PubMed=9693030;
RA Nakayama M., Nakajima D., Nagase T., Nomura N., Seki N., Ohara O.;
RT "Identification of high-molecular-weight proteins with multiple EGF-
RT like motifs by motif-trap screening.";
RL Genomics 51:27-34(1998).
DR EMBL; AB011531; BAA32461.1; -.
DR HSSP; P01132; 1EGF.
DR INTERPRO; IPR000152; -.
DR INTERPRO; IPR000359; -.
DR INTERPRO; IPR000372; -.
DR INTERPRO; IPR000483; -.
DR INTERPRO; IPR000561; -.
DR INTERPRO; IPR000742; -.
DR INTERPRO; IPR001611; -.
DR INTERPRO; IPR001791; -.
DR INTERPRO; IPR001881; -.
DR INTERPRO; IPR002400; -.
DR PFAM; PF00007; Cys_knot; 1.
DR PFAM; PF00008; EGF; 9.
DR PFAM; PF00054; laminin_G; 1.
DR PFAM; PF00560; LRR; 19.
DR PFAM; PF01462; LRRNT; 4.
DR PFAM; PF01463; LRRCT; 4.
DR PRINTS; PR00438; GFCYSKNOT.
DR PROSITE; PS00010; ASX_HYDROXYL; UNKNOWNJ.
DR PROSITE; PS00022; EGF_1; UNKNOWNJ.
DR PROSITE; PS01185; CTCK_1; UNKNOWN_1.
DR PROSITE; PS01186; EGF_2; 7.
DR PROSITE; PS01187; EGF_CA; 2.
DR PROSITE; PS01225; CTCK_2; 1.
KW Glycoprotein; EGF-like domain.
SQ SEQUENCE 1523 AA; 167766 MW; 6CE1B7AF9244478E CRC64;

Query Match 68.9%; Score 5728; DB 11; Length 1523;
Best Local Similarity 66.9%; Pred. No. 0;
Matches 1015; Conservative 223; Mismatches 265; Indels 14; Gaps 9;

Qy 11 LSLGLVLA-ILNKVAPQACPAQCSCSGSTVDCHGLALRSVPRNIPRNTERLDLNGNNITR 69 

1:1 I II II: III :|:|| ::|||||| Ihlll Mil MM: III 
Db 16 LALALALASILSGPPAAACPTKCTCSAASVDCHGLGLRAVPRGIPRNAERLDLDRNNITR 75 



Qy 70 ITKTDFAGLR'HLRVlQLMENKISTIERGAFQDLKELERLRLNRNHLQLFPELLFLGTAKL 129 

III II IM'MIII I :|::l 1 1 1 1 1 1 1 1 1 1 : 1 1 1 1 1 1 1 : 1 II: Mill I II 
Db 76 ITKMDFTGLKNLRVLHLEDNQVSVIERGAFQDLKQLERLRLNKNKLQVLPELLFQSTPKL 135 

Qy 130 YRLDISENQIQAIPRKAFRGAVDIKNLQLDYNQISCIEDGAFRALRDLEVLTLNNNNITR 189 

lllllllll! lllllll :|||||| I lllllllllllllll|:|||llll|:| 
Db 136 TRLDLSENQIQGIPRKAFRGVTGVKNLQLDNNHISCIEDGAFRALRDLEILTLNNNNISR 195 

Qy 190 .LSVASFNHMPKLRTFRLHSNNLYCDCHLAWLSDWLRKRPRVGLYTQCMGPSHLRGHNVAE 249 

: I lllllitMl llllhllllllllllllllhl :l :| II I Mil :||: 
Db 196 ILVTSFNHMPKIRTLRLHSNHLYCDCHLAWLSDWLRQRRTIGQFTLCMAPVHLRGFSVAD 255 

Qy 250 VQKREFVCSDEEEGHQSFMAPSCSV--LHCPAACTCSNNIVDCRGKGLTEIPTNLPETIT 307 

111:1:11 '; I I 1:1: I 1 1 : 1 1 : 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 llll I 
Db 256 VQRKEYVC - - - PGPHS • EAPACNANSLSCPSACSCSNNIVDCRGKGLT EIP ANLPEG IV 310 

Qy 308 EIRLEQNTIRVIPPGAFSPYRRLRRIDLSNNQISELAPDAFQGLRSLNSLVLYGNRITEL 367 

llllllhll II III hhllhl llll::IMIIIII:|i lllllllllll: 
Db 311 EIRLEQNSIRCIPAGAFIQYRRLKRIDISKNQISDIAPDAFQGLRSLTSLVLYGNRITEI 370 

Qy 368 PRSLFEGLFSLQLLLLNANRINCLRVDAFQDLHNLNLLSLYDNRLQTIAKGTFSPLRAIQ 427 

II 11:11 Mllllllllllllll: llll llllllllllllllhll |:l|::ll 

Db 371 PKGLFDGLVSLQLLLLNANKINCLRVNTFQDLQNLNLLSLYDNRLQTISKGLFAPLQSIQ 430 

Qy 428 TMHLAQNPFICDCHLKWLADYLHTNPIETSGARCTSPRRLANRRIGQIRSKRFRCSGTED 487 

hllllllhhllllllhll MMMMIIMIMMIMI lllllllllhll 
Db 431 TLHLAQNPFVCDCHLKWLADYLQDNPIETSGARCSSPRRLANKRISQIRSRRFRCSGSED 490 

Qy 488 YRSKLSGDCFADLACPEKCRCEGTTVDCSNQKLNRIPEHIPQYTAELRLNNNEFTVLEAT 547 

II:: I :|l' II llllllllll lllllll|::|l MMM :|lll:|: lllll 
Db 491 YRNRFSSECFMDLVCPEKCRCEGTIVDCSNQRLSRIPSHLPEYTTDLRLNDNDIAVLEAT 550 

Qy 548 GIFKKLPQLRKINFSNNRITDIEEGAFEGASGVNEILLTSNRLENVQHRMFRGLESLKTL 607 

lllllll. lllll 111:1 :: HUM! |'::|| hi : :||:|| Mil 
Db 551 GIFKRLPNLRKINLSNNRIREVREGAFDGAAGVQELMLTGNQLETMHGRMFRGLSGLKTL 610 

Qy 608 MLRSNRITCVGNDSFIGLSSVRLLSLYDNQITTVAPGAFDTLHSLSTLNLLANPFNCNCY 667 

llll! 1:11.11:1 lll!lllllllll:lll::llll II 1 1 1 1 : 1 1 1 : 1 1 1 1 1 1 1 : 
Db 611 MLRSNLISCVHNDTFAGLSSVRLLSLYDNRITTISPGAFTTLVSLSTINLLSNPFNCNCH 670 

Qy 668 LAWLGEWLRKKRIVTGNPRCQKPYFLREIPIQDVAIQDFTCDDGNDDNSCSPLSRCPTEC 727 

: 1 1 1 1 ll!|:lll:|llllll|:|l|IIIMIIMIIIII :||::||| III :| 
Db 671 MAWLGRWLRKRRIVSGNPRCQKPFFLKEIPIQDVAIQDFTC-EGNEENSCQLSPRCPEQC 729 

Qy 728 TCLDTWRC SMKGLKVLPKG I PRDVT ELYLDG NQFTLVPKELSNYKHLT LIDLSNNRIST 787 

|::IMIMI:| 1 1 1 1 : 1 : 1 1 1 1 1 1 1 : 1 1 I MINI :: llllllli II 
Db 730 TCVETWRCSNRGLHTLPKGMPKDVTELYLEGNHLTAVPRELSTFRQLTLIDLSNNSISM 789 

Qy 788 LSNQSFSNMTOLLTLILSYNRLRCIPPRTFDGLRSLRLLSLHGNDISVVPEGAFNDLSAL 847 

1:1 :MII::I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 : 1 1 : 1 1 1 : 1 : 1 1 1 1 1 1 1 llll:|lll::| 
Db 790 LTNHTFSNMSELSTLILSYNRLRCIPVHAFNGLRSLRVLTLHGNDISSVPEGSFNDLTSL 849 

Qy 848 SHLAIGANPLl'CDCNMQWLSDWVKSEYKEPGIARCAGPGEMADKLLLTTPSKRFTCQGPV 907 

111:1 lll:l!l:::|l|:|:|: lllllllll: I Ihllhlh :| hill 
Db 850 SHLALGINPL'ECDCSLRWLSEWIRAGYREPGIARCSSPESMADRLLLTTPTHRFQCKGPV 909 

Qy 908 DVNILAKCNPCLSNPCKNDGTCNSDPVDFYRCTCPYGFKGQDCDVPIHACISNPCKHGGT 967 

hlhill 111:1111:111: III: lllllll :||:|| III: |: 1 1 1 : 1 i 1 1 
Db 910 DINIVAKCNACLSSPCRNNGTCSQDPVEQYRCTCPYSYKGRDCTVPINTCVQNPCQHGGT 969 

Qy 968 CHLREGEEDGE'WCICADGFEGENCEVNVDDCEDNDCENNSTCVDGINNYTCLCPPEYTGE 1027 

III I IN I I Nil: 11:1 llllllllll::|llllllll hill llll 

Db 970 CHLSESHRDGFSCSCPLGFEGQRCEINPDDCEDNDCENSATCVDGINNYACVCPPNYTGE 1029 

Qy 1028 LCEEKLDFCAQDLNPCQHDSRCILTPKGFKCDCTPGYVGEHCDIDFDDCQDNKCKNGAHC 1087 

1:1 :hl :: 1 1 1 :: 1 1 1 Ihh Ml h h I Ml :||::|| I 
Db 1030 LCDEVIDYCyPEMNLCQHEAKCISLDKGFRCECVPGYSGKLCETDNDDCVAHKCRHGAQC 1089 

Qy 1088 TDAVNGYTCICPEGYSGLFCEFSPPMVLPRTSPCDNFDCQNGAQCIVRINEPICQCLPGY 1147 

IIIIIIIIIIMMIMIII lllll :MMI -lllllllll II Ml IM 
Db 1090 VDAVNGYTCICPQGFSGLFCEHPPPMVLLQTSPCDQYECQNGAQCIWQQEPTCRCPPGF 1149 
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0y 1148 QGEKCEKLVSVNFINKESYWIPSAmPQTNITLQIATDEDSGILLYKGDKDHIAVELY 1207 

I |:||::: MUM 1 1 : 1 1 : 1 1 1 : 1 : 1 1 1 1 1 1 1 1 I MM! 

Db 1150 AGPRCEKLITVNFVGKDSYVELASAKVRPQANISLQVATDKDNGILLYKGDNDPLALELY 1209 

Qy 1208 RGRVRASYDTGSHPASAIYSVETINDGNFHIVELLALDQSLSLSVDGGNPKIITNLSRQS 1267 

M I! ||: I I : MMMMII M III: MMM Mill: III 
Db 1210 QGHVRLVYDSLSSPPTTVYSVETVNDGQFHSVELVMLNQILNLWDKGAPKSLGKLQKQP 1269 

Qy 1268 TLNFDSPLYVGGMPGKSNVASLRQAPGQNGTSPHGCIRNLYINSELQDFQKVPMOT-GIL 1326 

: : 1 1 ! 1 : 1 1 : 1 : :::||| : Hill : 1 1 : ! 1 1 1 1 : M |: |: 
Db 1270 AVGINSPLYLGGIPTSTGLSALRQGADRPLGGFHGCIHEVRINNELQDFKALPPOSLGVS 1329 

Qy 1327 PGCEPCHKKVCAHGTCQPSSQAGFTCECQEGWMGPLCDQRTNDPCLGNKCVHGTCLPINA 1386 

Ml: I II II I: : III II HUM Mill: I MM 
Db 1330 PGCKSC - - TVCRHGLCESVEKDSWCECHPGWTGPLCDQEAQDPCLGHSCSHGTCV- ATG 1386 

Qy 1387 FSYSCKCLEGHGGVLCDEEEDLFNPCQAIKCKHGKCRLSGLGQPYCECSSGYTGDSCDRE 1446 

• II Ml II: I III:: I I I I II MM M MM I MM: MM 
Db 1387 NSYVCKCAEGYEGPLCDQKNDSANACSAFKCHHGQCHISDRGEPYCLCQPGFSGNHCEQE 1446

Qy 1447 ISCRGERIRDYYQKQQGYAACQTTKKVSRLECRGGCAGGQCCGPLRSKRRKYSFECTDGS 1506
Ml :h ::|: MM I II : HIM I III |:||||||| MMM 
Db 1447 NPCLGEIVREAIRRQKDYASCATASKVPIMVCRGGC-GSQCCQPIRSKRRKYVFQCTDGS 1505 

Qy 1507 SFVDEVEKWKCGCTRC 1523 

MMMM ::|| | 
Db 1506 SFVEEVERHLECGCREC 1522 



RESULT 6 
O75094
ID O75094 PRELIMINARY; PRT; 1523 AA.
AC O75094;
DT 01-NOV-1998 (TrEMBLrel. 08, Created)
DT 01-AUG-1999 (TrEMBLrel. 11, Last sequence update)
DT 01-OCT-2000 (TrEMBLrel. 15, Last annotation update)
DE SLIT-3 PROTEIN (MEGF5).
GN SLIT-3 OR MEGF5.
OS Homo sapiens (Human).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
OX NCBI_TaxID=9606;
RN [1]
RP SEQUENCE FROM N.A.
RX MEDLINE=99033071; PubMed=9813312;
RA Itoh A., Miyabayashi T., Ohno M., Sakano S.;
RT "Cloning and expressions of three mammalian homologues of Drosophila
RT slit suggest possible roles for Slit in the formation and maintenance
RT of the nervous system.";
RL Brain Res. Mol. Brain Res. 62:175-186(1998).
RN [2]
RP SEQUENCE OF 785-1523 FROM N.A.
RC TISSUE=BRAIN;
RX MEDLINE=98360089; PubMed=9693030;
RA Nakayama M., Nakajima D., Nagase T., Nomura N., Seki N., Ohara O.;
RT "Identification of high-molecular-weight proteins with multiple EGF-
RT like motifs by motif-trap screening.";
RL Genomics 51:27-34(1998).
DR EMBL; AB017169; BAA35186.1; -.
DR EMBL; AB011538; BAA32466.1; -.
DR HSSP; P01132; 1EGF.
DR INTERPRO; IPR000152;
DR INTERPRO; IPR000359;
DR INTERPRO; IPR000372;
DR INTERPRO; IPR000483;
DR INTERPRO; IPR000561;
DR INTERPRO; IPR000742;
DR INTERPRO; IPR001611;
DR INTERPRO; IPR001791;
DR INTERPRO; IPR001881;
DR INTERPRO; IPR002049;
DR INTERPRO; IPR003006;
DR PFAM; PF00007; Cys_knot; 1.
DR PFAM; PF00008; EGF; 9.
DR PFAM; PF00054; laminin_G; 1.
DR PFAM; PF00560; LRR; 19.
DR PFAM; PF01462; LRRNT; 4.
DR PFAM; PF01463; LRRCT; 4.
DR PRINTS; PR00011; EGFLAMININ.
DR PRINTS; PR00019; LEURICHRPT.
DR PROSITE; PS00010; ASX_HYDROXYL; UNKNOWNJ.
DR PROSITE; PS00022; EGF_1; UNKNOWNJ.
DR PROSITE; PS00290; IG_MHC; UNKNOWNJ.
DR PROSITE; PS01185; CTCK_1; UNKNOWNJ.
DR PROSITE; PS01186; EGF_2; 7.
DR PROSITE; PS01187; EGF_CA; 2.
DR PROSITE; PS01225; CTCK_2; 2.
KW Glycoprotein; EGF-like domain.
SQ SEQUENCE 1523 AA; 167684 MW; 52549D41D1D6DBD1 CRC64;



Query Match 68.7%; Score 5714.5; DB 4; Length 1523;

Best Local Similarity 66.6%; Pred. No. 0;

Matches 1017; Conservative 216; Mismatches 278; Indels 15; Gaps 

Qy 3 GVGWQMLS-LSLGLVLA-ILNKVAPQACPAQCSCSGSTVDCHGLALRSVPRNIPRNTERL 60 

III : : M I || :|: III MMI ::|||||| MMM MM III 
Db 7 GVGAAVRARLALALALASVLSGPPAVACPTKCTCSAASVDCHGLGLRAVPRGIPRNAERL 66 

Qy 61 DLNGNNITRITKTDFAGLRHLRVLQLMENKISTIERGAFQDLKELERLRLNRNHLQLFPE 120

II: II 1 1 1 1 1 1 |||||::|||| I :MM 1 1 1 1 1 1 1 1 1 1 : 1 M 1 1 1 1 : 1 II: II 
Db 67 DLDRNNITRITKMDFAGLKNLRVLHLEDNQVSVIERGAFQDLKQLERLRLNKNKLQVLPE 126 

Qy 121 LLFLGTAKLYRLDLSENQIQAIPRKAFRGAVDIKNLQLDYNQISCIEDGAFRALRDLEVL 180 
III I II 1 1 i M 1 1 1 1 MMM 1:111111 I IIIMIHIMIIIII:! 

Db 127 LLFQSTPKLTRLDLSENQIQGIPRKAFRGITDVKNLQLDNNHISCIEDGAFRALRDLEIL 186 

Qy 181 TLNNNNITRLSVASFNHMPKLRTFRLHSNNLYCDCHLAWLSDWLRKRPRVGLYTQCMGPS 240 

1111111:1: I 1111111:11 1 1 1 1 1 : 1 1 1 ! 1 1 1 1 1 1 II I II : I II M II I 
Db 187 TLNNNNISRILVTSFNHMPKIRTLRLHSNHLYCDCHLAWLSDWLRQRRTVGQFTLCMAPV 246 

Qy 241 HLRGHNVAEVQKREFVCSDEEEGHQSFMAPSCSV-LHCPAACTCSNNIVDCRGKGLTEI 298 

MM IM:I1I:|:|I MM : II: IMMIMIMMII II 

Db 247 HLRGFNVADVQKKEYVCPAPHS EPPSCNANSISCPSPCTCSNNIVDCRGKGLMEI 301 

Qy 299 PTNLPETITEIRLEQNTIKVIPPGAFSPYKKLRRIDLSNNQISELAPDAFQGLRSLNSLV 358 

I Mil I MMMMM II III: 1 1 i 1 : 1 1 1 : 1 I i 1 1 :: 1 1 1 1 1 11 1 : 1 1 Ml 
Db 302 PANLPEGIVEIRLEQNSIKAIPAGAFTQYKKLKRIDISKNQISDIAPDAFQGLKSLTSLV 361 

Qy 359 LYGNKITELPKSLFEGLFSLQLLLLNANKINCLRVDAFQDLHNLNLLSLYDNKLQTIAKG 418 

MMMM: I, |M 1 1 11 ! I M 1 1 1 1 1 M I : MM 1 1 1 1 1 1 1 1 1 M I II 1 : 1 1 
Db 362 LYGNKITEIAKGLFDGLVSLQLLLLNANKINCLRVNTFQDLQNLNLLSLYDNKLQTISKG 421 

Qy 419 TFSPLRAIQTMHLAQNPFICDCHLKWLADYLHTNPIETSGARCTSPRRLANKRIGQIKSK 478 

|:M::IM:|||||||:IMMIMIMI M 1 1 1 1 1 II 1 : 1 1 1 1 1 1 1 II I Mill 
Db 422 LFAPLQSIQTLHLAQNPFVCDCHLKWLADYLQDNPIETSGARCSSPRRLANKRISQIKSK 481

Qy 479 KFRCSGTEDYRSKLSGDCFADLACPEKCRCEGTTVDCSNQKLNKIPEHIPQYTAELRLNN 538 

MMMMMIM I M II MMMMM MMMM Ml MM MMM 

Db 482 KFRCSGSEDYRSRFSSECFMDLVCPEKCRCEGTIVDCSNQKLVRIPSHLPEYVTDLRLND 541 

Qy 539 NEFTVLEATGIFKKLPQLRKINFSNNKITDIEEGAFEGASGVNEILLTSNRLENVQHKMF 598 

M MMMMMIM Mill Mill :: i 1 1 1 : 1 1 : I MMI Mil I ::| 
Db 542 NEVSVLEATGIFKKLPNLRKINLSNNKIKEVREGAFDGAASVQELMLTGNQLETVHGRVF 601 

Qy 599 KGLESLKTLMLRSNRITCVGNDSFIGLSSVRLLSLYDNQITTVAPGAFDTLHSLSTLNLL 658

M IIIMJMI I II M MIMMMMMMM: MM II lllhlll 
Db 602 RGLSGLKTLMLRSNLIGCVSNDTFAGLSSVRLLSLYDNRITTITPGAFTTLVSLSTINLL 661 

Qy 659 ANPFNCNCYLAWLGEWLRKKRIVTGNPRCQKPYFLKEIPIQDVAIQDFTCDDGNDDNSCS 718

:|IMIII:|'IIII:|IM:|I|:IMIMII:IMMIMIIIIIIIII lll:::M 
Db 662 SNPFNCNCHLAWLGKWLRKRRIVSGNPRCQKPFFLKEIPIQDVAIQDFTC-DGNEESSCQ 720 

Qy 719 PLSRCPTECTCLDTVVRCSNKGLKVLPKGIPRDVTELYLDGNQFTLVPKELSNYKHLTLI 778

Ml :MI::IIIIIIIMI: 1 1 : 1 : 1 : 1 1 II II 1 : 1 1 I 1 1 : II I MMM 
Db 721 LSPRCPEQCTCMETVVRCSNKGLRALPRGMPKDVTELYLEGNHLTAVPRELSALRHLTLI 780

Qy 779 DLSNNRISTLSNQSFSNMTQLLTLILSYNRLRCIPPRTFDGLKSLRLLSLHGNDISVVPE 838

Mill II hi Hill: I lllllllllllll hlhllhhlllllll Ml 
Db 781 DLSNNSISMLTNYTFSNMSHLSTLILSYNRLRCIPVHAFNGLRSLRVLTLHGNDISSVPE 840 

Qy 839 GAFNDLSALSHLAIGANPLYCDCNMQWLSDWVKSEYKEPGIARCAGPGEMADKLLLTTPS 898

:|IM::|[|!|: 1 1 1 : 1 1 1 : : : 1 1 1 : 1 1 1 : lllllllll: I |||:||||||: 
Db 841 GSFNDLTSLSHLALGTNPLHCDCSLRWLSEWVKAGYKEPGIARCSSPEPMADRLLLTTPT 900 

Qy 899 KKFTCQGPVDVNILAKCNPCLSNPCRNDGTCNSDPVDFYRCTCPYGFKGQDCDVPIHACI 958

: |:IIII:I|:|IH |||:||||:||| III: III III :||:|| III: I 
Db 901 HRFQCKGPVDINIVAKCNACLSSPCKNNGTCTQDPVELYRCACPYSYKGKDCTVPINTCI 960 

Qy 959 SNPCRHGGTCHLKEGEEDGFWCICADGFEGENCEVNVDDCEDNDCENNSTCVDGINNYTC 1018 

111:1111111 : :IH I I Nil: II : 1 1 1 1 1 1 1 1 1 1 1 1 : 1 1 1 1 1 1 1 1 1 I 
Db 961 QNPCQHGGTCHLSDSHRDGFSCSCPLGFEGQRCEINPDDCEDNDCENNATCVDGINNYVC 1020 

Qy 1019 LCPPEYTGELCEEKLDFCAQDLNPCQHDSKCILTPKGFKCDCTPGYVGEHCDIDFDDCQD 1078 

| :lll Mill: :l I :|| |||::||| III |:| III |: |: I III 

Db 1021 ICPPNYTGELCDEVIDHCVPELNLCQHEAKCIPLDKGFSCECVPGYSGRLCETDNDDCVA 1080

Qy 1079 NKCRNGAHCTDAVNGYTCICPEGYSGLFCEFSPPMVLPRTSPCDNFDCQNGAQCIVRINE 1138
:||::JI I I :lllll 1 1 : 1 : 11 III lllll :lllll ::lllllllll I 
Db 1081 HRCRH^AQCVTJTINGYTCTCPQGFSGPFCEHPPPMVLLQTSPCDQYECQNGAQCIVVQQE 1140

Qy 1139 PICQCLPGYQGEKCEKLVSVNFINKESYLQIPSAKVRPQTNITLQIATDEDSGILLYKGD 1198 

I hi II: I :IMI::IM: |:||::: lllllll 1 1 : 1 1 : 1 1 1 : 1 : 1 1 1 1 1 1 1 1 

Db 1141 PTCRCPPGFAGPRCERLITVNFVGRDSYYELASARVRPQANISLQVATDKDNGILLYKGD 1200 

Qy 1199 KDHIAVELYRGRVRASYDTGSHPASAIYSVETINDGNFHIVELLALDQSLSLSVDGGNPR 1258 

I :|:MI:| II lh I I : :IHII:|II II III: hhhl II I II 
Db 1201 NDPLALELYQGHVRLVYDSLSSPPTTVYSVETVNDGQFHSVELVTLNQTLNLWDRGTPK 1260 

Qy 1259 IITNLSKQSTLNFDSPLYVGGMPGKSNVASLRQAPGQNGTSFHGCIRNLYINSELQDFQK 1318 

: I II : :lllhll:| : :::||| : lllll : Ihllllh 
Db 1261 SLGKLQKQPAVGINSPLYLGGIPTSTGLSALRQGTDRPLGGFHGCIHEVRINNELQDFRA 1320 

Qy 1319 VPMQT-GILPGCEPCHKKVCAHGTCQPSSQAGFTCECQEGWMGPLCDQRTNDPCLGNRCV 1377

:| I: I: III: I II II I: : III: II Mil lllll::| 
Db 1321 LPPQSLGVSPGCKSC--TVCRHGLCRSVEKDSWCECRPGWTGPLCDQEARDPCLGHRCH 1378 

Qy 1378 HGTCLPINAFSYSCKCLEGHGGVLCDEEEDLFNPCQAIKCKHGRCRLSGLGQPYCECSSG 1437 

II I: II III Ihll III : I lllll MM M MM I I 

Db 1379 HGKCVATGT-SYMCKCAEGYGGDLCDNRNDSANACSAFKCHHGQCHISDQGEPYCLCQPG 1437

Qy 1438 YTGDSCDREISCRGERIRDYYQRQQGYAACQTTKKVSRLECRGGCAGGQCCGPLRSKRRK 1497 

::|: I M I |: :|: ::|:|||:| | || :|||||| | Ml I MM 
Db 1438 FSGEHCQQENPCLGQWREVIRRQRGYASCATASKVPIMECRGGC-GPQCCQPTRSRRRK 1496 



Qy 1498 YSFECTDGSSFVDEVERWKCGCTRC 1523

I MIMMMMII: ::||| I 
Db 1497 YVFQCTDGSSFVEEVERHLECGCLAC 1522



RESULT 7
Q9WVB4
ID Q9WVB4 PRELIMINARY; PRT; 1523 AA.
AC Q9WVB4;
DT 01-NOV-1999 (TrEMBLrel. 12, Created)
DT 01-NOV-1999 (TrEMBLrel. 12, Last sequence update)
DT 01-OCT-2000 (TrEMBLrel. 15, Last annotation update)
DE SLIT3 (FRAGMENT).
GN SLIT3.
OS Mus musculus (Mouse).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
OX NCBI_TaxID=10090;
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=SWISS WEBSTER/ICR;
RA Yuan W., Zhou L., Chen J.-H., Wu J.Y., Rao Y., Ornitz D.M.;
RT "The mouse SLIT family: Secreted ligands for Robo expressed in
RT patterns that suggest a role in morphogenesis and axon guidance.";
RL Dev. Biol. 0:0-0(1999).
DR EMBL; AF144629; AAD44760.1; -.
DR HSSP; P01132; 1EGF.
DR MGD; MGI:1315202; Slit3.
DR INTERPRO; IPR000152;
DR INTERPRO; IPR000359;
DR INTERPRO; IPR000372;
DR INTERPRO; IPR000483;
DR INTERPRO; IPR000561;
DR INTERPRO; IPR000742;
DR INTERPRO; IPR001010;
DR INTERPRO; IPR001438;
DR INTERPRO; IPR001611;
DR INTERPRO; IPR001791;
DR INTERPRO; IPR001881;
DR INTERPRO; IPR002049;
DR INTERPRO; IPR002272;
DR INTERPRO; IPR002400;
DR PFAM; PF00008; EGF; 9.
DR PFAM; PF00054; laminin_G; 1.
DR PFAM; PF00560; LRR; 19.
DR PFAM; PF01462; LRRNT; 4.
DR PFAM; PF01463; LRRCT; 4.
DR PRINTS; PR00010; EGFBLOOD.
DR PRINTS; PR00011; EGFLAMININ.
DR PRINTS; PR00019; LEURICHRPT.
DR PRINTS; PR00287; THIONIN.
DR PRINTS; PR00438; GFCYSKNOT.
DR PRINTS; PR01143; FSHRECEPTOR.
DR PROSITE; PS00010; ASX_HYDROXYL; UNKNOWNJ.
DR PROSITE; PS00022; EGF_1; UNKNOWN_9.
DR PROSITE; PS01185; CTCK_1; UNKNOWN_1.
DR PROSITE; PS01186; EGF_2; 7.
DR PROSITE; PS01187; EGF_CA; 1.
DR PROSITE; PS01225; CTCK_2; 1.
KW Glycoprotein; EGF-like domain.
FT NON_TER 1523 1523
SQ SEQUENCE 1523 AA; 167710 MW; F43A3F3E016C4BFC CRC64;



Query Match 68.6%; Score 5703; DB 11; Length 1523;

Best Local Similarity 66.8%; Pred. No. 0; 

Matches 1014; Conservative 222; Mismatches 267; Indels 14; Gaps 

Qy 11 LSLGLVLA-ILNKVAPQACPAQCSCSGSTVDCHGLALRSVPRNIPRNTERLDLNGNNITR 69 

1:111 II II: III MM ::|||||| MM Nil lllll: lllll 
Db 16 LALGLALASILSGPPAAACPTKCTCSAASVDCHGLGLRAVPRGIPRNAERLDLDRNNITR 75 

Qy 70 ITKTDFAGLRHLRVLQLMENKISTIERGAFQDLKELERLRLNRNHLQLFPELLFLGTARL 129 

III IIMMMIM I :|::| 1 1 1 1 1 1 1 1 1 1 :| 1 1 1 1 1 1 : 1 lh lllll I II 
Db 76 ITRMDFAGLRNLRVLHLEDNQVSIIERGAFQDLKQLERLRLNRNRLQVLPELLFQSTPRL 135 

Qy 130 YRLDLSENQIQAIPRRAFRGAVDIRNLQLDYNQISCIEDGAFRALRDLEVLTLNNNNITR 189 

MIIMMII lllllll : I ! 1 1 1 1 I lllllllllllllllhlllllllhl 

Db 136 TRLDLSENQIQGIPRKAFRGVTGVRNLQLDNNHISCIEDGAFRALRDLEILTLNNNNISR 195 

Qy 190 LSVASFNHMPRLRTFRLHSNNLYCDCHLAWLSDWLRRRPRVGLYTQCMGPSHLRGHNVAE 249 

: I I [ 1 1 1 1 1 : M llllhllllllllllllllhl M ;| II I MM MM 
Db 196 ILVTSFNHMPKIRTLRLHSNHLYCDCHLAWLSDWLRQRRTIGQFTLCMAPVHLRGFSVAD 255 

Qy 250 VQRREFVCSDEEEGHQSFMAPSCSV-LHCPAACTCSNNIVDCRGKGLTEIPTNLPETIT 307 

111:1:11 . I I MM I I h 1 1 : 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 Mil I 
Db 256 VQRREYVC---PGPHS-EAPACNANSLSCPSACSCSNNIVDCRGKGLTEIPANLPEGIV 310 

Qy 308 EIRLEQNTIRVIPPGAFSPYKRLRRIDLSNNQISELAPDAFQGLRSLNSLVLYGNKITEL 367 

MMMMM, II III: lllhllhl 1 1 1 1 : : 1 1 1 1 1 1 1 1 : 1 ! MMMIIMM 
Db 311 EIRLEQNSIRSIPAGAFTQYRRLKRIDISRNQISDIAPDAFQGLRSLTSLVLYGNRITEI 370 

Qy 368 PRSLFEGLFSLQLLLLNANRINCLRVDAFQDLHNLNLLSLYDNKLQTIAKGTFSPLRAIQ 427

II Ihll llllllllllllllllh MM M 1 1 1 1 1 1 1 1 1 1 1 1 1 : 1 I MM 

Db 371 PRGLFDGLVSLQLLLLNANRINCLRVNTFQDLQNLNLLSLYDNXLQTISRGLFVPLQSIQ 430 
Qy 428 TMHLAQNPFICDCHLKWLADYLHTNPIETSGARCTSPRRLANRRIGQIKSKRFRCSGTED 487

MMMIMMIIIIIIIMI IIIIMIIIMIMMMII 1 1 1 1 M I ! 1 1 1 : 1 1
Db 431 TLHLAQNPFVCDCHLKWLADYLQDNPIETSGARCSSPRRLANKRISQIRSRKFRCSGSED 490 

Qy 488 YRSKLSGDCFADLACPEKCRCEGTTVDCSNQKLNKIPEHIPQYTAELRLNNNEFTVLEAT 547 

II" I :ll II 1 1 1 1 1 1 ! I M IMIIIII HI hhll H 1 1 1 : 1 : :IHII 
Db 491 YRNBFSSECFMDLVCPEKCRCEGTIVDCSNQKLARIPSHLPEYTTDLRLNDNDISVLEAT 550

Qy 548 GIFKKLPQLRKINFSNNKITDIEEGAFEGASGVNEILLTSNRLENVQHKMFKGLESLKTL 607

lllllll INN II: :: llhlh | |::|| |:|| : :||:| Mil 
Db 551 GIFRKLPNLRKINLSNNRIKETOEGAFDGAASVQELMLTGNQLETMHGRMFRGLSGLKTL 610 

Qy 608 MLRSNRITCVGNDSFIGLSSVRLLSLYDNQITTVAPGAFDTLHSLSTLNLLANPFNCNCY 667 

inn mi im nmiiiimmii: mm ii immimmih 

Db 611 MLRSNLISCVSNDTFAGLSSVRLLSLYDNRITTITPGAFTTLVSLSTINLLSNPFNCNCH 670 

Qy 668 LAWLGEWLRKKRIVTGNPRCQKPYFLKEIPIQDVAIQDFTCDDGNDDNSCSPLSRCPIEC 727 
:!l!l I M I : M 1 : 1 1 1 1 M ! I : E 1 1 1 1 1 ! M 11 1 1 1 1 1 1 lll:::|l III : 

Db 671 MAWLGRWLRKRRIVSGNPRCQKPFFLREIPIQDVAIQDFTC-DGNEESSCQLSPRCPEQF 729
Qy 728 TCLDTVVRCSNKGLKVLPKGIPRDVTELYLDGNQFTLVPKELSNYRHLTLIDLSNNRIST 787

mmimmi mimmmiiiimii i iiiiii iiiiimi n 

Db 730 TCVETWRCSNRGLHALPKGMPKDVTELYLEGNHLTAVPKELSAFRQLTLIDLSNNSISM 789 
Qy 788 LSNQSFSNMTQLLTLILSYNRLRCIPPRTFDGLKSLRLLSLHGNDISVVPEGAFNDLSAL 847

m mm i iimiimiii immmmimi inimii::i 

Db 790 LTNHTFSNMSHHTLILSYNRLRCIPVHAFNGLRSLRVLTLHGNDISSVPEGSFNDLTSi 849 
Qy 848 SHLAIGANPLYCDCNMQWLSDWVKSEYKEPGIARCAGPGEMADKLLLTTPSRKFTCQGPV 907 

mi:i iimmmiimi: iiiiimm i nmmii: m mil 

Db 850 SHLALGTNPLHCDCSLRWLSEWVRAGYREPGIARCSSPESMADRLLLTTPTHRFQCKGPV 909 

Qy 908 DVNILAKCNPCLSNPCKNDGTCNSDPVDFYRCTCPYGFRGQDCDVPIHACISNPCRHGGT 967 

1:11:1111 1 1 hi II I :| I h III: lllllll MIMI III: |: MMM 
Db 910 DINIVAKCNACLSSPCKNNGTCSQDPVEQYRCTCPYSYKGKDCTVPINTCVQNPCEHGGT 969 

Qy 968 CHLKEGEEDGFWCICADGFEGENCEVNVDDCEDNDCENNSTCVDGINNYTCLCPPEYTGE 1027 

III I III I I llll: IM I I M 1 1 1 1 1 1 1 1 1 1 1 1 1 MM MM 
Db 970 CHLSENLRDGFSCSCPLGFEGQRCEINPDGCEDNDCENSATCVDGINNYACLCPPNYTGE 1029 

Qy 1028 LCEEKLDFCAQDLNPCQHDSKCILTPKGFKCDCTPGYVGEHCDIDFDDCQDNKCRNGAHC 1087 

11:1 MM ::| 1 1 1 :: 1 1 1 MMM III I: I: : III MMM I 
Db 1030 LCDEVIDYCVPEMNLCQHEARCISLDKGFRCECVPGYSGRIiCETNNDDCVAHKCRHGAQC 1089 

Qy 1088 TDAVNGYTCICPEGYSGLFCEFSPPMVLPRTSPCDNFDCQNGAQCIVRINEPICQCLPGY 1147 

i i 1 1 1 1 1 1 1 1 : i : 1 1 1 1 1 1 inn mm milium n hi m 

Db 1090 VDEVNGYTCICPQGFSGLFCEHPPPMVLLQTSPCDQYECQNGAQCIVVQQEPTCRCPPGF 1149
Qy 1148 QGEKCEKLVSVNFINKESYLQIPSAKVRPQTNITLQIATDEDSGILLYKGDKDHIAVELY 1207

I : 1 1 1 1 : : 1 1 1 : MM:: HUM 11:11:111:1:11111111 I MMM 

Db 1150 AGPRCEKLITVNFVGKDSYVEIASARVRPQANISLQVATDKDNGILLYKGDNDPLALELY 1209 

Qy 1208 RGRVRASYDTGSHPASAIYSVETINDGNFHIVELLALDQSLSLSVDGGNPKIITNLSKQS 1267

M II ||:||: :||||imi II hi: |:|:|:| II I II : III 
Db 1210 QGHVRLVYDSLSSPPTTVYSVETVNDGQFHSVKLVMLNQTLNLWDRGAPKSLGKLQRQP 1269 

Qy 1268 TLNFDSPLYVGGMPGRSNVASLRQAPGQNGTSFHGCIRNLYINSELQDFQKVPMQT-GIL 1326 

: :|limi:| : :::||| : MM : l|:|||||: :| |: |: 
Db 1270 AVGSNSPLYLGGIPTSTGLSALRQGADRPLGGFHGCIHEVRINNELQDFKALPPQSLGVS 1329 

Qy 1327 PGCEPCHRRVCAHGTCQPSSQAGFTCECQEGWMGPLCDQRINDPCLGNKCVHGTCLPINA 1386 

III: I II II I: : III II IIIIII IMIM I MM 
Db 1330 PGCKSC--TVCRHGLCRSVEKDSWCECHPGWIGPLCDQEARDPCLGHSCRHGTCM-ATG 1386 

Qy 1387 FSYSCKCLEGHGGVLCDEEEDLFNPCQAIKCKHGRCRLSGLGQPYCECSSGYTGDSCDRE 1446

II III 11:11 III:: I :|lll MM :| hill I MM MM 

Db 1387 DSYVCRCAEGYGGALCDQKNDSASACSAFRCHHGQCHISDRGEPYCLCQPGFSGHHCEQE 1446 

Qy 1447 ISCRGERIRDYYQKQQGYAACQTTKKVSRLECRGGCAGGQCCGPLRSKRRRYSFECTDGS 1506 

I II :h ::h MM I II : It I M I I III hlllllll MMM 
Db 1447 NPCHGEIVREAIRRQKDYASCATASKVPIMECRGGC-GSQCCQPIRSKRRKYVFQCTDGS 1505 

Qy 1507 SFVDEVEKWKCGCTRC 1523 

MMMM ::||| I 



Db 1506 SFVEEVERHLECGCRAC 1522 



RESULT 8 
O75093
ID O75093 PRELIMINARY; PRT; 1534 AA.
AC O75093;
DT 01-NOV-1998 (TrEMBLrel. 08, Created)
DT 01-AUG-1999 (TrEMBLrel. 11, Last sequence update)
DT 01-OCT-2000 (TrEMBLrel. 15, Last annotation update)
DE SLIT1 PROTEIN.
GN SLIT1.
OS Homo sapiens (Human).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
OX NCBI_TaxID=9606;
RN [1]
RP SEQUENCE FROM N.A.
RX MEDLINE=99033071; PubMed=9813312;
RA Itoh A., Miyabayashi T., Ohno M., Sakano S.;
RT "Cloning and expressions of three mammalian homologues of Drosophila
RT slit suggest possible roles for Slit in the formation and maintenance
RT of the nervous system.";
RL Brain Res. Mol. Brain Res. 62:175-186(1998).
DR EMBL; AB017167; BAA35184.1; -.
DR HSSP; P00743; 1CCF.
DR INTERPRO; IPR000152;
DR INTERPRO; IPR000359;
DR INTERPRO; IPR000372;
DR INTERPRO; IPR000483;
DR INTERPRO; IPR000561;
DR INTERPRO; IPR000742;
DR INTERPRO; IPR001438;
DR INTERPRO; IPR001611;
DR INTERPRO; IPR001791;
DR INTERPRO; IPR001881;
DR INTERPRO; IPR002049;
DR PFAM; PF00008; EGF; 9.
DR PFAM; PF00054; laminin_G; 1.
DR PFAM; PF00560; LRR; 19.
DR PFAM; PF01462; LRRNT; 4.
DR PFAM; PF01463; LRRCT; 4.
DR PRINTS; PR00010; EGFBLOOD.
DR PRINTS; PR00011; EGFLAMININ.
DR PROSITE; PS00010; ASX_HYDROXYL; UNKNOWN_2.
DR PROSITE; PS00022; EGF_1; UNKNOWNJ.
DR PROSITE; PS01185; CTCK_1; UNKNOWNJ.
DR PROSITE; PS01186; EGF_2; 8.
DR PROSITE; PS01187; EGF_CA; 2.
DR PROSITE; PS01225; CTCK_2; 2.
KW Glycoprotein; EGF-like domain.
SQ SEQUENCE 1534 AA; 167951 MW; 8954EF8EA4DAEBA1 CRC64;



Query Match 67.3%; Score 5597; DB 4; Length 1534;
Best Local Similarity 65.3%; Pred. No. 0; 



Matches
Qy 15
Db 21
Qy 75
Db 81
Qy 135
Db 141
Qy 195
Db 201

Qy 255 FVCSDEEEGHQSFMAPSCSVL--HCPAACTCSNNIVDCRGKGLTEIPTNLPETITEIRLE 312

111:1 : 1:1:: 111 MM llllllllll II llllhllllll 
Db 261 FSCSGQGEAGR---VPTCTLSSGSCPAMCTCSNGIVDCRGKGLTAIPANLPETMTEIRLE 317

Qy 313 QNTIKVIPPGAFSPYKRLRRIDLSNNQISELAPDAFQGLRSLNSLVLYGNRITELPRSLF 372 

I II lllllllll:llllllllllll:|:ll!lllllllllllllllllll:l|: :| 
Db 318 LNGIKSIPPGAFSPYRKLRRIDLSNNQIAEIAPDAFQGLRSLNSLVLYGNKITDLPRGVF 377 

Qy 373 EGLFSLQLLLLNANKINCLRVDAFQDLHNLNLLSLYDNKLQTIAKGTFSPLRAIQTMHLA 432 

ll::||IMIIIIIII|:| llllll 1 1 :| III I II hi ::l III h 1 1 1 1 1 1 : 1 ! I 
Db 378 GGLYTLQLLLLNANKINCIRPDAFQDLQNLSLLSLYDNKIOSLAKGTFTSLRAIQTLHLA 437 

Qy 433 QNPFICDCHLKWLADYLHTNPIETSGARCTSPRRLANKRIGQIKSKRFRCS--------G 484

IMIIIIMMIIM lllllllllll IIIIIIIIIIMIIIIIIIII I 
Db 438 QNPFICDCNLKWLADFLRTNPIETSGARCASPRRLANKRIGQIKSKKFRCSAKEQYFIPG 497 

Qy 485 TEDYRSKLSGDCFADLACPEKCRCEGTTVDCSNQKLNRIPEHIPQYTAELRLNNNEFTVL 544 

Mil :h :l :|: II lllll llll III llllllllll ::| 

Db 498 TEDY--QLNSECNSDWCPHKCRCEANWECSSDKLTKIPERIPOSTAELRLNNNEISIL 555 



Qy 545 EATGIFKKLPQLRKINFSNNKITDIEEGAFEGASGVNEILLTSNRLENVQHKMFKGLESL 604

1111:111 1:111 llll:::||:|||ll|: hi: ||:|:||::: 1 1 : 1 1 : 
Db 556 EATGMFKKLTHLKKINLSNNKVSEIEDGAFEGAASVSELHLTANQLESIRSGMFRGLDGL 615

Qy 605 KTLMLRSNRITCVGNDSFIGLSSVRLLSLYDNQITTVAPGAFDTLHSLSTLNLLANPFNC 664

:||||'|:|||:|: llll II : 1 1 1 1 1 1 1 1 1 1 1 1 1 1 : 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 
Db 616 RTLMLRNNRISCIHNDSFTGLRNVRLLSLYDNQITTVSPGAFDTLQSLSTLNLLANPFNC 675

Qy 665 NCYLAWLGEWLRKKRIVTGNPRCQKPYFLKEIPIQDVAIQDFTCDDGNDDNSCSPLSRCP 724

II lllll lllT::||||||||| I hhhlll I |::| :: I | :|| 
Db 676 NCQLAWLGGWLRKRKIVTGNPRCQNPDFLRQIPLQDVAFPDFRCEEGQEEGGCLPRPQCP 735



Qy 725 TECTCLDTVVRCSNKGLKVLPKGIPRDVTELYLDGNQFTLVPKELSNYKHLTLIDLSNNR 784

II lllllllllll I: mihhlllllllllhlh :|| :|:| hlllll: 
Db 736 QECACLDTVVRCSNKHLRALPKGIPKNVTELYLDGNQFTLVPGQLSIFKYLQLVDLSNNK 795



Qy 785 ISTLSNQSFSNMTQLLTLILSYNRLRCIPPRTFDGLKSLRLLSLHGNDISVVPEGAFNDL 844

lhll| 11:11:11 1 1 ! M 1 1 hill I Ihl III Ml III II : II I I 
Db 796 ISSLSNSSFTNMSQLTTLILSYNALQCIPPLAFQGLRSLRLLSLHGNDISTLQEGIFADV 855

Qy 845 SALSHLAIGANPLYCDCNMQWLSDWVKSEYKEPGIARCAGPGEMADKLLLTTPSKKFTCQ 904

::||llllllllllll|:::||| III: llllllllll :| 1 1 ! 1 1 1 1 : 1 1 1 
Db 856 TSLSHLAIGANPLYCDCHLRWLSSWVKTGYKEPGIARCAGPQDMEGRLLLTTPAKKFECQ 915

Qy 905 GPVDVNILAKCNPCLSNPCKNDGTCNSDPVDFYRCTCPYGFKGQDCDVPIHACISNPCKH 964

II : : III: llhlhl llhhh: III II hlhlhl :::| I I:: 
Db 916 GPPTUVQAKCDLCLSSPCQNQGTCHNDPLEVYRCACPSGYKGRDCEVSLNSCSSGPCEN 975

Qy 965 GGTCHLKEGEEDGFWCICADGFEGENCEVNVDDCEDNDCENNSTCVDGINNYTCLCPPEY 1024

lllll :hh I I I UN I || III I: I I llll: II! I :| 
Db 976 GGTCHAQEGEDAPFTCSCPTGFEGPTCGVNTDDCVDHACANGGVCVDGVGNYTCQCPLQY 1035



Qy 1025 TGELCEERLDFCAQDLNPCQHDSRCILTPRGFRCDCTPGYVGEHCDIDFDDCQDNRCKNG 1084 

I: II: :| I: lllllll:::|: II I :|:| III |::| : I II : I : : 1 : 1 1 
Db 1036 EGRACEQLVDLCSPDLNPCQHEAQCVGTPDGPRCECMPGYAGDNCSENQDDCRDHRCQNG 1095 

Qy 1085 AHCTDAVNGYTCICPEGYSGLFCEFSPPMVLPRTSPCDNFDCQNGAQCIVRINEPICQCL 1144 

I I I II hhl lllll II I : I: III: hllll h : I hill! 
Db 1096 AQCMDEVNSYSCLCAEGYSGQLCEIPPHLPAPK-SPCEGTECQNGANCVDQGNRPVCQCL 1154 

Qy 1145 PGYQGERCERLVSVNFINRESYLQIPSARVRPQTNITLQIATDEDSGILLYKGDRDHIAV 1204 

II: I :|lll:|lll:::::!ll : |: llllh: hlllll II lllll 
Db 1155 PGFGGPECEKLLSVNFVDRDTYLQFTDLQNWPRANITLQVSTAEDNGILLYNGDNDHIAV 1214 

Qy 1205 ELYRGRVRASYDTGSHPASAIYSVETINDGNFHIVELLALDQSLSLSVDGGNPKIITNLS 1264 

llhl II III Ihhlllll llllll II Ihl II ::<ll:|||:| : I 
Db 1215 ELYQGHVRVSYDPGSYPSSAIYSAETIHDGQFHTVELVAFDQMVNLSIDGGSPMTMDNFG 1274 

Qy 1265 KQSTLNFDSPLYVGGMPGKSNVASLRQAPGQNGTSFHGCIRNLYINSELQDFQKVPMQTG 1324

I III ::|lllllll I I: I III 11111111111:11111 I h I 
Db 1275 KHYTLNSEAPLYVGGMPVDVNSAAFRLWQILNGTGFHGCIRNLYINNELQDFTKTQMRPG 1334 



Qy 1325 ILPGCEPCHRKVCAHGTCQPSSQAGFTCECQEGWMGPLCDQRTNDPCLGNRCVHGTCLPI 1384 

-llllll I: I II III:: I I h Ih III : I hlllll hh 

Db 1335 WPGCEPCRKLYCLHGICQPNATPGPMCHCEAGWVGLHCDQPADGPCHGHKCVHGQCVPL 1394 

Qy 1385 NAFSYSCKCLEGHGGVLCDEEEDLFNPCQAIRCRHGRCRLSGLGQPYCECSSGYTGDSCD 1444 

:| hh :h I II:: I Ih :;| II h I :| I hh: |: 

Db 1395 DALSYSCQCQDGYSGALCNQAGALAEPCRGLQCLHGHCQASGTKGAHCVCDPGFSGELCE 1454 

Qy 1445 REISCRGERIRDYYQRQQGYAACQTTRRVSRLECRGGCAGGQCCGPLRSRRRRYSFECTD 1504 

:| III: :ll::l hill lllh :l hill I I II II 1 1 1 1 : : 1 1 1 : 1 

Db 1455 QESECRGDPVRDFHQVQRGYAICQTTRPLSWVECRGSCPGQGCCQGLRLRRRRFTFECSD 1514 



Qy 1505 GSSFVDEVERWRCGCTRC 1523

hh :!lih llll I 
Db 1515 GTSFAEEVERPTRCGCALC 1533



RESULT 9 
O88279
ID O88279 PRELIMINARY; PRT; 1531 AA.
AC O88279;
DT 01-NOV-1998 (TrEMBLrel. 08, Created)
DT 01-NOV-1998 (TrEMBLrel. 08, Last sequence update)
DT 01-OCT-2000 (TrEMBLrel. 15, Last annotation update)
DE MEGF4.
GN MEGF4.
OS Rattus norvegicus (Rat).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Rattus.
OX NCBI_TaxID=10116;
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=SPRAGUE-DAWLEY; TISSUE=BRAIN;
RX MEDLINE=98360089; PubMed=9693030;
RA Nakayama M., Nakajima D., Nagase T., Nomura N., Seki N., Ohara O.;
RT "Identification of high-molecular-weight proteins with multiple EGF-
RT like motifs by motif-trap screening.";
RL Genomics 51:27-34(1998).
DR EMBL; AB011530; BAA32460.1; -.
DR HSSP; P00743; 1CCF.
DR INTERPRO; IPR000152;
DR INTERPRO; IPR000359;
DR INTERPRO; IPR000372;
DR INTERPRO; IPR000483;
DR INTERPRO; IPR000561;
DR INTERPRO; IPR000742;
DR INTERPRO; IPR001438;
DR INTERPRO; IPR001611;
DR INTERPRO; IPR001791;
DR INTERPRO; IPR001881;
DR PFAM; PF00008; EGF;
DR PFAM; PF00054; laminin_G; 1.
DR PFAM; PF00560; LRR; 19.
DR PFAM; PF01462; LRRNT; 4.
DR PFAM; PF01463; LRRCT; 4.
DR PRINTS; PR00010; EGFBLOOD.
DR PROSITE; PS00010; ASX_HYDROXYL; UNKNOWN_2.
DR PROSITE; PS00022; EGF_1; UNKNOWNJ.
DR PROSITE; PS01185; CTCK_1; UNKNOWNJ.
DR PROSITE; PS01186; EGF_2; 8.
DR PROSITE; PS01187; EGF_CA; 2.
DR PROSITE; PS01225; CTCK_2; 1.
KW Glycoprotein; EGF-like domain.
SQ SEQUENCE 1531 AA; 167497 MW; DFC4B60CCBC5529A CRC64;



Query Match 67.1%; Score 5578.5; DB 11; Length 1531;
Best Local Similarity 65.0%; Pred. No. 0; 

Matches 988; Conservative 230; Mismatches 282; Indels 19; Gaps 6; 

Qy 15 LVLAILNKVAPQACPAQCSCSGSTVDCHGLALRSVPRNIPRNTERLDLNGNNITRITRTD 74

I: I :: : llll hhhllllll I ::: I : I II 1 1 1 1 1 1 : 1 1 1 II I II I II 
Db 21 LLWAAAWRLGATACPALCTCTGTTVDCHGTGLQAIPRNIPRNTERLELNGNNITRIHKND 80



Qy 75 FAGLRHLRVLQLMENRISTIERGAFQDLRELERLRLNRNHLQLFPELLFLGTARLYRLDL 134

lllh IIIIMM ||: ||||| | III 

Db 81 FAGLKQLRVLQLMENQIGAVERGAFDDMKELERLRLNRNQLQVLPELLFQNNQALSRLDL 140 

Qy 135 SENQIQAIPRKAFRGAVDIKNLQLDYNQISCIEDGAFRALRDLEVLTLNNNNITRLSVAS 194 

III HIHIIIIIII Mllll llllllhlllllll IIIIIIIMIII : |:| 
Db 141 SENSLQAVPRKAFRGATDLKNLQLDKNQISCIEEGAFRALRGLEVLTLNNNNITTIPVSS 200 

Qy 195 FNHMPKLRTFRLHSNNLYCDCHLAWLSDWLRKRPRVGLYTQCMGPSHLRGHNVAEVQKRE 254 

llllllllllllllhhlllllllll 111:11 MM! ||: III ||||||| I 
Db 201 FNHMPKLRTFRLHSNHLFCDCHLAWLSQWLRQRPTIGLFTQCSGPASLRGLNVAEVQKSE 260 

Qy 255 FVCSDEEEGHQSFMAPSCSVL--HCPAACTCSNNIVDCRGKGLTEIPTNLPETITEIRLE 312 

I II M I hi:: Ml |:||| llllllllll II llllhllllll 
Db 261 FSCSGQGEAAQ---VPACTLSSGSCPAMCSCSNGIVDCRGKGLTAIPANLPETMTEIRLE 317



Qy 313 QNTIKVIPPGAFSPYRRLRRIDLSNNQISELAPDAFQGLRSLNSLVLYGNRITELPRSLF 372

I II lllllllll:IIIIIIIIMII:|:|||||||||||||||lllllll:||: :l 

Db 318 LNGIRSIPPGAFSPYRRLRRIDLSNNQIAEIAPDAFQGLRSLNSLVLYGNKITDLPRGVF 377

Qy 373 EGLFSLQLLLLNANKINCLRVDAFQDLHNLNLLSLYDNKLQTIAKGTFSPLRAIQTMHLA 432

IhMIIIIIIIIIIIIM MM 1 : 1 1 1 1 1 1 1 1 : 1 :: 1 1 1 1 1 ; ||||||:||| 

Db 378 GGLYTLQLLLLNANKIKCIRPDAFQDLQNLSLLSLYDNKIQSLAKGTFTSLRAIQTLHLA 437



Qy 433 QNPFICDCHLKWLADYLHTNPIETSGARCTSPRRLANKRIGQIRSRRFRCS--------G 484

lllllll|:|IMII:| IIIIMM lllllllllllllllllllll I 
Db 438 QHPFICDCNLKWLADFLRTNPIETTGARCASPRRIAHKRIGQIKSKRFRCSAKEQYFIPG 497 

Qy 485 TEDYRSKLSGDCFADLACPEKCRCEGTTVDCSNQKLNKIPEHIPQYTAELRLNNNEFTVL 544

UN I: :l Mil INN : hi! Mill I I HUM ::| 
Db 498 TEDYH--LNSECTSDVACPHRCRCEASWECSGLRLSKIPERIPQSTTELRLNNNEISIL 555

Qy 545 EATGIFRRLPQLRRINFSNNRITDIEEGAFEGASGVNEILLTSNRLENVQHRMFRGLESL 604

IIIIMIII hill llll:::||:| lllh I'M !M|M: ||:||: 
Db 556 EATGLFRRLSHLRRINLSNNRVSEIEDGTFEGATSVSELHLTANQLESVRSGMFRGLDGL 615 

Qy 605 KTLMLRSNRITCVGNDSFIGLSSVRLLSLYDNQITTVAPGAFDTLHSLSTLNLLANPFNC 664 

HMIhllhh MM II : 1 1 1 1 1 1 1 1 1 llhHIIIIII MIMMIIIMM 
Db 616 RTLMLRNNRISCIHNDSFTGLRNVRLLSLYDNHITTISPGAFDTLQALSTLNLLANPFNC 675 

Qy 665 NCYLAWLGEWLRKKRIVTGNPRCQKPYFLKEIPIQDVAIQDFTCDDGNDDNSCSPLSRCP 724 

II 1 1 1 1 1 : 1 1 1 ! : : 1 1 1 1 1 1 1 f I I IMMIMIM II MM :: I I Ml 
Db 676 NCQLAWLGDWLRRRKIVTGNPRCQNPDFLRQIPLQDVAFPDFRCEEGQEEVGCLPRPQCP 735 

Qy 725 TECTCLDTVVRCSNRGLRVLPRGIPRDVTELYLDGNQFTLVPRELSNYRHLTLIDLSNNR 784

II lllllllllll I: IIIMhMIIIIIIIIIIIIII Ml MM MINI: 

Db 736 QECACLDTVVRCSNKHLQALPKGIPKNVTELYLDGNQFTLVPGQLSTFKYLQLVDLSNNK 795

Qy 785 ISTLSNQSFSNMTQLLTLILSYNRLRCIPPRTFDGLRSLRLLSLHGNDISVVPEGAFNDL 844

Ml IIMIMI lllllll IMIII I 1 1 : 1 1 1 1 1 ! 1 1 J 1 1 : 1 Mill: 
Db 796 ISSLSHSSFTNHSQLTTLILSYNALQCIPPLAFQGLRSLRLLSLHGNDVSTLQEGIFADV 855

Qy 845 SALSHLAIGAHPLYCDCNMQWLSDWVKSEYKEPGIARCAGPGEMADKLLLTTPSRKFTCQ 904 

::|||||||||IHIII::MII III: llllllllllll II MM 

Db 856 TSLSHLAIGAHPLYCDCHLRWLSSWVKTGYKEPGIARCAGPPEMEGKLLLTTPAKKFECQ 915 

Qy 905 GPVDVNILAKCNPCLSNPCKNDGTCNSDPVDFYRCTCPYGFKGQDCDVPIHACISNPCRH 964 

II : : IIIMIIIMIM MMI:: MM I : ! I : : I : I : M Mil : 
Db 916 GPPSLAVQAKCDPCLSSPCQNQGTCHNDPLEVYRCTCPSGYRGRNCEVSLDSCSSNPCGN 975 

Qy 965 GGTCHLKEGEEDGFWCICADGFEGENCEVNVDDCEDNDCENNSTCVDGINNYTCLCPPEY 1024 

Hill Mlh II I I Mil I M III Ml I Hill Mil II M 

Db 976 GGTCHAQEGEDAGFTCSCPSGFEGLTCGMNTDDCVRHDCVNGGVCVDGIGNYTCQCPLQY 1035

Qy 1025 TGELCEERLDFCAQDLNPCQHDSKCILTPKGFKCDCTPGYVGEHCDIDFDDCQDNRCRNG 1084 

1 II II: Mil: llllll|:::|: IIM MM III hM : 1 1 1 : 1 : : 1 : 1 1 
Db 1036 TGRACEQLVDFQSPDLNPCQHEAQCVGTPEGPRCECVPGYTGDNCSKNQDDCKDHQCQNG 1095 

Qy 1085 AHCTDAVNGYTCICPEGYSGLFCEFSPPMVLPRTSPCDNFDCQNGAQCIVRINEPICQCL 1144 

I I I M I IM Hill III III I: Mllll I: : : IMIII 
Db 1096 AQCVDEINSYACLCAEGYSGQLCEIPP---APRNS-CEGTECQNGANCVDQGSRPVCQCL 1151



Qy 1145 PGYQGERCERLVSVNFIKRESYLQIPSAKVRPQTNIILQIAIDEDSGILLYKGDKDHIAV 1204 

II: I : I M I M I M : : : : M I [ : |: lliM:| IMIII || Ml 
Db 1152 PGFGGPECERLLSVNFVDRDTYLQFTDLQNWPRANITLQVSTAEDNGILLYNGDNDHIAV 1211 

Qy 1205 ELYRGRVRASYDTGSHPASAIYSVETINDGNFHIVELLALDQSLSLSVDGGNPRIITNLS 1264 

IIIM.IMII I M 1 : 1 1 1 1 1 llllll || III: II MMMH : I 
Db 1212 ELYQGHVRVSYDPGSYPSSAIYSAETINDGQFHTVELVTFDQMVNLSIDGGSPMTMDNFG 1271 

Qy 1265 RQSTLNFDSPLYVGGMPGRSNVASLRQAPGQNGTSFHGCIRNLYINSELQDFQRVPMQTG 1324 

I III MJIIMI | |: | 11 1 Mill III 1 1 II :! I IM I I: I 
Db 1272 RHYTLNSEAPLYVGGMPVDVNSAAFRLWQILNGTSFHGCIRNLYINNELQDFTRTQMRPG 1331 

Qy 1325 ILPGCEPCHRKVCAHGTCQPSSQAGFTCECQEGWMGPLCDQRTNDPCLGNKCVHGTCLPI 1384 

:MIIIII l: : I II III:: I MMI III : II IMIII! IM 
Db 1332 WPGCEPCRRLYCLHGICQPNATPGPVCHCEAGWGGLHCDQPVDGPCHGHRCVHGRCVPL 1391 

Qy 1385 NAFSYSCRCLEGHGGVLCDEEEDLFNPCQAIRCKHGRCRLSGLGQPYCECSSGYTGDSCD 1444 

M MUM M: I II:: : II :M II |: I M II |::|: |: 
Db 1392 DALAYSCQCQDGYSGALCNQVGAVAEPCGGLQCLHGHCQASATRGAHCVCSPGFSGELCE 1451 

Qy 1445 REISCRGERIRDYYQRQQGYAACQTTRRVSRLECRGGCAGGQCCGPLRSKRRRYSFECTD 1504 

M III: Ml::: IMII lllh M Mill I I II I! Illl MUM 
Db 1452 QESECRGDPVRDFHRVQRGYAICQTTRPLSWVECRGACPGQGCCQGLRLKRRRLTFECSD 1511 

Qy 1505 GSSFVDEVERWKCGCTRC 1523 

imi mil- mi i 

Db 1512 GTSFAEEVERPTRCGCAPC 1530 



RESULT 10
Q9WVB5
ID Q9WVB5 PRELIMINARY; PRT; 1531 AA.
AC Q9WVB5;
DT 01-NOV-1999 (TrEMBLrel. 12, Created)
DT 01-NOV-1999 (TrEMBLrel. 12, Last sequence update)
DT 01-OCT-2000 (TrEMBLrel. 15, Last annotation update)
DE SLIT1.
GN SLIT1.
OS Mus musculus (Mouse).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
OX NCBI_TaxID=10090;
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=SWISS WEBSTER/ICR;
RA Yuan W., Zhou L., Chen J.-H., Wu J.Y., Rao Y., Ornitz D.M.;
RT "The mouse SLIT family: Secreted ligands for Robo expressed in
RT patterns that suggest a role in morphogenesis and axon guidance.";
RL Dev. Biol. 0:0-0(1999).
DR EMBL; AF144627; AAD44758.1; -.
DR HSSP; P00743; 1CCF.
DR MGD; MGI:1315203; Slit1.
DR INTERPRO; IPR000152;
DR INTERPRO; IPR000359;
DR INTERPRO; IPR000372;
DR INTERPRO; IPR000483;
DR INTERPRO; IPR000561;
DR INTERPRO; IPR000742;
DR INTERPRO; IPR001438;
DR INTERPRO; IPR001611;
DR INTERPRO; IPR001791;
DR INTERPRO; IPR001881;
DR PFAM; PF00008; EGF; S
DR PFAM; PF00054; laminin_G; 1.
DR PFAM; PF00560; LRR; 19.
DR PFAM; PF01462; LRRNT; 4.
DR PFAM; PF01463; LRRCT; 4.
DR PRINTS; PR00010; EGFBLOOD.
DR PROSITE; PS00010; ASX_HYDROXYL; UNKNOWN_2.
DR PROSITE; PS00022; EGF_1; UNKNOWNJ.
DR PROSITE; PS01185; CTCK_1; UNKNOWNJ.
DR PROSITE; PS01186; EGF_2; 8.
DR PROSITE; PS01187; EGF_CA; 2.
DR PROSITE; PS01225; CTCK_2; 1.
KW Glycoprotein; EGF-like domain.
SQ SEQUENCE 1531 AA; 167545 MW; F7D09AA6693A4P30 CRC64;



Query Match 66.6%; Score 5538.5; DB 11; Length 1531; 

Best Local Similarity 64.6%; Pred. No. 0;

Matches 982; Conservative 235; Mismatches 284; Indels 19; Gaps 

Qy 15 LVLAILNKVAPQACPAQCSCSGSTVDCHGLALRSVPRNIPRNTERLDLNGNNITRITKTD 74 

I: I ■ llll hhhllllll l:::|:lllllll!l:IHIIIII I I 
Db 21 LLWAAAWRLGATACPALCTCTGTTVDCHGTGLQAIPKNIPRNTERLELNGNNITRIHKND 80 

Qy 75 FAGLRHLRVLQLMENKISTIERGAFQDLKELERLRLNRNHLQLFPELLFLGTAKLYRLDL 134

llll: lllllllll: : M M I 1 : 1 1 1 1 1 1 1 1 1 1 1 11: INI I llll 
Db 81 FAGLKQLRVLQLMENQIGAVERGAFDDMKELERLRLNRNQLQVLPELLFQNNQALSRLDL 140 

Qy 135 SENQIQAIPRKAFRGAVDIKNLQLDYNQISCIEDGAFRALRDLEVLTLNNNNITRLSVAS 194 

III : 1 1 : M M 1 1 1 1 MIIM! [ : 1 1 1 1 1 : 1 [ 1 1 1 1 1 1 1 ! I M 1 1 1 1 1 1 : |:| 
Db 141 SENFLQAVPRKAFRGATDLKNLQLDKNRISCIEEGAPRALRGLEVLTLNNNNITTIPVSS 200 

Qy 195 FNHMPKLRTFRLHSNNLYCDCHLAWLSDWLRKRPRVGLYTQCMGPSHLRGHNVAEVQKRE 254
IIIMIIMIIIIII:|:||IIIIIII |||:|| :||:||| ||: III MIIM I

Db 201 FNHMPRLRTFRLHSNHLFCDCHLAWLSQWLRQRPTIGLFTQCSGPASLRGLNVAEVQRGE 260

Qy 255 FVCSDEEEGHQSFMAPSCSVL--HCPAACTCSNNIVDCRGKGLTEIPTNLPETITEIRLE 312

11:1 : 11:1:: III hll: lllllllll II llllhllllll 
Db 261 FSCSGQGE---AAGAPACTLSSGSCPAMCSCSSGIVDCRGKGLTAIPANLPETMTEIRLE 317

Qy 313 QNTIKVIPPGAFSPYRRLRRIDLSNNQISELAPDAFQGLRSLNSLVLYGNKITELPKSLF 372

I II MMIIIII:IMIIIIIIII|:|:|IIIIIIIIIIIMIMIIIM:II: :| 

Db 318 LNGIKSIPPGAFSPYRKLRRIDLSNNQIAEIAPDAFQGLRSLNSLVLYGNKITDLPRGVF 377

Qy 373 EGLFSLQLLLLNANKINCLRVDAFQDLHNLNLLSLYDNKLQTIAKGTFSPLRAIQTMHLA 432 

Ihillllllimill:! IIMII 1 1 : 1 1 1 1 1 1 1 1 : 1 :: 1 1 1 1 1 : I i 1 1 ! 1 : 1 1 1 
Db 378 GGLYTLQLLLLNANKINCIRPDAFQDLQNLSLLSLYDNKIQSLAKGIFTSLRAIQTLHLA 437 

Qy 433 QNPFICDCHLKWLADYLHTNPIETSGARCTSPRRLANKRIGQIKSKRFRCS--------G 484

IMIMIMIIIIIM lllllhllll MIIIIIMIIIIMIMIM I 
Db 438 QNPFICDCNLKWLADFLRTNPIETTGARCASPRRLANKRIGQIKSKKFRCSAKEQYFIPG 497 

Qy 485 TEDYRSKLSGDCFADLACPEKCRCEGTTVDCSNQKLNKIPEHIPQYTAELRLNNNEFTVL 544

MM I: :l :|:HI Mil : |:|l: IMII III I Hllllll ::| 
Db 498 TEDYH--LNSECTSDVACPHKCRCEASWECSSLKLSKIPERIPQSTTELRLNNNEISIL 555 

Qy 545 EATGIFRKLPQLRKINFSNNKITDIEEGAFEGASGVNEILLTSNRLENVQHKMFKGLESL 604

1111:1111 hill llll:::||:| llll: hi: ll:|:||::: 1 1 : 1 1 : I 
Db 556 EATGLFRRLSHLRRINLSNMVSEIEDGTFEGAASVSELHLTANQLESIRSGMFRGLDGL 615 

Qy 605 KTLMLRSNRITCVGNDSFIGLSSVRLLSLYDNQITTVAPGAFDTLHSLSTLNLLANPFNC 664
:||lll:|i|:|: llll II lllllllll lllllllll : 1 1 1 1 1 1 1 1 1 1 1 1 1 
Db 616 RTLMLRNNRISCIHNDSFTGLRNVRLLSLYDHHITTISPGAFDILQALSTLNLLANPFNC 675

Qy 665 NCYLAWLGEWLRKKRIVTGNPRCQKPYFLKEIPIQDVAIQDFTCDDGNDDNSCSPLSRCP 724 

l|:|:||l:MII::IHIIIII I ||::||:|||| II l::l :: I I :|| 
Db 676 NCHLSWLGDWLRRRKIVTGNPRCQNPDFLRQIPLQDVAFPDFRCEEGQEEVGCLPRPQCP 735 

Qy 725 TECTCLDTVVRCSNKGLKVLPKGIPRDVTELYLDGNQFTLVPKELSNYKHLTLIDLSNNR 784

II lllllllllll I: IMMMMMIMIMIIIII :|| :|:| |:||ll|: 

Db 736 QECACLDTVVRCSNRHLQALPRGIPRNVTELYLDGNQFTLVPGQLSTFRYLQLVDLSNNK 795 

Qy 785 ISTLSNQSFSNMTQLLTLILSYNRLRCIPPRTFDGLRSLRLLSLHGNDISVVPEGAFNDL 844 

11:111 lhll:ll MINI |:|||| I 1 : 1 1 1 1 1 1 II 1 1 1 : 1 : II I I: 
Db 796 ISSLSNSSFTNMSQLTTLILSYNALQCIPPLAFQRLRSLRLLSLHGNDVSTLQEGIFADV 855 

Qy 845 SALSHLAIGANPLYCDCNMQWLSDWVKSEYKEPGIARCAGPGEMADKLLLTTPSKKFTCQ 904 

::|MIIIIIIIIIII| ::||| II: MIIIMM! II IMIIIIMM II 
Db 856 TSLSHLAIGANPLYCDCRLRWLSSWVKTGYREPGIARCAGPPEMEGKLLLTTPAKKFECQ 915 

Qy 905 GPVDVNILAKCNPCLSNPCKNDGTCNSDPVDFYRCTCPYGFRGQDCDVPIHACISNPCKH 964

II : : llj:|lll:||:| lll::||:: Mill |:||: Ml : I llll : 
Db 916 GPPSLAVQAKCDPCLSSPCQNQGTCHNDPLEVYRCTCPSGYKGRHCEVSLDGCSSNPCGN 975 



Qy 965 GGTCHLREGEEDGFWCICADGFEGENCEVNVDDCEDNDCENNSTCVDGINNYTCLCPPEY 1024 

lllll MIM || | | Hll II: III : I I MIM llll II M 
Db 976 GGTCHAQEGEDAGFTCSCPSGFEGPTCGVDTDDCVKHACVNGGVCVDGVGNYTCQCPLQY 1035 

Qy 1025 TGELCEERLDFCAQDLNPCQHDSKCILTPKGFKCDCTPGYVGEHCDIDFDDCQDNKCKNG 1084 

II II: MM: Mllll:::|: II I MM II : |||:|:||:|| 

Db 1036 TGRACEQLVDFCSPDMNPCQHEAQCVGTPDGPRCECMLGYTGDNCSENQDDCKDHKCQNG 1095

Qy 1085 AHCTDAVNGYTCICPEGYSGLFCEFSPPMVLPRTSPCDNFDCQNGAQCIVRINEPICQCL 1144 

lllll IMM lllll || I MM I: MIMI |: : : |:|||| 
Db 1096 AQCVDEVNSYACLCVEGYSGQLCEIPP- • -APRSS-CEGTECQNGANCVDQGSRPVCQCL 1151 

Qy 1145 PGYQGERCEKLVSVNFINKESYLQIPSAKVRPQINITLQIATDEDSGILLYKGDRDHIAV 1204 

II: I MMMIMI::::MM : |: 1 1 II I :: I 11:11111 I lllll 
Db 1152 PGFGGPECERLLSVNFVDRDTYLQFIDLQNWPRANITLQVSTAEDNGILLYNGDNDHIAV 1211 

Qy 1205 ELYRGRVRASYDTGSHPASAIYSVETINDGNFHIVELLALDQSLSLSVDGGNPRIITNLS 1264 

MM || |J| IhhlMII IIHII || |||: || ::||:|||:| : | 
Db 1212 ELYQGHVRVSYDPGSYPSSAIYSAETINDGQFHTVELVTFDQMVNLSIDGGSPMTMDNFG 1271 

Qy 1265 RQSTLNFDSPLYVGGMPGRSNVASLRQAPGQNGTSFHGCIRNLYINSELQDFQRVPMQTG 1324 

I III ::i.!llllll I h I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 : 1 1 1 1 1 I I: I 
Db 1272 RHYTLNSEAPLYVGGMPVDVNSAAFRLWQILNGTSFHGCIRNLYINNELQDFTRTQMRPG 1331 

Qy 1325 ILPGCEPCHRRVCAHGTCQPSSQAGFTCECQEGWMGPLCDQRTNDPCLGNRCVHGTCLPI 1384 

:MIMM I I II III:: I I |: II I III : II Mill Ml: 
Db 1332 WPGCEPCRRLYCLHGICQPNATPGPVCHCEAGWGGLHCDQPVDGPCHGHRCVHGRCVPL 1391 

Qy 1385 NAFSYSCKCDEGHGGVLCDEEEDLFNPCQAIRCKHGKCRLSGLGQPYCECSSGYTGDSCD 1444 

:| :llhl :|: I II:: : II ::| II I: I :| II |::|: |: 
Db 1392 DALAYSCQCQDGYSGALCNQVGAVAEPCGGLQCLHGHCQASATKGAHCVCSPGFSGELCE 1451 

Qy 1445 REISCRGERIRDYYQKQQGYAACQTTRRVSRLECRGGCAGGQCCGPLRSKRRKYSFECID 1504 

M |||: Ml::: MM IMM :| Mill II II I llll MIIM 
Db 1452 QESECRGDPVRDFHRVQRGYAICQTTRPLSWVECRGACPGQGCCQGLRLKRRKLTFECSD 1511 

Qy 1505 GSSFVDEVERWKCGCTRCV 1524 

MM Mill Mil Ml 
Db 1512 GTSFAEEVEKPTKCGCAQCV 1531 



RESULT 11 
Q9Z166 

ID Q9Z166 PRELIMINARY; PRT; 1025 AA. 
AC Q9Z166; 

DT 01-MAY-1999 (TrEMBLrel. 10, Created) 
DT 01-MAY-1999 (TrEMBLrel. 10, Last sequence update) 
DT 01-OCT-2000 (TrEMBLrel. 15, Last annotation update) 
DE NEUROGENIC EXTRACELLULAR SLIT PROTEIN (FRAGMENT). 
GN SLIT2. 

OS Mus musculus (Mouse). 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus. 
OX NCBI_TaxID=10090; 
RN [1] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=99279238; PubMed=10349621; 

RA Holmes G.P., Negus R., Burridge L., Raman S., Algar E., Yamada T., 
RA Little M.H.; 

RT "Distinct but overlapping expression patterns of two vertebrate slit 
RT homologs implies functional roles in CNS development and 
RT organogenesis."; 

RL Mech. Dev. 79:57-72(1998). 
DR EMBL; AF074960; AAD04345.1; -. 
DR HSSP; P00743; 1CCF. 
DR MGD; MGI:1315205; Slit2. 
DR INTERPRO; IPR000152; -. 
DR INTERPRO; IPR000359; -. 
DR INTERPRO; IPR000372; -. 
DR INTERPRO; IPR000483; -. 
DR INTERPRO; IPR000561; -. 
DR INTERPRO; IPR000742; -. 
DR INTERPRO; IPR001010; -. 
DR INTERPRO; IPR001438; -. 
DR INTERPRO; IPR001611; -. 
DR INTERPRO; IPR001791; -. 
DR INTERPRO; IPR001881; -. 
DR INTERPRO; IPR002049; -. 
DR PFAM; PF00008; EGF; 9. 
DR PFAM; PF00054; laminin_G; 1. 
DR PFAM; PF00560; LRR; 7. 
DR PFAM; PF01462; LRRNT; 2. 
DR PFAM; PF01463; LRRCT; 2. 
DR PRINTS; PR00010; EGFBLOOD. 
DR PRINTS; PR00011; EGFLAMININ. 
DR PRINTS; PR00019; LEURICHRPT. 
DR PRINTS; PR00287; THIONIN. 

DR PROSITE; PS00010; ASXJYDROXYL; UNKNOWNJ. 

DR PROSITE; PS00022; EGPJL; UMNOWNJ. 

DR PROSITE; PS01185; CTCK_1; UNKNOWN_1. 
DR PROSITE; PS01186; EGF_2; 7. 
DR PROSITE; PS01187; EGF_CA; 2. 
DR PROSITE; PS01225; CTCK_2; 1. 
KW Glycoprotein; EGF-like domain. 
FT NON_TER 1 1 
SQ SEQUENCE 1025 AA; 112974 MW; 46CD0D5B7246FC72 CRC64; 



Query Match 66.5%; Score 5530; DB 11; Length 1025; 

Best Local Similarity 95.9%; Pred. No. 0; 



[The Matches and Gaps counts and the alignment for RESULT 11 are illegible in the scanned source. The surviving row labels run from Qy 561 / Db 61 to Qy 1521 / Db 1021 in steps of 60. One sequence row survives, ending at position 740:] 

VTGNPRCQKPYFLKEIPIQDVAIQDFTCDDGNDDNSCSPLSRCPTECTCLDTWRCSNRG 740 



RESULT 12 
Q9WUG5 

ID Q9WUG5 PRELIMINARY; PRT; 1530 AA. 

AC Q9WUG5; 

DT 01-NOV-1999 (TrEMBLrel. 12, Created) 

DT 01-NOV-1999 (TrEMBLrel. 12, Last sequence update) 

DT 01-OCT-2000 (TrEMBLrel. 15, Last annotation update) 

DE SLIT2. 

OS Rattus norvegicus (Rat). 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Rattus. 

OX NCBI_TaxID=10116; 
RN [1] 
RP SEQUENCE FROM N.A. 
RX MEDLINE=99200391; PubMed=10102268; 
RA Brose K., Bland K.S., Wang K.H., Arnott D., Henzel W., Goodman C.S., 
RA Tessier-Lavigne M., Kidd T.; 
RT "Slit proteins bind Robo receptors and have an evolutionary 
RT conserved role in repulsive axon guidance."; 
RL Cell 96:795-806(1999). 
DR EMBL; AP133730; AAD25540.1; -. 
DR HSSP; P00743; 1CCF. 
DR INTERPRO; IPR000152; -. 
DR INTERPRO; IPR000359; -. 
DR INTERPRO; IPR000372; -. 
DR INTERPRO; IPR000483; -. 
DR INTERPRO; IPR000561; -. 
DR INTERPRO; IPR000742; -. 
DR INTERPRO; IPR001438; -. 
DR INTERPRO; IPR001611; -. 
DR INTERPRO; IPR001791; -. 
DR INTERPRO; IPR001881; -. 
DR INTERPRO; IPR002049; -. 
DR PFAM; PF00008; EGF; 9. 
DR PFAM; PF00054; laminin_G; 1. 
DR PFAM; PF00560; LRR; 18. 
DR PFAM; PF01462; LRRNT; 3. 
DR PFAM; PF01463; LRRCT; 4. 
DR PRINTS; PR00010; EGFBLOOD. 
DR PRINTS; PR00011; EGFLAMININ. 
DR PRINTS; PR00019; LEURICHRPT. 

DR PROSITE; PS00010; ASXJYDROXYL; DNKNOWNJ. 

DR PROSITE; PS00022; EGF_1; ONKNOWNJ. 

DR PROSITE; PS01185; CTCK_1; UNKNOWN.!, 
DR PROSITE; PS01186; EGF_2; 8. 
DR PROSITE; PS01187; EGF_CA; 2. 
DR PROSITE; PS01225; CTCK_2; 1. 

KW Glycoprotein; EGF-like domain. 

SQ SEQUENCE 1530 AA; 167385 MW; 622A510E9ACC9B5F CRC64; 



Query Match 66.1%; Score 5497; DB 11; Length 1530; 

Best Local Similarity 64.4%; Pred. No. 0; 

Matches 978; Conservative 230; Mismatches 291; Indels 20; Gaps 7; 
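
The summary percentages printed for each result appear to follow directly from the counts on these lines. The sketch below is an illustration, not GenCore's own code: it assumes Best Local Similarity is Matches divided by the total number of alignment columns (Matches + Conservative + Mismatches + Indels), and that Query Match is Score divided by the query's perfect score. The function names are hypothetical, and the perfect score of 8316 is an assumption taken from the top-ranked self-hit in this run's summary table.

```python
def best_local_similarity_pct(matches, conservative, mismatches, indels):
    """Percent of alignment columns that are exact matches (assumed definition)."""
    columns = matches + conservative + mismatches + indels
    return 100.0 * matches / columns


def query_match_pct(score, perfect_score):
    """Alignment score as a percentage of the query's self-alignment score (assumed definition)."""
    return 100.0 * score / perfect_score


# Counts as printed for RESULT 12 (Q9WUG5); 8316 is the assumed perfect score.
similarity = round(best_local_similarity_pct(978, 230, 291, 20), 1)
query_match = round(query_match_pct(5497, 8316), 1)
print(similarity, query_match)
```

Under these assumptions the same two formulas also reproduce the figures reported for RESULT 13 (740 matches over 767 columns) and RESULT 14 (677 matches over 1508 columns).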

Qy 15 LVLAILNKVAPQACPAQCSCSGSTVDCHGLALRSVPRNIPRNTERLDLNGNNITRITKTD 74 

I: I - Mil 1 : 1 : 1 : 1 1 1 1 1 1 I : : : 1 : 1 1 1 1 1 1 1 1 1 : ! 1 1 1 1 1 1 I I I 
Db 21 LLWAAAWRLGATACPALCTCTGTTVDCHGTGLQAIPKNIPRNTERLELNGNNITWIHKND 80 

Qy 75 FAGLRHLRVLQLMENRISTIERGAFQDLRELERLRLNRNHIflLFPELLFLGTARLYRLDL 134 

Mil: Mlllllll I :l III hllll :llll II: Mill I Mil 
Db 81 FAGLRQLRVLQLMENPIGAVEPGAFDDMELEPFQLNRNQIfiMLPELLFQNNQALSRLDL 140 

Qy 135 SENQIQAIPRKAFRGAVDIKNLQLDYNQISCIEDGAFRALRDLEVLTLNNNNITRLSVAS 194 
_ III :||:||llllll hllllll lllllll.-lllllll llllllllllll : |:| 

Db 141 SENSLQAVPRKAFRGATDLKNLQLDKNQISCIEEGAFRALRGLEVLTLNNNNITTIPVSS 200 

Qy 195 FNHMPKLRTFRLHSNNLYCDCHLAWLSDKLRKRPRVGLYTQCMGPSHLRGHNVAEVQKRE 254 

llllllllllllllhhlllllllll llhll Hhlll lh III Mill I 
Db 201 FNHMPKLRTFRLHSNHLFCDCHLAWLSQWLRQRPTIGLFTQCSGPASLRGLNVAEVQKSE 260 

Qy 255 FVCSDEEEGHQSFMAPSCSVL-HCPAACTCSNNIVDCRGKGLTEIPTNLPETITEIRLE 312 

111:1 I hi:: III hill llllllllll II llllhllilll 
Db 261 FSCSGQGEAAQ- ■ -VPACTLSSGSCPAMCSCSNGIVDCRGRGLTAIPANLPETMTEIRLE 317 

Qy 313 QNTIRVIPPGAFSPYRRLRRIDLSNNQISELAPDAFQGLRSLNSLVLYGNRITELPKSLF 372 

I II lllllllll:|lllllllllll:|:llllllllllllllllllllll:ll: :l 
Db 318 LNGIRSIPPGAFSPYRRLRRIDLSNNQIAEIAPDAFQGLRSLNSIVLYGNRITDLPRGVF 377 

Qy 373 EGLFSLQLLLLNANKINCLRVDAFQDLHNLNLLSLYDNKLQTIAKGTFSPLRAIQTMHLA 432 
l|::||IHIIIIIII|:| llllll 1 1 : 1 1 1 1 1 1 1 1 : 1 :: 1 1 1 1 1 : lllllhlll 
Db 378 GGLYTLQLLLLNANKINCIRPDAFQDLQNLSLLSLYDNKIQSLAKGTFTSLRAIQTLHLA 437 

Qy 433 QNPFICDCHLKWLADYLHTNPIETSGARCTSPRRLANKRIGQIKSKKFRCS G 484 

. 1 1 1 1 1 1 1 1 : 1 1 1 1 1 1 : 1 IIIIIIMIII 1 1 1 1 1 1 1 1 II I ! 1 1 1 1 M 1 1 1 I 
Db 438 QNPFICDCNLKWLADFLRTNPIETTGARCASPRRLANKRIGQIKSKKFRCSAKEQYFIPG 497 

Qy 485 TEDYRSKLSGDCFADLACPEKCRCEGTTVDCSNQKLNKIPEHIPQYTAELRLNNNEFTVL 544 

Mil I: :| :|:NI Mill : Ml hill III I hllllll ::| 
Db 498 TEDYH--INSECTSDVACPHRCRCEASVVECSGLRLSRIPERIPQSTTELRLNNNEISIL 555 



Qy 545 EATGIFRKLPQLRKINFSNNKITDIEEGAFEGASGVNEILLTSNRLENVQHRMFKGLESL 604 

lllhlll! hill lll:::|hl III): hh ll:|:||:|: ||:||: I 
Db 556 EATGLFRRLSHLRRINLSNNRVSEIEDGTFEGAISVSELHLTANQLESVRSGMFRGLDGL 615 

Qy 605 KTLMLRSNRITCVGNDSFIGLSSVRLLSLYDNQITTVAPGAFDTLHSLSTLNLLANPFNC 664 

: 1 1 1 1 : 1 1 1 : 1 : hi I hlllllll ll|::||||||| : 1 1 1 1 1 1 1 1 1 1 1 1 1 
Db 616 WSLMLRNNRISCIHHDSFTGLRSVRLLSLYDNHITTISPGAFDTLQALSTLNLLANPFNC 675 

Qy 665 NCYLAWLGEWLRKKRIVTGNPRCQRPYFLKEIPIQDVAIQDFTCDDGNDDNSCSPLSRCP 724 

II lllhlllhhllllhl | i I : : 1 1 : 1 1 1 1 I hh :: I I hi 
Db 676 NCQLAWLGDWLRRRRIVTGNPRCQNPDFLRQIPLQDVAFPDFRCEEGQEEVGCLPRPQCP 735 

Qy 725 TECTCLDTWRCSNKGLKVLPKGIPRDVTELYLDGNQFTLVPKELSNYKHLTLIDLSKNR 784 

II llllllllll! hh llhhlhlhhhlhl hi :|:| 1 : 1 1 1 1 1 : 
Db 736 QECACLDTWRCSNKHLQLL-KGIPKNVTELYLDGNQFTLVPGQLSTFKYLQLVDLSNNK 794 



Qy 785 ISTLSNQSFSNMTQLLTLILSYNRLRCIPPRTFDGLRSLRLLSLHGNDISVVPEGAFNDL 844 

11:111 1 1 : M : 1 1 llllll hllll I 1 1 : 1 1 1 1 1 1 1 1 1 1 1 : 1 : || | |: 
Db 795 ISSLSNSSFTNMSQLTTLILSYNALQCIPPLAFQGLRSLRLLSLHGNDVSTLQEGIFADV 854 

Qy 845 SALSHLAIGANPLYCDCNMQWLSDWVRSEYREPGIARCAGPGEMADRLLLTTPSRKFTCQ 904 

::|||||||||IIIIII:::IM III: llllllllllll II 1 1 1 M M : 1 1 1 II 
Db 855 TSLSHLAIGANPLYCDCHLRWLSSWVRTGYREPGIARCAGPPEMEGRLLLTTPARRFECQ 914 

Qy 905 GPVDVNIURCNPCLSNPCKNDGTCNSDPTOFYRCTCPYGFRGQDCDVPIHACISNPCRH 964 

II : : llhllllllh 1 11 : : 1 1 : : llllll |:||::|:| : :| Nil : 



Db 915 GPPSLAVQARCDPCLSNPCQNQGTCHNDPLEVYRCTCPSGYRGRNCEVSLDSCSSNPCGN 974 

Qy 965 GGTCHLKEGEEDGFWCICADGFEGENCEVNVDDCEDNDCENNSTCVDGINNYTCLCPPEY 1024 

Mill hlh II I I INI I :| III hi I Hill HH || h 
Db 975 GGTCHAQEGEDAGFTCSCPSGFEGLTCGMNTDDCVRHDCVNGGVCVDGIGNYTCQCPLQY 1034 

Qy 1025 TGELCEERLDFCAQDLNPCQHDSRCILTPRGFRCDCTPGYVGEHCDIDFDDCQDNRCRNG 1084 

II II: h'h |||||||;::|: ||:| :|:| III |::| : ll|:|::|:|l 
Db 1035 TGRACEQLVDFCSPDLNPCQHEAQCVGTPEGPRCECVPGYTGDNCSRNQDDCKDHQCQNG 1094 

Qy 1085 AHCTDAVNGYTCICPEGYSGLFCEFSPPMVLPRTSPCDNFDCQNGAQCIVRINEPICQCL 1144 

I I I :| hlh hh II I II I I: hllll |: : : hllll 
Db 1095 AQCVDEINSYACLCAEGYSGQLCEIPP---APRNS-CEGTECQNGANCVDQGSRPVCQCL 1150 

Qy 1145 PGYQGERCERLVSVNFINRESYLQIPSARVRPQTNITLQIATDEDSGILLYRGDRDHIAV 1204 

II: I :|ill:||||:::::IH : |: |||||::| 1 1 : 1 1 1 1 1 II Mill 
Db 1151 PGFGGPECERLLSVNFVDRDTYLQFTDLQNWPRANITLQVSTAEDNGILLYNGDNDHIAV 1210 

Qy 1205 ELYRGRVRASYDTGSHPASAIYSVETINDGNFHIVELLALDQSLSLSVDGGNPRIITNLS 1264 

111:1 II III Ihhlllll llllll II hh I ::| :|||:| : I 
Db 1211 ELYQGHVRVSYDPGSYPSSAIYSAETINDGQFHTVRLVTFDQMVNLFIDGGSPMTMDNFG 1270 

Qy 1265 RQSTLNFDSPLYVGGMPGRSNVASLRQAPGQNGTSFHGCIRNLYINSELQDFQRVPMQTG 1324 

I III : h hlh I |: I llllllllllllllhlllll I I: I 
Db 1271 RHYTLNSEGPPSVGGMPVDVNSAAFRLWQILNGTSFHGCIRNLYINNELQDFTRTQMRPG 1330 

Qy 1325 ILPGCEPCHRRVCAHGTCQPSSQAGFTCECQEGWMGPLCDQRTNDPCLGNRCVHGTCLPI 1384 

-llllll I I II llh: I I h II III : II Ihllll |:|: 
Db 1331 WPGCEPCRRLYCLHGICQPNATPGPVCHCEAGWGGLHCDQPVDGPCHGHRCVHGRCVPL 1390 

Qy 1385 NAFSYSCRCLL'GHGGVLCDEEEDLFNPCQAIRCRHGRCRLSGLGQPYCECSSGYTGDSCD 1444 

:| : 1 1 1 : 1 !|: I II:: : II ::| II I: I :| II |::|: I: 
Db 1391 DALAYSCQCQDGYSGALCNQVGAVAEPCGGLQCLHGHCQASVTRGAHCVCSPGFSGELCE 1450 

Qy 1445 REISCRGERIRDYYQRQQGYAACQTTRRVSRLECRGGCAGGQCCGPLRSRRRRYSFECTD 1504 

:| III: :lh:: Mil hlh :| hill llllll Ml MM 
Db 1451 QESECRGDPVRDFHRVQRGYAICQTTRPLSWVECRGACPGQGCCQGLRLRRRRLTFECSD 1510 

Qy 1505 GSSFVDEVEKWRCGCTRC 1523 

hh hill Mil I 
Db 1511 GTSFAEEVEKPTKCGCAPC 1529 

RESULT 13 
Q9WVC1 

ID Q9WVC1 PRELIMINARY; PRT; 796 AA. 

AC Q9WVC1; 
DT 01-NOV-1999 (TrEMBLrel. 12, Created) 
DT 01-MAY-2000 (TrEMBLrel. 13, Last sequence update) 
DT 01-JUN-2000 (TrEMBLrel. 14, Last annotation update) 
DE SLIT-2. 
OS Rattus norvegicus (Rat). 
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Rattus. 
OX NCBI_TaxID=10116; 
RN [1] 
RP SEQUENCE FROM N.A. 
RC STRAIN=SPRAGUE DAWLEY; 
RX MEDLINE=99292758; PubMed=10364234; 
RA Liang Y., Annan R.S., Carr S.A., Popp S., Mevissen M., Margolis R.K., 
RA Margolis R.U.; 
RT "Mammalian homologues of the Drosophila slit protein are ligands of 
RT the heparan sulfate proteoglycan glypican-1 in brain."; 

RL J. Biol. Chem. 274:17885-17892(1999). 
DR EMBL; AF141386; AAD38940.2; -. 
DR INTERPRO; IPR000372; -. 
DR INTERPRO; IPR000483; -. 
DR INTERPRO; IPR001611; -. 
DR INTERPRO; IPR002272; -. 
DR PFAM; PF00560; LRR; 14. 
DR PFAM; PF01462; LRRNT; 4. 
DR PFAM; PF01463; LRRCT; 3. 
DR PRINTS; PR00019; LEURICHRPT. 
DR PRINTS; PR01143; FSHRECEPTOR. 
SQ SEQUENCE 796 AA; 89542 MW; 0F50806FC0345005 CRC64; 



Query Match 47.3%; Score 3934; DB 11; Length 796; 

Best Local Similarity 96.5%; Pred. No. 0; 

Matches 740; Conservative 15; Mismatches 12; Indels 0; Gaps 0; 

1 MRGVGWQMLSLSLGLVLAILNKVAPQACPAQCSCSGSTVDCHGLALRSVPRNIPRNTERIi 60 

:IM Mllll llhlllllll lllllllllllllllllllll IIIIMIIIIII 
1 MSGIGWQTLSLSIiALVLSILNKVAPHACPAQCSCSGSTVDCHGLALRIVPRNIPRNTERL 60 



Qy 



DLNGNNITRITKTDFAGLRHLRVLQLMENKISTIERGAFQDLKELERLRLNRNHLQLFPE 120 

lllllllllllljlllllllhllllllllllllllll llllllllllllhlllll! 

DLNGNNITRITKTDFAGLRHLRILQLMENKISTIERGAFHDLKELERLRLNRNNLQLFPE 120 



LLFLGTAKLYRLDlSENQIQAIPRKAFRGAVDIKNLQLDYNQISCIEDGAFRALRDLEVL 180 
nilllltlllll IIIIIIIIIIIIIIIIMNIIMIIIIIIIMIIIIIIIIIIIII 
LLFLGTAKLYRLDLSENQIQAIPRKAFRGAVDIKNLQLDYNQISCIEDGAFRALRDLEVL 180 

ILNNNNITRLSVASFNHMPKLRIFRLHSNNLYCDCHLAWLSDWLRRRPRVGLYTQCMGPS 24 0 

IIIIIIINIIIIIJIIIIIIIIIIIIIIIMIIIIIIIIIIIIhllllllllllllll 

TLNNNNITRLSVASFNHMPKLRTFRLHSNNLYCDCHLAWLSDWLRQRPRVGLYTQCMGPS 24 0 

HLRGHNVAEVQKREFVCSDEEEGHQSFMAPSCSVLHCPAACTCSNNIVDCRGKGLTEIPT 300 
llllllllllllllllllllllllllllllllllllll llllllllllllimilll 
HLRGHNVAWQRREFVCSDEEEGHQSFMAPSCSVLHCPIACTCSNNIVDCRGKGLTEIPT 300 

NLPET ITE IRLEQNT IKVI PPGAFSPYRRLRRIDLSNNQISELAPDAFQGLRSINSLVL J 360 

IIIIMIIIilll[:|:||||||IMNMII:||l!lllll!ll!ll!IIIIIIIIM 
NLPET ITEIRLEQNSIRVIPPGAFSPYRRLRRLDLSNNQISELAPDAFQGLRSLNSLVLY 360 

GNKITELPKSLFEGLFSLQLLLLNANKINCLRVDAFQDLHNLNLLSLYDNKIiQTIAKGTF 420 
!MIIIIIIIIIIII!III!I!IIM!IIIII!IIIIIII!I!IIIIII!I!II;|||| 
GNKITELPKSLFEGLFSLQLLLLNANKINCLRVDAFQDLHNLNLLSLYDNKLQTVAKGTF 420 

SPLRAIQTMHLAQNPFICDCHLKHLADYLHTNPIETSGARCTSPRRLANKRIGQIKSKKF 480 

i iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiinii 

SALRAIQTMHLAQNPFICDCHLKWLADYLHTNPIETSGARCTSPRRLANKRIGQIKSKKF 480 
RCSGTEDYRSKLSGDCFADLACPEKCRCEGTTVDCSNQKLNKIPEHIPQYTAELRLNNNE 54 0 

i!llllllll!ll!llllll!lllllll!lllllllllllllll:||ll!llll!lllll 

RCSGTEDYRSKLSGDCFADLACPEKCRCEGTTVDCSNQRLNKIPDHIPQYTAELRLNNNE 540 

FTVLEATGIFKKLPQLRKINFSNMnDIEEGAFEGASGVNEILLTSNRLENVQHKMFKG 600 
lllllllllimilllll! Illlllllllllllllllllllllllllllllllllll! 
FTVLEATGIFKRLPQLRKINLSNNKITDIEEGAFEGASGVNEILLTSNRLENVQHKMFKG 600 

I 

LESLKTLMLRSNRITCVGNDSFlGLSSVRLLSLYDNQITTVAPGAFDTLHSLSTLNLLAN 660 

!MI!!I!!IIII!:||||||| II IIIIIIIIIIIIIIIIIIM Mlllllllllll 

LESLKTLMLRSNRISCVGNDSFTGLGSVRLLSLYDNQITTVAPGAFGTLHSLSTLNLLAN 660 
PFNCNCYLAWLGEWLRKKRIVTGNPRCQKPYFLKEIPIQDVAIQDPTCDDGNDDNSCSPL 720 

lllllhllllllllhlllllllllllllllllllllllllllllllllllllllllll 

PFNCNCHLAWLGEWLRRKRIVTGNPRCQKPYFLKEIPIQDVAIQDFTCDDGNDDNSCSPL 720 



721 SRCPTECTCLDTWRCSNKGLKVLPRGIPRDVTELYLDGNQFTLVPR 767 

ll|:|||||||||||||||||||||||||||||||IMIIIIII|: 
721 SRCPSECTCLDTWRCSNKGLRVLPRGIPRDVTELYLDGNQFTLVPE 767 



RESULT 14 
Q9V7F9 

ID Q9V7F9 PRELIMINARY; PRT; 1504 AA. 
AC Q9V7F9; 

DT 01-MAY-2000 (TrEMBLrel. 13, Created) 

DT 01-MAY-2000 (TrEMBLrel. 13, Last sequence update) 
DT 01-OCT-2000 (TrEMBLrel. 15, Last annotation update) 
DE SLI PROTEIN. 
GN SLI. 
OS Drosophila melanogaster (Fruit fly). 
OC Eukaryota; Metazoa; Arthropoda; Tracheata; Hexapoda; Insecta; 
OC Pterygota; Neoptera; Endopterygota; Diptera; Brachycera; Muscomorpha; 
OC Ephydroidea; Drosophilidae; Drosophila. 
OX NCBI_TaxID=7227; 
RN [1] 
RP SEQUENCE FROM N.A. 
RC STRAIN=BERKELEY; 
RX MEDLINE=20196005; PubMed=10731132; 
RA Adams M.D., Celniker S.E., Holt R.A., Evans C.A., Gocayne J.D., 
RA Amanatides P.G., Scherer S.E., Li P.W., Hoskins R.A., Galle R.F., 
RA George R.A., Lewis S.E., Richards S., Ashburner M., Henderson S.N., 
RA Sutton G.G., Wortman J.R., Yandell M.D., Zhang Q., Chen L.X., 
RA Brandon R.C., Rogers Y.-H.C., Blazej R.G., Champe M., Pfeiffer B.D., 
RA Wan K.H., Doyle C., Baxter E.G., Helt G., Nelson C.R., Miklos G.L.G., 
RA Abril J.F., Agbayani A., An H.-J., Andrews-Pfannkoch C., Baldwin D., 
RA Ballew R.M., Basu A., Baxendale J., Bayraktaroglu L., Beasley E.M., 
RA Beeson K.Y., Benos P.V., Berman B.P., Bhandari D., Bolshakov S., 
RA Borkova D., Botchan M.R., Bouck J., Brokstein P., Brottier P., 
RA Burtis K.C., Busam D.A., Butler H., Cadieu E., Center A., Chandra I., 
RA Cherry J.M., Cawley S., Dahlke C., Davenport L.B., Davies P., 
RA de Pablos B., Delcher A., Deng Z., Mays A.D., Dew I., Dietz S.M., 
RA Dodson K., Doup L.E., Downes M., Dugan-Rocha S., Dunkov B.C., Dunn P., 
RA Durbin K.J., Evangelista C.C., Ferraz C., Ferriera S., Fleischmann W., 
RA Fosler C., Gabrielian A.E., Garg N.S., Gelbart W.M., Glasser K., 
RA Glodek A., Gong F., Gorrell J.H., Gu Z., Guan P., Harris M., 
RA Harris N.L., Harvey D., Heiman T.J., Hernandez J.R., Houck J., 
RA Hostin D., Houston K.A., Howland T.J., Wei M.-H., Ibegwam C., 
RA Jalali M., Kalush F., Karpen G.H., Ke Z., Kennison J.A., Ketchum K.A., 
RA Kimmel B.E., Kodira C.D., Kraft C., Kravitz S., Kulp D., Lai Z., 
RA Lasko P., Lei Y., Levitsky A.A., Li J., Li Z., Liang Y., Lin X., 
RA Liu X., Mattei B., McIntosh T.C., McLeod M.P., McPherson D., 
RA Merkulov G., Milshina N.V., Mobarry C., Morris J., Moshrefi A., 
RA Mount S.M., Moy M., Murphy B., Murphy L., Muzny D.M., Nelson D.L., 
RA Nelson D.R., Nelson K.A., Nixon K., Nusskern D.R., Pacleb J.M., 
RA Palazzolo M., Pittman G.S., Pan S., Pollard J., Puri V., Reese M.G., 
RA Reinert K., Remington K., Saunders R.D.C., Scheeler F., Shen H., 
RA Shue B.C., Siden-Kiamos I., Simpson M., Skupski M.P., Smith T., 
RA Spier E., Spradling A.C., Stapleton M., Strong R., Sun E., 
RA Svirskas R., Tector C., Turner R., Venter E., Wang A.H., Wang X., 
RA Wang Z.-Y., Wassarman D.A., Weinstock G.M., Weissenbach J., 
RA Williams S.M., Woodage T., Worley K.C., Wu D., Yang S., Yao Q.A., 
RA Ye J., Yeh R.-F., Zaveri J.S., Zhan M., Zhang G., Zhao Q., Zheng L., 
RA Zheng X.H., Zhong F.N., Zhong W., Zhou X., Zhu S., Zhu X., Smith K.C., 
RA Gibbs R.A., Myers E.W., Rubin G.M., Venter J.C.; 
RT "The genome sequence of Drosophila melanogaster."; 
RL Science 287:2185-2195(2000). 
DR EMBL; AE003809; AAF58097.1; -. 
DR HSSP; P00743; 1CCF. 
DR FLYBASE; FBgn0003425; sli. 
DR INTERPRO; IPR000152; -. 
DR INTERPRO; IPR000359; -. 
DR INTERPRO; IPR000372; -. 
DR INTERPRO; IPR000483; -. 
DR INTERPRO; IPR000561; -. 
DR INTERPRO; IPR000742; -. 
DR INTERPRO; IPR001611; -. 
DR INTERPRO; IPR001791; -. 
DR INTERPRO; IPR001881; -. 
DR INTERPRO; IPR002049; -. 
DR PFAM; PF00007; Cys_knot; 1. 
DR PFAM; PF00008; EGF; 7. 
DR PFAM; PF00054; laminin_G; 1. 
DR PFAM; PF00560; LRR; 17. 
DR PFAM; PF01462; LRRNT; 4. 
DR PFAM; PF01463; LRRCT; 4. 
DR PRINTS; PR00011; EGFLAMININ. 
DR PRINTS; PR00019; LEURICHRPT. 
DR PROSITE; PS00010; ASX_HYDROXYL; 3. 
DR PROSITE; PS00022; EGF_1; 7. 
DR PROSITE; PS01185; CTCK_1; 1. 
DR PROSITE; PS01186; EGF_2; 5. 
DR PROSITE; PS01187; EGF_CA; 2. 
DR PROSITE; PS01225; CTCK_2; 1. 
SQ SEQUENCE 1504 AA; 168597 MW; 836A3F5022BF234F CRC64; 

Query Match 43.1%; Score 3588; DB 5; Length 1504; 

Best Local Similarity 44.9%; Pred. No. 2.6e-277; 

Matches 677; Conservative 273; Mismatches 466; Indels 92; Gaps 21; 

Qy 28 CPAQCSCSGSTVDCHGLALRSVPRNIPRKTERLDLNGNNITRITKTDFAGLRHLRVLQLM 87 

, II MM III I MM | ; M;l MM I MM I MM 
Db 73 CPRVCSCTGLNVDCSHRGLTSVPRKISADVERLELQGNNLTVIYETDPQRLTKLRMLQLT 132 

Qy 88 ENKISTIERGAFQDLKELERLRLNRNHLQLFPELLFLGTAKLYRLDLSENQIQAIPRKAF 147 

MM Nil MM lllllll I I: II M I MMI II : |: I 
Db 133 DNQIHTIERNSFQDLVSLERLRLNNNRLKAIPENFOTSSASLLRLDISNNVITTVGRRVF 192 

Qy 148 RGAVDIKNLQLDlfNQISCIEDGAFRALRDLEVLTLNNNNIIRLSVASFlffiMPKLRTFRLB 207 

Mi :::|||| MM::: ||: | :MMMMMM I : M! II 

Db 193 RGAQSLRSLQLDNNQITCLDEBAFKGLVELEILTLHNNNLTSLPHNIFGGLGRLRALRLS 252 

Qy 208 SNNLYCDCHLAWLSDWLRRRPRVGLYTQCMGPSHLRGHNVAEYQKREFVCSDEEEGHQSF 267 

I llllhlll Ml I: MM II 1:1 III:: M II I 
Db 253 DNPFACDCHLSWLSRFLRSATRLAPYTRCQSPSQLKGQNVADLHDQEFKCSGLTE 307 

Qy 268 MAP-SCSVLH-CPAACTCSNNIVDCRGKGLTEIPTNLPETITEIRLEQNTIKVIPPGAFS 325 
I II I : II I I:: Mill I II M ||: MMMM I Ml :|| 

Db 308 HAPMECGAENSCPHPCRCADGIVDCRERSLTSVPVTLPDDTTELRLEQNFITELPPKSFS 367 

Qy 326 PYKKLRRIDLSNNQISELAPDAFQGLRSLNSLVLYGNKITELPKSLFEGLFSLQLLLLNA 385 

::MMMMM I :| II ||: I : M 1 1 1 E 1 1 Ml MM 1 1 1 1 M 1 1 1 
Db 368 SFRRLRRIDLSNNNISRIAHDALSGLRQLTTLVLYGNKIKDLPSGVFKGLGSLQLLLLNA 427 

Qy 386 NRINCLRVWQDLHNLNLLSLYDNKLQTIAKGTFSPLRAIQTMHIAQNPFICDCHLKWL 445 

IMMM MMIIMMIMII :|::| III :MMM MMI I II IMMM 
Db 428 NEISCIRKDAFRDLHSLSLLSLYDNNIQSLANGTFDAMKSIKTVHIAKNPFICDCNLRWL 487 

Qy 446 ADYLHTNPIETSGARCTSPRRLANKRIGQIKSKRFRCSGTEDYRSKLSGDCFADLACPEK 505 

Mill llllllllll MM: ;|| :: MM :: I !MIM I I 
Db 488 ADYLHRNPIETSGARCESPRRMHRRRIESLREEKFKCS-WDELRMKLSGECRMDSDCPAM 546 

Qy 506 CRCEGTTVDCSNQRLNKIPEHIPQYTAELRLNNNEFTVLEATGIFKRLPQLRRINFSNNR 565 

I MMMM : I Ml II M || MMI : : |:| Ml I M M 
Db 547 CHCEGTTVDCTGRGLREIPRDIPLHTTELLLNDNELGRISSDGLFGRLPHLVRLELRRNQ 606 

Qy 566 ITDIEEGAFEGASGVNEILLTSNRLENVQHKMFKGLESLKTLMLRSNRITCVGNDSFIGL 625 

M || Mill : I: I |::: : Mil II III 
Db 607 LTGIEPNAFEGASHIQELQLGENRIKEISNKMFLGLHQLKT 647 

Qy 626 SSVRIiLSLYDNQITTVAPGAFDTLHSLSTLNLLANPFNCNCYLAWLGEWLRKRRIVTGNP 685 

MMMM I MM: IMMM MIIIMMM Mill : I 
Db 648 LNLYDNQISCVMPGSFEHLNSLTSLNLASNPFNCNCHLAWFAEWLRKRSLNGGAA 702 

Qy 686 RCQRPYFLKEIPIQDVAIQDFTCDDGNDDNSCSPLSRCPTECTCLDTWRCSNKGLKVLP 745 
II I :::: MM Ml I : I III Mill || M 

Db 703 RCGAPSRVRDVQIKDLPHSEFKCSSENSE-GCLGDGYCPPSCTCTGTWRCSRNQLREIP 761 

Qy 746 RGIPRDVTELYLDGNQFTLVPRE-LSNYKHLTLIDLSNNRISTLSNQSFSNMTQLLTLIL 804 

MM : Mill: I: : I : : Ml MIIIM III MMIM III: 
Db 762 RGIPAETSELYLESNEIEQIHYERIRHLRSLTRLDLSNNQITILSNYTFANLTRLSTLII 821 

Qy 805 SYNRLRCIPPRTFDGLRSLRLLSLHGNDISWPEGAFNDLSALSHLAIGANPLYCDCNMQ 864 

IMMM: || MMMM MMMM II :|:|:|:|:||||||| :: 
Db 822 SYNKLQCLQRHALSGLNNLRVLSLHGNRISMLPEGSFEDLKSLTHIALGSNPLYCDCGLK 881 

Qy 865 WLSDWVRSEYREPGIARCAGPGEMADRLLLTTPSKRFTCQGPVDVNILAKCNPCLSNPCK 924 

I MM :| MMI I M MMMM I IM I MIMM I II: 

Db 882 WFSDWIRLDYVEPGIARCAEPEQMKDRLILSTPSSSFVCRGRVRNDILARCNACFEQPCQ 941 

Qy 925 NDGTCNSDPVDFYRCTCPYGFRGQDCDVPIHACISNPCRHGGTCHLKEGEEDGFWCICAD 984 

I I : I IM I IMM: I I III:: IM II I I II 
Db 942 NQAQCVALPQREYQCLCQPGYHGKHCEFMIDACYGNPCRMATCTVL-EEGRFSCQCAP 999 

Qy 985 GFEGENCEVNVDDC-EDNDCENNSTCVDGINKYTCLCPPEYTGELCEERLDFCAQDLNPC 1043 

I: I II hill : MIMIMM Mill ::|| |: |: M : III 
Db 1000 GYTGARCETNIDDCLGEIKCQNNATCIDGVESYRCECQPGFSGEFCDTKIQFCSPEFNPC 1059 



Qy 1044 QHDSRCILTPKGFRCDCTPGYVGEHCDIDFDDCQDNKCKNGAHCTDAVNGYTCICPEGYS 1103 

: MM . : III MM : MM:: M II M M IM M 
Db 1060 MGARCMDHFTHYSCDCQAGFHGTNCTDNIDDCQNHMCQNGGTCVDGINDYQCRCPDDYT 1119 

Qy 1104 GLFCEFSP- "PMVLPRTSPCDNFDCQNGA- -QCIVRINEPICQCLPGYQGERCEKLVSVN 1159 

I :ll ;l: IMM I :|::| I : :: :|:| III I: II I |:: 

Db 1120 GRYCEGHNMLSMMYPQTSPCQNHECRHGVCFQPNAQGSDYLCRCHPGYTGRWCEYLTSIS 1179 

Qy 1160 FINKESYLQIPSAKVRPQTNITLQIATDEDSGILLYRGDKDHIAVELYRGRVRASYDTGS 1219 

I:: I::::' : II: IM :: I :|||:| I Mill: ll:| III I: 
Db 1180 FVHNNSFVELEPLRTRPEANVTIVFSSAEQNGILMYDGQDAHLAVELFNGRIRVSYDVGN 1239 

Qy 1220 HPASAIYSVETINDGNFHIVELLALDQSLSLSVDGGNPRIITNLSKQSTLNFDSPLYVGG 1279 

II I M I:: || M Mil: :: :| II I : I I I :|:::|| 

Db 1240 HPVSTMYSFEMVADGRYHAVELLAIRKNFTLRVDRGLARSIINEGSNDYLRLTTPMFLGG 1299 

Qy 1280 MPGKSNVASLRQAPGQNGTSFHGCIRNLYINSELQDFQKVPMQTGILPGCEPCHKKVCAH 1339 

M : ! M III II:: ::|l Mil II III 
Db 1300 LPVDPAQQAYKNWQIRNLTSFKGCMREVWINHRLVDFGNAQRQQRITPGC ALLE 1353 

Qy 1340 GTCQPSSQAGFTCECQEGWMG--PLCDQRTNDPCLGNKCVHGT-CLP-INA-FSYSCRCL 1394 

II ■::::! I : III III M M II I III 
Db 1354 GEQQEEE- DDEQDFMDETPHIREEPVDPCLENRCRRGSRCVPNSNARDGYQCRCR 1407 



Qy 1395 EGHGGVLCDEEEDLFNPCQAIRCRHGRCRLSGLGQPYCECSSGYTGDSCDREISCRGERI 1454 

I I MM I | :| Ml |:: 

Db 1408 HGQRGRYCDQGEGSTEP PTVTAAS TCRKEQV 1438 

Qy 1455 RDYYQRQQGYAACQTTRKVSRLECRGGCAGGQCCGPLRSRRRRYSFECTDGSSFVDEVEK 1514 

1:11 : M:: : : :l III I III Mil |:: :: 
Db 1439 REYYTEND-— CRSRQPLKYAKCVGGC-GNQCCAARIVRRRKVRMVCSNNRRYIRNLDI 1493 

Qy 1515 WKCGCIR 1522 

I MM: 
Db 1494 VRKCGCTK 1501 



RESULT 15 
Q9XYV4 

ID Q9XYV4 PRELIMINARY; PRT; 1504 AA. 
AC Q9XYV4; 
DT 01-NOV-1999 (TrEMBLrel. 12, Created) 
DT 01-NOV-1999 (TrEMBLrel. 12, Last sequence update) 
DT 01-OCT-2000 (TrEMBLrel. 15, Last annotation update) 
DE SLIT PROTEIN. 
GN SLIT. 
OS Drosophila melanogaster (Fruit fly). 
OC Eukaryota; Metazoa; Arthropoda; Tracheata; Hexapoda; Insecta; 
OC Pterygota; Neoptera; Endopterygota; Diptera; Brachycera; Muscomorpha; 
OC Ephydroidea; Drosophilidae; Drosophila. 
OX NCBI_TaxID=7227; 
RN [1] 
RP SEQUENCE FROM N.A. 
RX MEDLINE=99200390; PubMed=10102267; 
RA Kidd T., Bland K.S., Goodman C.S.; 
RT "Slit is the midline repellent for the robo receptor in Drosophila."; 
RL Cell 96:785-794(1999). 
DR EMBL; AF126540; AAD26567.1; -. 
DR HSSP; P00740; 1EDM. 
DR INTERPRO; IPR000152; -. 
DR INTERPRO; IPR000359; -. 
DR INTERPRO; IPR000372; -. 
DR INTERPRO; IPR000483; -. 
DR INTERPRO; IPR000561; -. 
DR INTERPRO; IPR000742; -. 
DR INTERPRO; IPR001611; -. 
DR INTERPRO; IPR001791; -. 
DR INTERPRO; IPR001881; -. 
DR INTERPRO; IPR002049; -. 
DR PFAM; PF00007; Cys_knot; 1. 
DR PFAM; PF00008; EGF; 7. 
DR PFAM; PF00054; laminin_G; 1. 
DR PFAM; PF01462; LRRNT; 4. 
DR PFAM; PF01463; LRRCT; 4. 
DR PRINTS; PR00011; EGFLAMININ. 
DR PRINTS; PR00019; LEURICHRPT. 

DR PROSITE; PSQ0010; ASXJYDROXYL; 0NKKOWK J . 

DR PROSITE; PS00022; EGF_1; CJNKNOWN_7 . 

DR PROSITE; PS01185; CTCK_1; 1. 
DR PROSITE; PS01186; EGF_2; 5. 
DR PROSITE; PS01187; EGF_CA; 2. 
DR PROSITE; PS01225; CTCK_2; 1. 
KW Glycoprotein; EGF-like domain. 
SQ SEQUENCE 1504 AA; 168569 MW; A377D3BAACB1C743 CRC64; 

Query Match 43.1%; Score 3586; DB 5; Length 1504; 

Best Local Similarity 44.8%; Pred. No. 3.8e-277; 

Matches 676; Conservative 274; Mismatches 466; Indels 92; Gaps 

Qy 28 CPAQCSCSGSTVDCHGLALRSVPRNIPRNTERLDLNGNNITRITKTDFAGLRHLRVIQLM 87 
II II: III I llll I : llhl llhl I MM I MMM 
Db 73 CPRVCSCTGLNVDCSHRGLTSVPRKISADVERLELQGNNLTVIYETDFQRLTKLRMLQLT 132 

Qy 88 ENKISTIERGAFQDLKELERLRiNRNHLQLFPELLFLGTAKLYRLDLSENQIQAIPRRAF 147 

:hl MM :|||| Mill I I |: II :| I llhl II : |: I 
Db 133 DNQIHTIERNSFQDLVSLERLRMNNRLKAIPENFVTSSASLLRLDISNNVITTVGRRVF 192 

Qy 148 RGAVT)IKNLQLDYNQISCIEDGWmRDLEVLTLNNNNITRLSVASFNHMPKLRTFRLB 207 

:N :::IMI llhl::: h I : 1 1 : 1 1 1 1 1 1 1 : 1 I I : MI II 
Db 193 KGAQSLRSLQLDNNQITCLDEHAFKGLVELEILTLNNNNLTSLPHNIFGGLGRLRALRLS 252 

Qy 208 SNNLYCDCHLAWLSDWLRKRPRVGLYTQCMGPSHLRGHNVAEVQKREFVCSDEEEGHQSF 267 

I IMhlll :M lj II: I |;| III:: Ml II I 
Db 253 DNPFACDCHLSWLSRFLRSATRMPYTRCQSPSQLKGQNVADLHDQEFKCSGLTE 307 

Qy 268 MAP-SCSVLH-CPMCTCSNNI7DCRGKGLTEIPTNLPETITEIRLEQNTIKVIPPGAFS 325 
III : II I I:: llll I II :l II: hMIMI I HI :|| 
308 HAPMECGAENSCPHPCRCADGITpCREKSLTSVPVTLPDDTTDVRLEQNFITELPPKSFS 367 



326 pykklrridlsnnqiselapd; 

:::||IHIIII II :l II 
368 SFRRLRRIDLSNNNISRU 



QGLRSLNSLVLYGNKITELPKSLFEGLFSLQLLLLNA 385 

II: I :llllllll :H :hll lllllllll 
iSGLKQLTTLVLYGNKIKDLPSGVFKGLGSLQLLLLNA 427 



386 NKINCLRVDAF^LHNLNLLSLYDNKLQTIAKGTFSPLRAIQTMHLAQNPFICDCHLKWL 445 

1:1:1:1 1 1 1 : 1 1 1 : 1 : 1 1 1 1 1 1 1 :|::| III : : : 1 : 1 : 1 1 1 : 1 1 1 ! 1 1 1 : 1 : 1 1 
428 NEISCIRKDAFRDLHSLSLLSLtDNNIQSLANGTFDAMKSIKTVHLAKNPFICDCNLRWL 487 
I 

446 ADYLHTNPIETSGARCTSPRRIANKRIGQIKSKKFRCSGTEDYRSKLSGDCFADLACPEK 505 

Mill llllllllll 11:1: :ll :: Mhll :: I IMIM I II 
488 ADYLHKNPIETSGARCESPKRMHRRRIESLREEKFKCS-WDELRMKLSGECRMDSDCPAM 546 

506 CRCEGTTVDCSNQRLNKIPEHIPQYTAELRLNNNEFTVLEATGIFKKLPQLRKINFSNNK 565 

I llllllll: : I Ml lj :l II MMI : : hi HI I h 
547 CHCEGTTVDCTGRGLKEIPRDIPLHTTELLLNDNELGRISSDGLFGRLPHLVKLELKRNQ 606 



Qy 865 WLSDWVKSEYKEPGIARCAGPGEMADKLLLTTPSKKFTCQGPVDVNIIAKCNPCLSNPCK 924 

I llhl llllllll I I 111:1 = 111 I hi I :||llll I lh 
Db 882 WFSDWIKLDY,VEPGIARCAEPEQMKDKLILSTPSSSFVCRGRVRNDILAKCNACFEQPCQ 941 

Qy 925 NDGTCNSDPVDFYRCTCPYGFKGQDCDVPIHACISNPCKHGGTCHLKEGEEDGFWCICAD 984 

I I : I ' hi I h h h I II III:: II : II I I II 
Db 942 NQAQCVALPQREYQCLCQPGYHGKHCEFMIDACYGNPCRNNATCTVL--EEGRFSCQCAP 999 

Qy 985 GFEGENCEVN7DDC-EDNDCENNSTCVDGINNYTCLCPPEYTGELCEEKLDFCAQDLNPC 1043 

I: I II hill : 1 : 1 1 : 1 1 : 1 1 : :| I I I ::|l h h lh : III 
Db 1000 GYTGARCETNIDDCLGEIKCQNNATCIDGVESYKCECQPGFSGEFCDTKIQFCSPEFNPC 1059 

Qy 1044 QHDSKCILTPKGFKCDCTPGYVGEHCDIDFDDCQDNKCKNGAHCTDAVNGYTCICPEGYS 1103 

: :|h .' : III |: I :| : llll:: hll I I :| I I lh h 
Db 1060 ANGAKCMDHFTHYSCDCQAGFHGTNCTDNIDDCQNHMCQNGGTCVDGINDYQCRCPDDYT 1119 

Qy 1104 GLFCEFSP-*>MVLPRTSPCDNFDCQNGA--QCIVRINEPICQCLPGYQGEKCEKLVSVN 1159 

I ill j: hllll I :h:l I : :: :|:| III h II I I:: 

Db 1120 GKYCEGHNMISMMYPQTSPCQNHECKHGVCFQPNAQGSDYLCRCHPGYTGKWCEYLTSIS 1179 

Qy 1160 FINKESYLQIPSAKVRPQTNITLQIATDEDSGILLYKGDKDHIAVELYRGRVRASYDTGS 1219 

I:: |::::. : lh hh :: I :|lhl I hlllh Ihl III h 
Db 1180 FVHNNSFVELEPLRTRPEANVTIVFSSAEQNGILMYDGQDAHLAVELFNGRIRVSYDVGN 1239 

Qy 1220 HPASAIYSVETINDGNFHIVELLALDQSLSLSVDGGNPKIITNLSKQSTLNFDSPLYVGG 1279 

II I :H !■ : II :| Mill: :: :| III : I I I :h::ll 

Db 1240 HPVSTMYSFEMVADGKYHAVELLAIKKNFTLRVDRGLARSIINEGSNDYLKLTTPMFLGG 1299 

Qy 1280 MPGKSNVASLRQAPGQNGTSFHGCIRNLYINSELQDFQKVPMQTGILPGCEPCHKKVCAH 1339 

:| :.: :| III lh: ::|| :MI II III 
Db 1300 LPVDPAQQAYKNWQIRNLTSFKGCMKEWINHKLVDFGNAQRQQKITPGC ALLE 1353 

Qy 1340 GTCQPSSQAGFTCECQEGWMG ■ - PLCDQRTNDPCLGNKCVHGT -CLP* INA-FSYSCKCL 1394 

II : :: :l I : llll III h hi II I III 

Db 1354 GEQQEEE- - •> ■ -DDEQDFMDETPHIKEEPVDPCLENKCRRGSRCVPNSNARDGYQCKCK 1407 

Qy 1395 EGHGGVLCDEEEDLFNPCQAIKCKHGKCRLSGLGQPYCECSSGYTGDSCDREISCRGERI 1454 

I I lh I I I :| :ll h: 

Db 1408 HGQRGRYCDQGEGSTEP PTVTAAS TCRKEQV 1438 

Qy 1455 RDYYQKQQGYAACQTTKKVSRLECRGGCAGGQCCGPLRSKRRKYSFECTDGSSFVDEVEK 1514 

hll : - h: : : :| III I III Mil h: :: :: 
Db 1439 REYYTEND — CRSRQPLKYAKCVGGC-GNQCCAAKIVRRRKVRMVCSNNRKYIKNLDI 1493 

Qy 1515 WKCGCTR 1522 

I Mill: ' 
Db 1494 VRKCGCTK 1-501 



Search completed: January 22, 2001, 12:48:00 
Job time: 1681 sec 



Qy 566 ITDIEEGAFEGASGVNEILLTSNRLENVQHKMFKGLESLKTLMLRSNRITCVGNDSFIGL 625 

M II HUM : i: I h:: : MM II Ml 
Db 607 LTGIEPNAFEGASHIQELQLGENKIKEISNKMFLGLHQLKT 647 

Qy 626 SSVRLLSLYDNQITTVAPGAFDTLHSLSTLNLLANPFNCNCYLAWLGEWLRKKRIVTGNP 685 

1:111111: I MM: 1 : 1 1 : : 1 1 1 Mlllllhlll llllll : I 

Db 648 LNLYDNQISCVMPGSFEHLNSLTSLNLASNPFHCNCHLAWFAEWLRKKSLNGGAA 702 

Qy 686 RCQKPYFLKEIPIQDVAIQDFTCDDGNDDNSCSPIiSRCPTECTCLDTWRCSNKGLKVLP 745 

II I :::: hh M I I : I II III llllll II M 

Db 703 RCGAPSKVRDVQIKDLPHSEFKCSSENSE-GCLGDGYCPPSCTCTGTWRCSRNQLKEIP 761 

Qy 746 KGIPRDVTELYLDGNQFTLVPKE-LSNYKHLTLIDLSNNRISTLSNQSFSNMTQLLTLIL 804 

MM : Mill: I: : I, : : : 1 1 Mlllhh III •\-\.\-\ III: 

Db 762 RGIPAETSELYLESNEIEQIHYERIRHLRSLTRLDLSNNQITILSNYTFANLTKLSTLII 821 

Qy 805 SYNRLRCIPPRTFDGLKSLRLL^LHGNDISWPEGAFNDLSALSHLAIGANPLYCDCNMQ 864 

MIMM: II : [ I : M'l ! I i IhMIIM II MMMMMMIMI :: 

Db 822 SYNKLQCLQRHALSGLNNLRVLSLHGNRISMLPEGSFEDLKSLTHIALGSNPLYCDCGLK 881 
GenCore version 4.5
Copyright (c) 1993 - 2000 Compugen Ltd.

OM protein - protein search, using sw model

Run on: January 22, 2001, 12:15:50 ; Search time 233.01 Seconds 

(without alignments) 
204.714 Million cell updates/sec 
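The throughput figure in this header can be cross-checked against the other values the header reports. The sketch below assumes that one "cell update" is one query-residue by database-residue cell of the dynamic-programming matrix, so the rate is query length times database residues divided by search time. The function name is illustrative and is not part of GenCore.

```python
def million_cell_updates_per_sec(query_len, db_residues, seconds):
    """Cells of the alignment matrix filled per second, in millions.

    Assumption: every query residue is compared against every database
    residue exactly once during the search.
    """
    return query_len * db_residues / seconds / 1e6


# Values taken from this report header: query length 1395,
# 34193795 database residues, search time 233.01 seconds.
rate = million_cell_updates_per_sec(1395, 34193795, 233.01)
print(f"{rate:.3f} Million cell updates/sec")
```

Under that assumption the computed rate agrees with the printed figure to three decimal places.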

Title: US-09-540-245A-15
Perfect score: 7427

Sequence: 1 MHPMHPENHAIARSTSTTNN SCLYAEAGEPAPRQMTAKNT 1395



Scoring table: BLOSUM62 

Gapop 10.0 , Gapext 0.5 
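For orientation, the "sw model" with a gap-open and a gap-extend penalty corresponds to local alignment with affine gaps. The following is a minimal sketch of the standard Gotoh-style recurrences, not GenCore's implementation. Two things are assumptions: `toy_score` is a hypothetical stand-in for the BLOSUM62 table (which is not reproduced here), and the gap convention (first gapped position costs `gap_open`, each further position costs `gap_extend`) may differ from the one GenCore uses.

```python
def smith_waterman_affine(a, b, score, gap_open=10.0, gap_extend=0.5):
    """Best local alignment score of a and b with affine gap penalties."""
    neg_inf = float("-inf")
    h_prev = [0.0] * (len(b) + 1)      # H row i-1: best score ending at (i-1, j)
    f_prev = [neg_inf] * (len(b) + 1)  # F row i-1: alignments ending in a gap in b
    best = 0.0
    for i in range(1, len(a) + 1):
        h_cur = [0.0] * (len(b) + 1)
        f_cur = [neg_inf] * (len(b) + 1)
        e = neg_inf                    # E: alignments ending in a gap in a
        for j in range(1, len(b) + 1):
            # Open a new gap from H, or extend the gap already open.
            e = max(h_cur[j - 1] - gap_open, e - gap_extend)
            f_cur[j] = max(h_prev[j] - gap_open, f_prev[j] - gap_extend)
            diag = h_prev[j - 1] + score(a[i - 1], b[j - 1])
            # Local alignment: scores never drop below zero.
            h_cur[j] = max(0.0, diag, e, f_cur[j])
            best = max(best, h_cur[j])
        h_prev, f_prev = h_cur, f_cur
    return best


def toy_score(x, y):
    """Hypothetical substitution score; a real run would look up BLOSUM62."""
    return 5.0 if x == y else -4.0


# Two extra residues in the second sequence are bridged by one gap of length 2.
print(smith_waterman_affine("A" * 8 + "C" * 8, "A" * 8 + "GG" + "C" * 8, toy_score))
```

A low extension penalty relative to the opening penalty, as in this report (10.0 versus 0.5), favours a few long gaps over many short ones.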

Searched: 268485 seqs, 34193795 residues

Total number of hits satisfying chosen parameters: 268485 

Minimum DB seq length: 0 
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
Maximum Match 100%
Listing first 45 summaries 

Database : A_Geneseq_36:* 

1: /SIDS1/gcgdata/geneseq/geneseqp/AA1980.DAT:*
2: /SIDS1/gcgdata/geneseq/geneseqp/AA1981.DAT:*
3: /SIDS1/gcgdata/geneseq/geneseqp/AA1982.DAT:*
4: /SIDS1/gcgdata/geneseq/geneseqp/AA1983.DAT:*
5: /SIDS1/gcgdata/geneseq/geneseqp/AA1984.DAT:*
6: /SIDS1/gcgdata/geneseq/geneseqp/AA1985.DAT:*
7: /SIDS1/gcgdata/geneseq/geneseqp/AA1986.DAT:*
8: /SIDS1/gcgdata/geneseq/geneseqp/AA1987.DAT:*
9: /SIDS1/gcgdata/geneseq/geneseqp/AA1988.DAT:*
10: /SIDS1/gcgdata/geneseq/geneseqp/AA1989.DAT:*
11: /SIDS1/gcgdata/geneseq/geneseqp/AA1990.DAT:*
12: /SIDS1/gcgdata/geneseq/geneseqp/AA1991.DAT:*
13: /SIDS1/gcgdata/geneseq/geneseqp/AA1992.DAT:*
14: /SIDS1/gcgdata/geneseq/geneseqp/AA1993.DAT:*
15: /SIDS1/gcgdata/geneseq/geneseqp/AA1994.DAT:*
16: /SIDS1/gcgdata/geneseq/geneseqp/AA1995.DAT:*
17: /SIDS1/gcgdata/geneseq/geneseqp/AA1996.DAT:*
18: /SIDS1/gcgdata/geneseq/geneseqp/AA1997.DAT:*
19: /SIDS1/gcgdata/geneseq/geneseqp/AA1998.DAT:*
20: /SIDS1/gcgdata/geneseq/geneseqp/AA1999.DAT:*
21: /SIDS1/gcgdata/geneseq/geneseqp/AA2000.DAT:*



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 



 No.   Score  Match %  Length  DB  ID      Description

   1    7427    100.0    1395  20  Y13563  Drosophila Robo 1
   2    7427    100.0    1395  20  Y08401  Drosophila sp. ROB
   3    1788     24.1    1380  20  Y08402  Drosophila sp. ROB
   4  1786.5     24.1    1381  20  Y13564  Drosophila Robo 2
   5    1592     21.4    1651  20  Y13566  Human Robo 1 polyp
   6    1588     21.4    1297  20  Y13565  C. elegans Robo po
   7    1588     21.4    1297  20  Y08403  C. elegans ROBO pr
   8    1584     21.3    1649  20  Y08404  Human ROBO1 protei
   9  1317.5     17.7     753  20  W83927  Human T85 protein.
  10   665.5      9.0    1571  19  W42087  Human Down syndrom
  11     661      8.9    1910  19  W42086  Human Down syndrom
  12     639      8.6    1728  12  R13144  Deleted in Colorec
  13   637.5      8.5    1447  16  R68553  Deleted in colorec
  14   637.5      8.6    1447  20  Y33498  Human DCC protein.
  15   633.5      8.5    1257  20  W74152  Human L1 cell adhe
  16   614.5      8.3    1018  15  R63759  Human contactin
  17   614.5      8.3    1018  17  R87028  Human contactin
  18   610.5      8.2    1192  19  W57900  Protein of clors z
  19     603      8.1    1299  21  Y40439  Human Nr-CAM prote
  20     602      8.1    1028  19  W29667  Homo sapiens bi.lt! 5
  21     588      7.9    1018  18  W06485  Rat contacts lie.:
  22   586.5      7.9    1304  19  W59994  Human neural cell
  23   579.5      7.8    1911  16  R71726  Human PTP-03. Her,
  24   579.5      7.9    1911  18  W27225  Human protein tyro
  25   579.5      7.8    1911  20  W94027  Human protein tyro
  26     573      7.7    1496  20  WBluJU  Melanoma associate
  27     573      7.7    1496  21  Y70469  Human p53 target m
  28     548      7.4    4412  21  Y53666  Sequence gi/101742
  29     546      7.4    1897  21  Y81785  Human protein tyro
  30     546      7.4    1897  21  Y56100  LAR tyrosine phosp
  31     545      7.3     434  20  Y13567  Human Robo 2 polyp
  32     545      7.3     434  20  YUo4Ui  Human partial ROBO
  33   529.5      7.1    1501  16  R72858  Rat receptor type-
  34     526      7.1    1242  19  W52287  Rattus norvegicus
  35     510      6.9     985  20  Y41716  Human PRO860 prote
  36     509      6.9    1225  19  W52289  Homo sapiens cdo t
  37     490      6.6    3117  21  Y53667  Sequence gi/332818
  38   480.5      6.5    1070  18  W08747  Human colon carcin
  39     458      6.2    1853  21  Y53668  Protein 608 sequen
  40     457      6.2    2387  21  Y53665  Mechanical stress
  41     457      6.2    2597  21  Y53664  Mechanical stress
  42     446      6.0    1139  19  W37779  Rattus norvegicus
  43   440.5      5.9    1125  19  W52288  Rattus norvegicus
  44     439      5.9     848  21  Y88565  Human NCAM 140kD i
  45   438.5      5.9    1251  19  W37778  Rattus norvegicus



RESULT 1
Y13563
ID Y13563 standard; Protein; 1395 AA.
XX
AC Y13563;
XX
DT 30-JUL-1999 (first entry)
XX
DE Drosophila Robo 1 polypeptde.
XX
KW Comm polypeptide; Robo polypeptide; commissureless; roundabout;
KW modulation; nerve cell function.
XX
OS Drosophila sp.
XX
PN WO9925833-A1.
XX
PD 27-MAY-1999.
XX
PF 13-NOV-1998; 98WO-US24327.
XX

Mon Jan 22 13:04:15 2001

us-09-540-245a-15.rag

PR 14-NOV-1997; 97US-0065543.
XX
PA (REGC ) UNIV CALIFORNIA.
XX
PI Goodman C, Kid T, Mitchell KJ, Russell C, Tear G;
XX
DR WPI; 1999-338008/28.
DR N-PSDB; X55767.
XX
PT Modulation of Robo-Comm polypeptide interactions
XX
PS Disclosure; Page 30-33; 56pp; English.
XX
CC The invention relates to a method for modulating the amount of Comm
CC (commissureless) polypeptide in contact with a cell expressing active
CC Robo (roundabout) on its surface. The method comprises modulating the
CC effective amount of Comm polypeptide in contact with the cell, where the
CC amount of expressed active Robo is specifically modulated inversely with
CC the modulation of the effective amount of Comm in contact with the cell.
CC The method is used to modulate the amount of active Robo expressed on a
CC cell. The method can be used to screen for agents that modulate Robo:Comm
CC interactions. This is particularly useful for modulating nerve cell
CC function.
XX

SQ Sequence 1395 AA; 



Query Match 100.0%; Score 7427; DB 20; Length 1395;
Best Local Similarity 100.0%; Pred. No. 0;
Matches 1395; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy 1 MHPMHPENHAIARSTSTTNNPSRSRSSRMWLLPAWLLLVLVASNGLPAVRGQYQSPRIIE 60
Db 1 mhpmhpenhaiarststtnnpsrsrssrmwllpawlllvlvasnglpavrgqyqspriie 60

Qy 61 HPTDLVVKKNEPATLNCKVEGKPEPTIEWFKDGEPVSTNEKKSHRVQFKDGALFFYRTMQ 120
Db 61 hptdlvvkknepatlnckvegkpeptiewfkdgepvstnekkshrvqfkdgalffyrtmq 120

Qy 121 GKKEQDGGEYWCVAKNRVGQAVSRHASLQIAVLRDDFRVEPKDTRVAKGETALLECGPPK 180
Db 121 gkkeqdggeywcvaknrvgqavsrhaslqiavlrddfrvepkdtrvakgetallecgppk 180

Qy 181 GIPEPTLIWIKDGVPLDDLKAMSFGASSRVRIVDGGNLLISNVEPIDEGNYKCIAQNLVG 240
Db 181 gipeptliwikdgvplddlkamsfgassrvrivdggnllisnvepidegnykciaqnlvg 240

Qy 241 TRESSYAKLIVQVKPYFMKEPKDQVMLYGQTATFHCSVGGDPPPKVLWKKEEGNIPVSRA 300
Db 241 tressyaklivqvkpyfmkepkdqvmlygqtatfhcsvggdpppkvlwkkeegnipvsra 300

Qy 301 RILHDEKSLEISNITPTDEGTYVCEAHNNVGQISARASLIVHAPPNFTKRPSNKKVGLNG 360
Db 301 rilhdeksleisnitptdegtyvceahnnvgqisaraslivhappnftkrpsnkkvglng 360

Qy 361 VVQLPCMASGNPPPSVFWTKEGVSTLMFPNSSHGRQYVAADGTLQITDVRQEDEGYYVCS 420
Db 361 vvqlpcmasgnpppsvfwtkegvstlmfpnsshgrqyvaadgtlqitdvrqedegyyvcs 420

Qy 421 AFSVVDSSTVRVFLQVSSVDERPPPIIQIGPANQTLPKGSVATLPCRATGNPSPRIKWFH 480
Db 421 afsvvdsstvrvflqvssvderpppiiqigpanqtlpkgsvatlpcratgnpsprikwfh 480

Qy 481 DGHAVQAGNRYSIIQGSSLRVDDLQLSDSGTYTCTASGERGETSWAATLTVEKPGSTSLH 540
Db 481 dghavqagnrysiiqgsslrvddlqlsdsgtytctasgergetswaatltvekpgstslh 540

Qy 541 RAADPSTYPAPPGTPKVLNVSRTSISLRWAKSQEKPGAVGPIIGYTVEYFSPDLQTGWIV 600
Db 541 raadpstypappgtpkvlnvsrtsislrwaksqekpgavgpiigytveyfspdlqtgwiv 600

Qy 601 AAHRVGDTQVTISGLTPGTSYVFLVRAENTQGISVPSGLSNVIKTIEADFDAASANDLSA 660
Db 601 aahrvgdtqvtisgltpgtsyvflvraentqgisvpsglsnviktieadfdaasandlsa 660

Qy 661 ARTLLTGKSVELIDASAINASAVRLEWMLHVSADEKYVEGLRIHYKDASVPSAQYHSITV 720
Db 661 artlltgksvelidasainasavrlewmlhvsadekyveglrihykdasvpsaqyhsitv 720

Qy 721 MDASAESFVVGNLKKYTKYEFFLTPFFETIEGQPSNSKTALTYEDVPSAPPDNIQIGMYN 780
Db 721 mdasaesfvvgnlkkytkyeffltpffetiegqpsnsktaltyedvpsappdniqigmyn 780

Qy 781 QTAGWVRWTPPPSQHHNGNLYGYKIEVSAGNTMKVLANMTLNATTTSVLLNNLTTGAVYS 840
Db 781 qtagwvrwtpppsqhhngnlygykievsagntmkvlanmtlnatttsvllnnlttgavys 840

Qy 841 VRLNSFTKAGDGPYSKPISLFMDPTHHVHPPRAHPSGTHDGRHEGQDLTYHNNGNIPPGD 900
Db 841 vrlnsftkagdgpyskpislfmdpthhvhpprahpsgthdgrhegqdltyhnngnippgd 900

Qy 901 INPTTHKKTTDYLSGPWLMVLVCIVLLVLVISAAISMVYFKRKHQMTKELGHLSVVSDNE 960
Db 901 inptthkkttdylsgpwlmvlvcivllvlvisaaismvyfkrkhqmtkelghlsvvsdne 960

Qy 961 ITALNINSKESLWIDHHRGWRTADTDKDSGLSESKLLSHVNSSQSNYNNSDGGTDYAEVD 1020
Db 961 italninskeslwidhhrgwrtadtdkdsglseskllshvnssqsnynnsdggtdyaevd 1020

Qy 1021 TRNLTTFYNCRKSPDNPTPYATTMIIGTSSSETCTKTTSISADKDSGTHSPYSDAFAGQV 1080
Db 1021 trnlttfyncrkspdnptpyattmiigtsssetctkttsisadkdsgthspysdafagqv 1080

Qy 1081 PAVPWKSNYLQYPVEPINWSEFLPPPPEHPPPSSTYGYAQGSPESSRKSSKSAGSGIST 1140
Db 1081 pavpwksnylqypvepinwseflppppehpppsstygyaqgspessrkssksagsgist 1140

Qy 1141 NQSILNASIHSSSSGGFSAWGVSPQYAVACPPENVYSNPLSAVAGGTQNRYQITPTNQHP 1200
Db 1141 nqsilnasihssssggfsawgvspqyavacppenvysnplsavaggtqnryqitptnqhp 1200

Qy 1201 PQLPAYFATTGPGGAVPPNHLPFATQRHAASEYQAGLNAARCAQSRACNSCDALATPSPM 1260
Db 1201 pqlpayfattgpggavppnhlpfatqrhaaseyqaglnaarcaqsracnscdalatpspm 1260

Qy 1261 QPPPPVPVPEGWYQPVHPNSHPMHPTSSNHQIYQCSSECSDHSRSSQSHKRQLQLEEHGS 1320
Db 1261 qppppvpvpegwyqpvhpnshpmhptssnhqiyqcssecsdhsrssqshkrqlqleehgs 1320

Qy 1321 SAKQRGGHHRRRAPVVQPCMESENENMLAEYEQRQYTSDCCNSSREGDTCSCSEGSCLYA 1380
Db 1321 sakqrgghhrrrapvvqpcmesenenmlaeyeqrqytsdccnssregdtcscsegsclya 1380

Qy 1381 EAGEPAPRQMTAKNT 1395
Db 1381 eagepaprqmtaknt 1395



RESULT 2 
Y08401 

ID Y08401 standard; Protein; 1395 AA. 

XX 

AC Y08401;
XX
DT 24-JUL-1999 (first entry)
XX
DE Drosophila sp. ROBO1 protein.
XX
KW ROBO1; ROBO2; roundabout; nerve guidance; human; murine; cell function;
KW cell morphology; screening assay.
XX
OS Drosophila sp.
XX
PN WO9920764-A1.
XX
PD 29-APR-1999.
XX
PF 20-OCT-1998; 98WO-US22164.
XX
PR 14-NOV-1997; 97US-0971172.
PR 20-OCT-1997; 97US-0062921.
XX
PA (REGC ) UNIV CALIFORNIA.
XX
PI Goodman CS, Kidd T, Mitchell KJ, Tear G;
XX
DR WPI; 1999-312615/26.
DR N-PSDB; X57250.
XX
PT Robo polypeptides, a new immunoglobulin superfamily member
XX
PS Claim 1; Page 45-49; 80pp; English.
XX
CC This invention describes novel Robo (roundabout) polypeptides, involved
CC in nerve guidance which have been isolated from Drosophila sp.,
CC C. elegans, human and murine samples. The products of the invention can
CC be used to raise anti-Robo antibodies, which can be used to modulate cell
CC function or morphology. The Robo polynucleotides and fragments are useful
CC as probes and primers and for production of the Robo polypeptides. The
CC probes and primers are also useful in screening assays.
XX
SQ Sequence 1395 AA;

Query Match 100.0%; Score 7427; DB 20; Length 1395;
Best Local Similarity 100.0%; Pred. No. 0;
Matches 1395; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy 1 MHPMHPENHAIARSTSTTNNPSRSRSSRMWLLPAWLLLVLVASNGLPAVRGQYQSPRIIE 60
Db 1 mhpmhpenhaiarststtnnpsrsrssrmwllpawlllvlvasnglpavrgqyqspriie 60

Qy 61 HPTDLVVKKNEPATLNCKVEGKPEPTIEWFKDGEPVSTNEKKSHRVQFKDGALFFYRTMQ 120
Db 61 hptdlvvkknepatlnckvegkpeptiewfkdgepvstnekkshrvqfkdgalffyrtmq 120

Qy 121 GKKEQDGGEYWCVAKNRVGQAVSRHASLQIAVLRDDFRVEPKDTRVAKGETALLECGPPK 180
Db 121 gkkeqdggeywcvaknrvgqavsrhaslqiavlrddfrvepkdtrvakgetallecgppk 180

Qy 181 GIPEPTLIWIKDGVPLDDLKAMSFGASSRVRIVDGGNLLISNVEPIDEGNYKCIAQNLVG 240
Db 181 gipeptliwikdgvplddlkamsfgassrvrivdggnllisnvepidegnykciaqnlvg 240

Qy 241 TRESSYAKLIVQVKPYFMKEPKDQVMLYGQTATFHCSVGGDPPPKVLWKKEEGNIPVSRA 300
Db 241 tressyaklivqvkpyfmkepkdqvmlygqtatfhcsvggdpppkvlwkkeegnipvsra 300

Qy 301 RILHDEKSLEISNITPTDEGTYVCEAHNNVGQISARASLIVHAPPNFTKRPSNKKVGLNG 360
Db 301 rilhdeksleisnitptdegtyvceahnnvgqisaraslivhappnftkrpsnkkvglng 360

Qy 361 VVQLPCMASGNPPPSVFWTKEGVSTLMFPNSSHGRQYVAADGTLQITDVRQEDEGYYVCS 420
Db 361 vvqlpcmasgnpppsvfwtkegvstlmfpnsshgrqyvaadgtlqitdvrqedegyyvcs 420

Qy 421 AFSVVDSSTVRVFLQVSSVDERPPPIIQIGPANQTLPKGSVATLPCRATGNPSPRIKWFH 480
Db 421 afsvvdsstvrvflqvssvderpppiiqigpanqtlpkgsvatlpcratgnpsprikwfh 480

Qy 481 DGHAVQAGNRYSIIQGSSLRVDDLQLSDSGTYTCTASGERGETSWAATLTVEKPGSTSLH 540
Db 481 dghavqagnrysiiqgsslrvddlqlsdsgtytctasgergetswaatltvekpgstslh 540

Qy 541 RAADPSTYPAPPGTPKVLNVSRTSISLRWAKSQEKPGAVGPIIGYTVEYFSPDLQTGWIV 600
Db 541 raadpstypappgtpkvlnvsrtsislrwaksqekpgavgpiigytveyfspdlqtgwiv 600

Qy 601 AAHRVGDTQVTISGLTPGTSYVFLVRAENTQGISVPSGLSNVIKTIEADFDAASANDLSA 660
Db 601 aahrvgdtqvtisgltpgtsyvflvraentqgisvpsglsnviktieadfdaasandlsa 660

Qy 661 ARTLLTGKSVELIDASAINASAVRLEWMLHVSADEKYVEGLRIHYKDASVPSAQYHSITV 720
Db 661 artlltgksvelidasainasavrlewmlhvsadekyveglrihykdasvpsaqyhsitv 720

Qy 721 MDASAESFVVGNLKKYTKYEFFLTPFFETIEGQPSNSKTALTYEDVPSAPPDNIQIGMYN 780
Db 721 mdasaesfvvgnlkkytkyeffltpffetiegqpsnsktaltyedvpsappdniqigmyn 780

Qy 781 QTAGWVRWTPPPSQHHNGNLYGYKIEVSAGNTMKVLANMTLNATTTSVLLNNLTTGAVYS 840
Db 781 qtagwvrwtpppsqhhngnlygykievsagntmkvlanmtlnatttsvllnnlttgavys 840

Qy 841 VRLNSFTKAGDGPYSKPISLFMDPTHHVHPPRAHPSGTHDGRHEGQDLTYHNNGNIPPGD 900
Db 841 vrlnsftkagdgpyskpislfmdpthhvhpprahpsgthdgrhegqdltyhnngnippgd 900

Qy 901 INPTTHKKTTDYLSGPWLMVLVCIVLLVLVISAAISMVYFKRKHQMTKELGHLSVVSDNE 960
Db 901 inptthkkttdylsgpwlmvlvcivllvlvisaaismvyfkrkhqmtkelghlsvvsdne 960

Qy 961 ITALNINSKESLWIDHHRGWRTADTDKDSGLSESKLLSHVNSSQSNYNNSDGGTDYAEVD 1020
Db 961 italninskeslwidhhrgwrtadtdkdsglseskllshvnssqsnynnsdggtdyaevd 1020

Qy 1021 TRNLTTFYNCRKSPDNPTPYATTMIIGTSSSETCTKTTSISADKDSGTHSPYSDAFAGQV 1080
Db 1021 trnlttfyncrkspdnptpyattmiigtsssetctkttsisadkdsgthspysdafagqv 1080

Qy 1081 PAVPWKSNYLQYPVEPINWSEFLPPPPEHPPPSSTYGYAQGSPESSRKSSKSAGSGIST 1140
Db 1081 pavpwksnylqypvepinwseflppppehpppsstygyaqgspessrkssksagsgist 1140

Qy 1141 NQSILNASIHSSSSGGFSAWGVSPQYAVACPPENVYSNPLSAVAGGTQNRYQITPTNQHP 1200
Db 1141 nqsilnasihssssggfsawgvspqyavacppenvysnplsavaggtqnryqitptnqhp 1200

Qy 1201 PQLPAYFATTGPGGAVPPNHLPFATQRHAASEYQAGLNAARCAQSRACNSCDALATPSPM 1260
Db 1201 pqlpayfattgpggavppnhlpfatqrhaaseyqaglnaarcaqsracnscdalatpspm 1260

Qy 1261 QPPPPVPVPEGWYQPVHPNSHPMHPTSSNHQIYQCSSECSDHSRSSQSHKRQLQLEEHGS 1320
Db 1261 qppppvpvpegwyqpvhpnshpmhptssnhqiyqcssecsdhsrssqshkrqlqleehgs 1320

Qy 1321 SAKQRGGHHRRRAPVVQPCMESENENMLAEYEQRQYTSDCCNSSREGDTCSCSEGSCLYA 1380
Db 1321 sakqrgghhrrrapvvqpcmesenenmlaeyeqrqytsdccnssregdtcscsegsclya 1380

Qy 1381 EAGEPAPRQMTAKNT 1395
Db 1381 eagepaprqmtaknt 1395





RESULT 3
Y08402
ID Y08402 standard; Protein; 1380 AA.
XX
AC Y08402;
XX
DT 24-JUL-1999 (first entry)
XX
DE Drosophila sp. ROBO2 extracellular domain protein.
XX
KW ROBO1; ROBO2; roundabout; nerve guidance; human; murine; cell function;
KW cell morphology; screening assay.
XX
OS Drosophila sp.
XX
PN WO9920764-A1.
XX
PD 29-APR-1999.
XX
PF 20-OCT-1998; 98WO-US22164.
XX
PR 14-NOV-1997; 97US-0971172.
PR 20-OCT-1997; 97US-0062921.
XX
PA (REGC ) UNIV CALIFORNIA.
XX
PI Goodman CS, Kidd T, Mitchell KJ, Tear G;
XX
DR WPI; 1999-312615/26.
DR N-PSDB; X57251.
XX
PT Robo polypeptides, a new immunoglobulin superfamily member
XX
PS Claim 1; Page 52-56; 80pp; English.
XX
CC This invention describes novel Robo (roundabout) polypeptides, involved
CC in nerve guidance which have been isolated from Drosophila sp.,
CC C. elegans, human and murine samples. The products of the invention can
CC be used to raise anti-Robo antibodies, which can be used to modulate cell
CC function or morphology. The Robo polynucleotides and fragments are useful
CC as probes and primers and for production of the Robo polypeptides. The
CC probes and primers are also useful in screening assays.
XX
SQ Sequence 1380 AA;

Query Match 24.1%; Score 1788; DB 20; Length 1380;
Best Local Similarity 30.1%; Pred. No. 1.3e-94;
Matches 441; Conservative 242; Mismatches 506; Indels 276; Gaps 39;

Qy 54 QSPRIIEHPTDLVVKKNEPATLNCKVEGKPEPTIEWFKDGEPVSTNEKKSHRVQFKDGAL 113
"lllllll I I Ihl I 11= II I llhlllll : I : Ml: I | 

Db 2 enpriiehpmdttvpkndpftfncqaegnptptiqwfkdgrelkt-dtgshrimlpaggl 60
Qy 114 FFYRTMQGKKEQDGGEYWCVAKNRVGQAVSRHASLQIAVLRDDFRVEPKDTRVAKGETAL 173

II : i -Ml III III I I ll;|:ll:IIHI:|!:|l illlhll II 

Db 61 fflkvihsrresdagtywceaknefgvarsrnatlqvavlrdefrlepantrvaqgeval 120 

Qy 174 LECGpIkGIPEPTLIWIKDGVPLDDLKAMSFGASSRVRIYDGGKLLISNVEPIDEGNYKC 233 

:MI bl III : I hi : :: , : MINIMI I |:| hi 
Db 121 mecgaprgspepqiswrkng qtlnl^gnkririvdggnlaiqearqsddgryqc 174 

Qy 234 IAQNLVGTRESSYAKLIVQVKPYFMKEPKDQVMLYGQTATFHCSVGGDPPPKVLWKK-E 291 

: :|:llllll: I I I |:|: :: |::| : I : I I :|||| I |||:: 
Db 175 vvknwgtresataflkvhvrpflirgpqnqtawgsswfqcriggdplpdvlwrrtas 234 

Qy 292 EGNIPVSRARILH DEKSLEISNITPTDEGTYVCEAHNNVGQISARASLIVH 342 

Ihh : II :::||:: ::| III III I I |:| I II 

Db 235 ggnmplrkfswlhsasgrvhvledrslklddvtledmgeytceadnavggitatgiltvh 294 

Qy 343 APPNFTKRPSNKKVGLNGWQLPCMASGNPPPSVFWTKEGVSTLMFPNSSHGRQYVA— 399 

III I II I: I = I I :hl |:::|: II |:|: I II I 

Db 295 appkfvirpknqlveigdevlfecqanghprptlywsvegnsslllpgyrdgrmevtltp 354 

Qy 400 -ADGTLQITDVRQEDEGYYV-CSAFSWDSSTVRVFLQVSSVDERPPPIIQIGPANQTLP 457 

II :M I Ihl : I I : I : I : I llllh II lllll 
Db 355 egrsvlsiarfaredsgkwtcnalnavgsvssrtwsvdtqfelpppiieqgpvnqtlp 414 

Qy 458 KGSVATLPCRATGNPSPRIKWFHDGHA-VQAGNRYSIIQGSSLRVDDLQL-SDSGTYTC 514 
h llll I I h: I: II II I :: :| : III I I III 
415 vksiwlpcrtlgtpvpqvswyldgipidvqeherrnlsdagaltisdlqrhedeglytc 474 

W 515 TASGERGETSWAATLTVEKPGSTSL-HRAADPSTYPAPPGTPKVLNVSRTSISLRWAKS 572 

II 1 : : 1 1 : I :: I : :: II : llll III |::: |::| I :| 
Db 475 vasnrngksswsgylrldtptnpnikffrapelstypgppgkpqmvekgensvtlswtrs 534 

Qy 573 QEKPGAVGPI IGYT VEYFSPDLQTGWIVAAHRVGDTQVT I SGLTPGTSYVFLVRAENTQG 632 

: h ::|l :| I : lh II :| I :|| II :| I f : 1 1 1 1 : I 
Db 535 nkvggs-slvgyviemfgknetdgwvavgtrvqnttftqtgllpgvnyffliraenshg 592 

Qy 633 ISVPSGLSNVIKTIEADFDAASANDLSAAR-TLLTGKSVELIDASAINASAVRLEWMLHV 691 

:hll :l I hi III II :|hl III :|| I : 

Db 593 lslpspmsepitvgtryfn--sgldlsearasllsgdwelsnaswdstsmkltwqi-- 648 

Qy 692 SADEKYVEGLRIHYKDASVP 711 

: lllll :: : I 

Db 649 -ingkyvegfyvyarqlpnpivnnpapvtsntnpllgststsasasasasalistkpnia 707 

Qy 712 SAQ YHS ITVMD * ASAESFWGNLKKYTKYEFFLTPFFET I 750 

: :l :|::: I I : I :|| Mil: ||:::: 
Db 708 aagkrdgetnqsgggaptplntkyrmltilngggassctitglvqytlyeffivpfyksv 767 

Qy 751 EGQPSNSKTALTYEDVPSAPPDNIQIGMYNQTAGWVRWTPPPSQHHNGNLYGYKIEV- - ■ 807 

[Mill: I I lllll I :: : I :| :::| I : :| I I : I 



Db 768 egkpsnsriartledvpseapygmealllnssavflkwkapelkdrhgvllnyhvivrgi 827 

Qy 808 -SAGNTMKV1ANMTLKATTTSVLLNNLTTGAVYSVRLNSFTKAGDGPYSKPISLFMDP-T 865 

:| I ::hl:h:l : :::l III I :|:| : : II III I :| :|| I 
Db 828 dtahnfsriltnvtidaasptlvlanltegvmytvgvaagnnagvgpycvpatlrldpit 887 

Qy 866 HHVHPPRAHPSGTHDGRHEGQDLTYHNNGNIPPGDINPTTHKKTTDYLSGPWLMVLVCIV 925 

: I - Illlhll ::h : 

Db 888 krldp , f inqrdh- -vndvltqpwf iillgai 916 

Qy 926 LLVLVISAAISMVYFKRKHQMTKELGHLSWSDN EITALNINSKESLWIDHHRG 979 

I M::l :lh llll I h h : I :: :|: : hi I 
Db 917 lavlmlsfg-cmvfvkrkhnifkq-salntmrgnhtsdvlkmpslsarngngywldsstg 974 

Qy 980 — WR .-TADTDKD SGLSESKLLSHVN 1001 

II • : : II III:: 
Db 975 pvwrpspggdslemqkdhiadyapvcgapgspagggtssggsggagsgasggddihggh 1034 

Qy 1002 SSQSN -.'-YNNSDGGTDYAEVDTRNLTTFYNCRKSPDNPTPYATTMIIGTSSSETC 1054 

I: I . hi llllll : I :| ||||: h : 

Db 1035 gsernqqryvgeysnip--tdyaevssfgkapseygrhgnaspapyatssilsphqqqqq 1092 

Qy 1055 TKTTSISADKDS - GTHSPY SDAFAGQVPAVPWKSNYLQY PVEPI NHSEFLPP 1106 

: 'II : I ; : |: : : : III 
Db 1093 qqpryqqrpvpgyglqrpmhphyqqqqhqqqqaqqthqqh--qalqqhqqlppsniyqqm 1150 

Qy U07 PPEIIPPPSSTYGYAQGSPESSRKSSKSAGSGISTNQSILNASIHSSSSGGFSA 1159 

I .1 II I: :: I : I : :: I : 

Db 1151 sttseiyptntgpsrsvyseqyyypkdkqrh ihienklsnchtyeaapgakqs 1203 

Qy 1160 WGVSPQYAVA — CPPENVYSNPLSAVAGGTQNRYQITPTNQHPPQLPAY 1206 

:! hi II : -I I::: hi I 

Db 1204 spissqfasvrrqqlpp ncsigresarfkvlntdqgknqqnlldldgssmcy 1255 

Qy 1207 -FATTGPGGAVPPNHLPFATQRHAASEYQAGLNAARCAQSRACNSCDALATPSPMQPPP 1264 

I :! Ih I : : : I : I : : I III 

Db 1256 ngladsgcggspspmamlmshedehalyhta dgdlddmerlyvkvdeqqpp 1306 

Qy 1265 P VPVPEGWYQPVHPNSHPMHPTSSNHQIY QCSSECSD-HSR 1304 

:|: ; I II : I : :| I h :: 

Db 1307 qqqqqliplv pqhpaeghlqswrnqstrssrkngqecikepseliyap 1354 

Qy 1305 SSQSHKRQLQLEEHGSSAKQRGGHH 1329 

I : :| I :: I Ih 
Db 1355 gsvasersllsnsgsgtssqpaghn 1379 



RESULT 4 
Y13564 

ID Y13564 standard; Protein; 1381 AA. 
XX 

AC Y13564; 
XX 

DT 30-JUL-1999 (first entry) 
XX 

DE Drosophila Robo 2 polypeptde.
XX
KW Comm polypeptide; Robo polypeptide; commissureless; roundabout;
KW modulation; nerve cell function.
XX
OS Drosophila sp.
XX
PN WO9925833-A1.
XX
PD 27-MAY-1999.
XX
PF 13-NOV-1998; 98WO-US24327.
XX
PR 14-NOV-1997; 97US-0065543.
XX
PA (REGC ) UNIV CALIFORNIA.
XX
PI Goodman C, Kid T, Mitchell KJ, Russell C, Tear G;






XX
DR WPI; 1999
DR N-PSDB; X55768.
XX
PT Modulation of Robo-Comm polypeptide interactions
XX
PS Disclosure; Page 34-38; 56pp; English.
XX
CC The invention relates to a method for modulating the amount of Comm
CC (commissureless) polypeptide in contact with a cell expressing active
CC Robo (roundabout) on its surface. The method comprises modulating the
CC effective amount of Comm polypeptide in contact with the cell, where the
CC amount of expressed active Robo is specifically modulated inversely with
CC the modulation of the effective amount of Comm in contact with the cell.
CC The method is used to modulate the amount of active Robo expressed on a
CC cell. The method can be used to screen for agents that modulate Robo:Comm
CC interactions. This is particularly useful for modulating nerve cell
CC function.
XX
SQ Sequence 1381 AA;



Query Match 24.1%; Score 1786.5; DB 20; Length 1381; 

Best Local Similarity 30.1%; Pred. No. 1.6e-94; 

Matches 441; Conservative 242; Mismatches 507; Indels 275; Gaps 39; 
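The summary statistics printed for each hit appear to follow from simple ratios, which the sketch below reproduces. It assumes that "Query Match" is the hit score divided by the perfect score (7427 for this query) and that "Best Local Similarity" is the number of identical columns divided by the total number of alignment columns (matches + conservative + mismatches + indels). These definitions are inferred from the numbers in this report, not taken from GenCore documentation, and the function names are illustrative.

```python
def query_match_pct(score, perfect_score):
    """Hit score as a percentage of the query's self-alignment score."""
    return 100.0 * score / perfect_score


def best_local_similarity_pct(matches, conservative, mismatches, indels):
    """Identical columns as a percentage of all alignment columns."""
    columns = matches + conservative + mismatches + indels
    return 100.0 * matches / columns


# RESULT 4 (Y13564): Score 1786.5; Matches 441; Conservative 242;
# Mismatches 507; Indels 275. Perfect score for the query is 7427.
print(round(query_match_pct(1786.5, 7427), 1))
print(round(best_local_similarity_pct(441, 242, 507, 275), 1))
```

With these definitions the rounded values agree with the 24.1% and 30.1% printed for this hit, and the same holds for the figures reported for RESULT 3 and RESULT 5.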

Qy 54 QSPR1IEHPTDLWKKNEPATLNCKVEGKPEPTIEWFKDGEPVSTNEKKSHRVQFKDGAL 113 

I I 11:1 I II: II I llhlllll : I : |||: | | 
Db 2 enpriiehpmdttvpkndpftfncqaegnptptiqwfkdgrelkt-dtgshrimlpaggl 60 

Qy 114 FFYRTMQGKKEQDGGEYWCVAKNRVGQAVSRHASLQIAVLRDDFRVEPKDTRVAKGETAL 173 

II - -Ml III Ml I I Il:|:||:|||||:||:|| ;||||:|| || 
Db 61 fflkvihsrresdagtywceaknefgvarsrnatlqvavlrdefrlepantrvaqgeval 120 

Qy 174 LECGPPKGIPEPTLIWIKDGVPLDDLKAMSFGASSRVRIVDGGNLLISNVEPIDEGNYKC 233 

:MI hi III : I 1:1 : :: : MINIMI I hi hi 
Db 121 mecgaprgspepqiswrkng qtlnlvgnkririvdggnlaiqearqsddgryqc 174 

Qy 234 IAQNLVGTRESSYAKLIVQVKPYFMKEPKDQVMLYGQTATFHCSVGGDPPPKVLWKK-E 291 

: :hlllllh I I I hi: |::| : I : I I :|||| | |||:: 
Db 175 vvknwgtresataflkvhvrpflirgpqnqtawgsswfqcriggdplpdvlwrrtas 234 

Qy 292 EGNIPVSRARILH DEKSLEISNITPTDEGTYVCEAHNNVGQISARASLIVH 342 

11:1: : II :::lh: ::l III III I II hi I M 

Db 235 ggnmplrkfswlhsasgrvhvledrslklddvtledmgeytceadnavggitatgiltvh 294 

Qy 343 APPNFTKRPSNKKVGLNGVVQLPCMASGNPPPSVFWTKEGVSTLMFPNSSHGRQYVA--- 399
, III I II h I : I I I:|:| |:::h II hh I II I 
Db 295 appkfvirpknqlveigdevlfecqanghprptlywsvegnsslllpgyrdgrmevtltp 354 

Qy 400 -ADGTLQITDVRQEDEGYYV-CSAFSWDSSTVRVFLQVSSVDERPPPIIQIGPANQTLP 457 

II Ml I I hi : I I : I : I : I 1 1 f 1 1 : M ||||| 
Db 355 egrsvlsiarfaredsgkvvtcnalnavgsvssrtvvsvdtqfelpppiieqgpvnqtlp 414 

Qy 458 KGSVATLPCRATGNPSPRIKWFHDGHA-VQAGNRYSIIQGSSLRVDDLQL-SDSGTYTC 514 

h Mil I I h: I: II II I :: :| : III I I Ml 
Db 415 vksiwlpcrtlgtpvpqvswyldgipidvqeherrnlsdagaltisdlqrhedeglytc 474 

Qy 515 TASGERGETSWAATLTVEKPGSTSL--HRAADPSTYPAPPGTPKVLNVSRTSISLRWAKS 572 

II h:lh I :: I : :: II : Mil III I::: |::| I :| 
Db 475 vasnrngksswsgylrldtptnpnikffrapelstypgppgkpqmvekgensvtlswtrs 534 

Qy 573 QEKPGAVGPIIGYTVEYFSPDLQTGWIVAAHRVGDTQVTISGLTPGTSYVFLVRAENTQG 632 

: h ::|| :| I : ||: II :| I :|| II :| 1 1 : 1 1 1 1 : I 
Db 535 nkvggs-slvgyviemfgknetdgwvavgtrvqnttftqtgllpgvnyffliraenshg 592 

Qy 633 ISVPSGLSNVIKTIEADFDAASANDIiSAAR-TLLTGKSVEtlDASAINASAVRLEMLHV 691 

:hll Ml hi III II :lhl Ml Ml :::::::| I : 
Db 593 lslpspmsepitvgtryfn--sgldlsearasllsgdwelsnaswdstsmkltwqi- 648 

Qy 692 SADEKYVEGLRIHYKDASVP 711 

: Mill :: : I 

Db 649 -ingkyvegfyvyarqlpnpivnnpapvtsntnpllgststsasasasasalistkpnia 707 



Qy 712 ; SAQYHSITVMD-ASAESFWGNLKKYTKYEFFLTPFFETI 750 

: M :|::: II : I Ml Mil: II:::: 
Db 708 aagkrdgetnqsgggaptplntkyrmltilngggassctitglvqytlyeffivpfyksv 767 

Qy 751 EGQPSNSKTALTYEDVPSAPPDNIQIGMYNQTAGWVRWTPPPSQHRNGNLYGYKIEV- * - 807 

11:1111: I I Mill I :: : I :| :::| I : :| I I : I 
Db 768 egkpsnsriartledvpseapygmealllnssavflkwkapelkdrhgvllnyhvivrgi 827 

Qy 808 •SAGNTMKVLAHMTLNATTTSVLLNNLTTGAVYSVRLNSFTKAGDGPYSKPISLFMDP-T 855 

M I ::MM:M : :::| III I :|:| : : II III I M Ml I 
Db 828 dtahnfsriltnvtidaasptlvlanltegvmytvgvaagnnagvgpycvpatlrldpit 887 

Qy 866 HHVBPPRAHPSGTHDGRHEGQDLTYHNNGNIPPGDINPTTHKKTTDYLSGPWLMVLYCIV *25 

: I II I I h II :M: : 

Db 888 krldp f inqrdh- -vndvltqpwf iillgai 91c 

Qy 926 LLVLVISAAI SMVYFKRKHQMTKELGHLSWSDN EITALNINSKESLWIDHHRG 979 

I IhM Mh Mil I h I: : I :: :|: : |:| I 
Db 917 lavlmlsfg-amvfvkrkhmmmkq-salntmrgnhtsdvlkmpslsarngngywldsstg 274 

Qy 980 — WR TADTDKD SGLSESKLLSHVN 1001 

II : : II III:: 
Db 975 gmvwrpspggdslemqkdhiadyapvcgapgspagggtssggsggagsgasggddihggh 1034 

Qy 1002 SSQSN YNNSDGGTDYAEVDTRNLTTFYNCRKSPDNPTPYATTMIIGTSSSETC 1054 

h I IM llllll : I M Mil: I: : 

Db 1035 gsernqqryvgeysnip-tdyaevssfgkapseygrhgnaspapyatssilsphqqqqq 1092 

Qy 1055 TKTTSISADKDS-GTHSPYSDAFAGQVPAVPWKSNYLQYPVEPINWSEFLPP 1106 

: .'II : I : : I: : : : III 

Db 1093 qqpryqqrpvpgyglqrpmhphyqqqqhqqqqaqqthqqh-qalqqhqqlppsniyqqm 1150 

Qy 1107 PPEHPPPSSTYGYAQGSPESSRKSSKSAGSGISTNQSILNASIHSSSSGGFSA 1159 

I ■■■ I I I h :: | : | : :: | : 

Db 1151 sttseiyptntgpsrsvyseqyyypkdkqrhih itenklsnchtyeaapgakqs 1204 

Qy 1160 WGVSPQYAVA— -CPPENVYSNPLSAVAGGTQNRYQITPTNQHPPQLPAY 1206 

M 1:1 --II : I |::: |:| I 

Db 1205 spissqfasvrrqqlpp ncsigresarfkvlntdqgknqqnlldldgssmcy 1256 

Qy 1207 --FATTGPGGAVPPNHLPFATQRHAASEYQAGLNAARCAQSRACNSCDALATPSPMQPPP 1264 

I M II: I : : : I : I : : I III 

Db 1257 ngladsgcggspspmamlmshedehalyhta dgdlddmerlyvkvdeqqpp 1307 

Qy 1265 P VPVPEGWYQPVHPNSHPMHPTSSNHQIY QCSSECSD- -HSR 1304 

M: : 111:1: MM: :: 

Db 1308 qqqqqliplv ; pqhpaeghlqswrnqstrssrkngqecikepseliyap 1355 

Qy 1305 SSQSHKRQLQLEEHGSSAKQRGGHH 1329 

I : M I :: I Ih 
Db 1356 gsvasers llshs gsgts sqpaghn 1380 



RESULT 5 
Y13566 

ID Y13566 standard; Protein; 1651 AA.
XX
AC Y13566;
XX
DT 30-JUL-1999 (first entry)
XX
DE Human Robo 1 polypeptde.
XX
KW Comm polypeptide; Robo polypeptide; commissureless; roundabout;
KW modulation; nerve cell function.
XX
OS Homo sapiens.
XX
PN WO9925833-A1.
XX
PD 27-MAY-1999.






XX
PF 13-NOV-1998; 98WO-US24327.
XX
PR 14-NOV-1997; 97US-0065543.
XX
PA (REGC ) UNIV CALIFORNIA.
XX
PI Goodman C, Kid T, Mitchell KJ, Russell C, Tear G;
XX
DR WPI; 1999-338008/28.
DR N-PSDB; X55770.
XX
PT Modulation of Robo-Comm polypeptide interactions
XX
PS Disclosure; Page 44-48; 56pp; English.
XX
CC The invention relates to a method for modulating the amount of Comm
CC (commissureless) polypeptide in contact with a cell expressing active
CC Robo (roundabout) on its surface. The method comprises modulating the
CC effective amount of Comm polypeptide in contact with the cell, where the
CC amount of expressed active Robo is specifically modulated inversely with
CC the modulation of the effective amount of Comm in contact with the cell.
CC The method is used to modulate the amount of active Robo expressed on a
CC cell. The method can be used to screen for agents that modulate Robo:Comm
CC interactions. This is particularly useful for modulating nerve cell
CC function.
XX
SQ Sequence 1651 AA;

Query Match 21.4%; Score 1592; DB 20; Length 1651;
Best Local Similarity 30.2%; Pred. No. 3.1e-83;
Matches 419; Conservative 188; Mismatches 492; Indels 290; Gaps

Qy 56 PRIIBHPTDLWKKNBPATLNCKVEGKPEPTIEWFKDGEPVST - -NEKKSHRVQFKDGAL 113 

1 1 1 : 1 1 1 : ; I : I I MUM ||:| Mill: III :: :|||: |:| 
Db 68 privehpsdlivskgepatlnckaegrptptiewykggervetdkddprshrmllpsgsl 127 

Qy 114 FFYRTMQGKKEQ-DGGEYWCVAKNRVGQAVSRHASLQIAVLRDDFRVEPKDTRVAKGETA 172 

II I : 1:1 : II I llhl : 1 : 1 1 1 : 1 1 1 :: 1 : 1 1 1 1 1 1 II II II I 
Db 128 fflrivhgrksrpdegvyvcvarnylgeavshnaslevailrddfrqnpsdvmvavgepa 187 

Qy 173 LLECGPPKGIPEPTLIWIKDGVPLDDLKAMSFGASSRVRIVDGGNLLISNVEPIDEGNYK 232 

Ml 11:1 MM: I III llll |: I II |:|: I I I 

Db 188 vmecqpprghpeptiswkkdgspldd -,-kderiti-rggklmitytrksdagkyv 239 

Qy 233 CIAQNLVGTRESSYAKLIVQVKPYFMKEPKD(!vMLYGQTATFHCSVGGDPPPKVLWKKEE 292 

I: 1:11 III 1:1 I :l hi I :j : :| I I III I I hh: 
Db 240 cvgtnmvgeresevaeltvlerpsfvkrpsnjavtvddsaefkceargdpvptvrwrkdd 299 



Qy 293 GNIPVSRARILHDEKSLEISNITPTDEGTYVCEAHNNVGQISARASLIVHAPPNFTKRPS 352

1:111 I I: :hl :l I 1:1 I I I lh I hi I 1 1 : 1 :| 

300 gelpksryei-rddhtlkirkvtagdmgsytcvaenmvgkaeasatltvqepphfwkpr 358 

353 NKSVGLNGWQLPCMASGNPPPSVFWTKEGVSTLMF— PNSSHGRQYVAADGTLQITDV 409 

:: I I I I 1:111 l::ll HI hi Nihil Ihl 

359 dqvvalgrtvtfqceatgnpqpaifwrregsqnllfsyqppqsssrfsvsqtgdltitnv 418 

410 RQEDEGYYVCSAFSWDSSTVRVFLQVSSV-DERPPPIIQIGPANQTLPKGSVATLPCRA 468 

:: I HI: «1 1 I : :|:|: I :||||:|: II III: I I I 

419 qrsdvgyyicqtlnvagsiitkaylevtdviadrpppvirqgpvnqtvavdgtfvlscva 478 

469 TGNPSPRIKWFHDGHAVQA-GNRYSIIQGSSLRVDDLQLSDSGTYTCTASGERGETSWAA 527 

11:1 I I I II I :| :: h: :l hi III II II :hl 

479 tgspvptilwrkdgvlvstqdsrikqlengvlqiryaklgdtgrytciastpsgeatwsa 538 



Qy 528 TLTVEKPG-STSLHRAADPSTYPAPPGTPKVLNVSRTSISLRWAKSQEKPGAVGPIIGYT 586 

: h: I I lh h I hi :||| :::| II: I 
Db 539 yievqefgvpvqpprptdpnlipsapskpevtdvsrntvtlsw—qpnlnsgatptsyi 595 

Qy 587 VEYFSPDLQTGWIVAAHRVGDTQVTISGLTPGTSYVFLVRAENTQGISVPSGLSNVIKTI 646 

:| II : I I I Ml I hlllll I III II :h :ll 
Db 596 ieafshasgsswqtvaenvktetsaikglkpnaiylflvraanaygisdpsqisdpvktq 655 



Db 



647 EADFDAASAN11LSAARTLLTGKSVELIDASAINASAVRLEWMLHVSADEKYVEGLRIHYR 706 

: : ! : I : I : : :::|:: : I I :|::| :| h 
656 dv-lptsqgvdhkqvqrelgnavlhlhnptvlssssievhwt-vdqqsqyiqgykilyr 712 



Qy 707 DASVPSAQYHS ITVMDASAESFWGNLKKYTKYEFFLTPFFETIEGQPSNSKTA 760 

III I : I I: :hl II III 

Db 713 ----psganhgesdwlvfevrtpaknsvvipdlrkgvnyeikarpffnefqgadseikfa 768 

Qy 761 LT YEDVPSAPPDMQIGMY - - NQTAGWVRWT PPPSQHHOLYG YKI EVS AGNTMKVLAN 818 

I h Hill : : Ml I I III II : lh II : I 
Db 769 ktleeapsappqgvtvskndgngtailvswqpppedtqngmvqeykvwclgnetryhin 827 

Qy 819 MTLNATTTSVLLNNLTTGAVYSVRLNSFTKAGDGPYSKPISLFMDPTHHVHPPRAHPSGT 878 

:: :| ||:: I I III : M II I hi : :| II 
Db 828 ktvdgstfswipflvpgirysvevaastgagsgvksepqfiqld ah---- 874 

Qy 879 HDGRHEGQDLTYHNNGN-IPPGDINPTTHKRTTDYLSGP WLMVLVCIVLL 927 

II M I : :: :l : I MM : I 

Db 875 "----gnpvsped-qvslaqqisdwkqpafiagigaacwiilravfsiwl 918 

Qy 928 VLVISAAISMVYFKRKHQ MTKELGHLSWSDNEITALNI 966 

•.III: :| : I :M III 

Db 919 yrhrkkrngltstyagirkvpsftftptvtyqrggeavssggrpgllni 967 

Qy 967 NSKESL-WI- DHHRGWRTADTDKDSGLSESKLLSHVNSSQ - - S NYNNS 1010 

: : h . ::| : :l hi I :: : :IHI 
Db 968 sepaaqpwladtwpntgnnhndcsiscctagngnsdsnlttysrpadcianynnqldnkq 1027 

Qy 1011 DGGTDYAEVDTRNLTTFYNCRKSPD NPTPYATTMIIGTSSSETC 1054 

I i :|| I lh lllllll :! :: I 

Db 1028 tnlmlpestvygdvdlsnkinemktfnspnlkdgrfvnpsgqptpyattqliqsnlsnnm 1087 

Qy 1055 TRTTSISADK* DSGTH 1063 

: I :h M 
Db 1088 nngsgdsgekhwkplgqqkqevapvqyniveqnklnkdyrandtvpptipynqsydqntg 1147 

Qy 1070 SPYSDAFAGQVPAVPWKSNYLQYPVEP — INWSEFLPPPPEHPPPSSTYGYAQGSPE 1125 

I: : M : Ml :lh: Mil Ml I 
Db 1148 gsynssdrgsstsgsqghkkgartpkvpkqggmnwadllppppahppphs 1197 

Qy 1126 SSRKSSKSAGSGISTNQSILNASIHSSSSGGFSAWGVSPQYAVACPPENVY 1176 

llhl : II :| 

Db 1198 nseeynisvdes ydqempcpvpparmylqqdeleee 1233 

Qy 1177 SNPLSAVAGGTQNRYQITPTNQH--PPQLPAYFATTGPGGAVPPNH 1220 

'■ hMlh h :lh I II II I 
Db 1234 edergptppvrgaassp-aavsyshqstatltpspqeelqpmlqdcpeetg h 1284 

Qy 1221 LPFATQRHAASEYQAGLNAARCAQSRACNSCDALATPSPMQPPPPVPVPEGWYQPVHPNS 1280 

: I \ I MM I I I I 

Db 1285 mqhqpdrrr-- qpvspppp-prp---ispphtyg 1312 

Qy 1281 HPMHPTSSN 1289 

: I I: •' 
Db 1313 yisgplvsd 1321 



RESULT 6 
Y13565 

ID Y13565 standard; Protein; 1297 AA. 
XX 

AC Y13565;
XX



DT 30-JUL-1999 (first entry)
XX
DE C. elegans Robo polypeptide.
XX
KW Comm polypeptide; Robo polypeptide; commissureless; roundabout;
KW modulation; nerve cell function.
XX
OS Caenorhabditis elegans.
XX
PN WO9925833-A1.




Mon Jan 22 13:04:15 2001 



us-09-540-245a-15.rag 
XX

PD 27-MAY-1999. 
XX 

PF 13-NOV-1998; 98WO-US24327. 
XX 

PR 14-NOV-1997; 97US-0055543. 
XX 

PA (REGC ) UNIV CALIFORNIA. 
XX 

PI Goodman C, Kidd T, Mitchell KJ, Russell C, Tear G;
XX 

DR WPI; 1999-338008/28. 

DR N-PSDB; X55769. 

XX

PT Modulation of Robo-Comm polypeptide interactions 

XX

PS Disclosure; Page 38-39; 56pp; English.

CC The invention relates to a method for modulating the amount of Comm
CC (commissureless) polypeptide in contact with a cell expressing active

CC Robo (roundabout) on its surface. The method comprises modulating the 

CC effective amount of Comm polypeptide in contact with the cell, where the 

CC amount of expressed active Robo is specifically modulated inversely with 

CC the modulation of the effective amount of Comm in contact with the cell. 

CC The method is used to modulate the amount of active Robo expressed on a 

CC cell. The method can be used to screen for agents that modulate Robo:Comm

CC interactions. This is particularly useful for modulating nerve cell 

CC function.
XX 

SQ Sequence 1297 AA; 



Query Match 21.4%; Score 1588; DB 20; Length 1297;

Best Local Similarity 31.0%; Pred. No. 3.8e-83; 

Matches 421; Conservative 195; Mismatches 527; Indels 216; Gaps 41; 
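
The summary figures in this listing appear to be internally consistent with a simple relationship: Best Local Similarity equals Matches divided by the total number of alignment columns (Matches + Conservative + Mismatches + Indels). This is an inference from the numbers printed in these records, not a documented definition from the search program. A minimal Python sketch, using a hypothetical helper name, that recomputes the percentage from a statistics line:

```python
import re

def best_local_similarity(stats_line: str) -> float:
    """Recompute the similarity percentage from a 'Matches ...; Indels ...;' line.

    Assumption (inferred from this listing, not from program documentation):
    similarity = Matches / (Matches + Conservative + Mismatches + Indels).
    """
    pairs = re.findall(r"(Matches|Conservative|Mismatches|Indels)\s+(\d+)", stats_line)
    counts = {name: int(value) for name, value in pairs}
    columns = sum(counts.values())  # total aligned columns, gaps included
    return 100.0 * counts["Matches"] / columns

line = "Matches 421; Conservative 195; Mismatches 527; Indels 216; Gaps 41;"
print(f"{best_local_similarity(line):.1f}%")  # prints 31.0%
```

The same check can be used to repair a percentage that was damaged in scanning, provided the four counts on the following line are legible.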

Qy 55 SPRIIEHPTDLWKKNEPATLNCKVEGKPEPT - IEWFKDGEPVSTNEKK- -SHRVQFKDG 111 

: INI! |:|| : |||||| I | |:|||:|| |::: ||: | 
Db 29 apviiehpidvvvsrgspatlnc--gakpstakitwykdgqpvitnkeqvnshrivldtg 86 

Qy 112 ALFFYRTMQGK--KEQDGGEYWCVAKNRVGQAVSRHASLQIAVLRDDFRVEPKDTRVAKG 169

:l! : II I: I I hill I h I 1 1 : :J : 1 1 : 1 1 1 1 |: : I 
Db 87 slfUkvnsgkngkdsdagayycvasnehgevksnegslklamlredfrvrprtvqalgg 146 

Qy 170 ETALLECGPPKGIPEPTLIWIKDGVPLDDLKAMSFGASSRVRIVDGGNLLISNVEPIDEG 229 

I hill 11:1 111:111 I : I 1 1 1 : 1 |: | | 
Db 147 emavlecspprgfpepvvswrkdd kelriqdmprytlhsdgnlildpvdrsdsg 200

Qy 230 NYKCIAQNLVGTRESSYARLIVQVKPYFMKEPKDQVMLYGQTATFHCSVGGDPPPKVLWK 289 

hhl hll I h 1:1 I II I :|lll : ' ! Mil h: II 
Db 201 tyqcvannmvgervsnparlsvfekpkfeqepkdmtvdvgaavlfdcrvtgdpqpqitwk 260 

Qy 290 KEEGNIPVSRARILHDEKSLEISNITPTDEGTYVCEAHNNVGQISAKASLIVHAPPNFTK 349

:: : 1 1 : 1 1 I I : I I : hill III I I I : I I I I llhl 
Db 261 rknepmpvtrayiakdnrglriervqpsdegeyvcyarnpagtleasahlrvqappsfqt 320 

Qy 350 RPSNKRVGLNGWQLPCMASGNPPPSVFWTKEGVSTLMFPN--SSHGRQYVAADGTLQIT 407 

:h:: II I III: Ihlll hlh h II |: III I 
Db 321 kpadqsvpaggtatfectlvgqpspayfwskegqqdllfpsyvsadgrtkvsptgtltie 380 

Qy 408 DVRQEDEGYYVCSAFSWDSSTVRVFLQVS 437 

:||| III III: : 1:1:: 
Db 381 evrqvdegayvcagmnsagsslskaalkatfetkgrvqkkkskmgkqkqknvqsiikyli 440 

Qy 438 SVDERPPPIIQIGPANQTLPKGSVATLPCRATGNPSPRIKWFHDGHAVQ-AGNRY 491 

: :lll h I llll II I llhh! hi MM: :| 
Db 441 savtgntpakppptiehghqnqtlmvgssailpcqasgkptpgiswlrdglpiditdsri 500 

Qy 492 SIIQGSSLRVDDLQLSDSGTYTCTASGERGETSWAATLTVEKPGSTS-LHRAADPSTYPA 550 

I II : lh hi III I I ll::hl:|ll! I : llll :h 
Db 501 sqhstgslhiadlkkpdtgvytciaknedgestwsasltvedhtsnaqfvrmpdpsnfps 560 

Qy 551 PPGTPKVLNVSRTSISLRWAKSQEKPGAVGPIIGYTVEYFSPDLQTGWIVAAHRVGDTQV 610 



I I ::II: Ill II ::|:|||| I I |: 

Db 561 sptqpiivnvtdtevelhw-napstsgagpitgyiiqyyspdlgqtwfnipdyvastey 618 

Qy 611 TISGLTPGTSYVFLVRAENTQGISVPSGLSNVIKTIEADFDAASAN— -DLSAARTLLT 666 

I II I I:!::l:l :|| || I :: I : I :: h: I II 

Db 619 rikglkpshspfviraenekgigtpsvssalvttskpaaqvalsdknkmdmaiaekrlt 678 

Qy 667 GKS-VELIDASAINASAVRLEWMLHVSADEKYVEGLRIHYK-DASVPSAQYHSITVMDAS 724 

: ::| :i lh:llll I |: ::| I :: II : I I 

Db 679 seqlikleevktinstavrlfwkkrkl--eelidgyyikwrgpprtndnqy--vnvtsps 734 

Qy 725 AESFWGNLKKYTKYEFFLTPF— FETIEGQPSNSKTALTYEDVPSAPPDNIQIGMYNQ 781 

::M || : ||||: |: :| I llll || | || ||::::| I I 
Db 735 tenywsnlmpftnyeffvipyhsgvhsihgapsnsmdvltaeappslppedvrirmlnl 794 

Qy 782 TAGWVRWTPPPSQHHNGNLYGYKIEVSAGNTMKVLANMTLNATTTSVLLNNLTTGAVYSV 841 

I : I I : III l::l II hi I II I :| II I : 
Db 795 ttlriswkapkadgingilkgfqi-vivgqapnnnrnittneraasvtlfhlvtgmtyki 853 

Qy 842 RLNSFTKAGDGPYSKPISLFMD-PTHHVHPPRAHPSGTHDGRHEGQDLTYH--NNGNIPP 898 

I: : : N : h I I : I : I I ::| 

Db 854 rvaarsnggvgvshgtsevimnqdtlekhla aqqenesflyglinkshvp- 903 

Qy 899 GDINPTTHKKTTDYLSGPWLMVLVCIVLLVLVISAAISMVYFKRKHQMTKELGHLSWSD 958 

|;| ; :h; : |: |:: : ::| 

Db 904 ,- vivivailiifvviiiaycywrnsrnsdgkdrsfikind 942 

Qy 959 NEITALNINSKESLW IDHHRGWRTADTDKDSGLSESKLLSHVNSSQSNYNNSD- 1011 

: : I , II : : : II : : I I ::| :|| I 
Db 943 gsvhmasnn----lwdvaqnpnqnpmyntagrmtmnnrngqalysltpnaqdffnncddy 998 

Qy 1012 GGT DYAEV—DTRNLTTFYNCRKSPDNPTPYATTMIIGTSSSETCTKTT 1058 

II ' ' lh: ::||| : hhlllll :: 

Db 999 sgtmhrpgsehhyhyaqltggpgnamstfyg-nqyhddpspyatttlv 1045 

Qy 1059 SISADKDSGTHSPYSDAFAGQVPAVPWKSNYLQYPVEPINWSEFLPPPPEHPPPSSTYG 1118 

: I II : hill llll lh 
Db 1046 - lsnqqpa- -wlndkralrapamptn pvppe--pparyad 1079 

Qy 1119 YAQGSPESSRKSSKSAGSGISTNQSILNASIHSSSSGGFSAWGVSPQYAVACPPENVYSN 1178 

: I II I I I I || :| :|| : III 
Db 1080 htag- -rrsrssrasdgrg tlngglhhrtsgsqrs dspphtdvsy 1122 

Qy 1179 PLSAVAGGTQNRYQITPTNQHPPQLPAY-FATTGPGGAVPP-NHL-PFATQRHAASEYQA 1235 

: II : : : 1 1 I I III: I hi 
Db 1123 vqlhssdgtgsskertgerrtppnktlmdfippppsnppppgghvydtatrrq 1175 

Qy 1236 GLNAARCAQSRACNSCDALATPSPMQPPPPVPVPEGWYQPVHPNSHPMHPTSSNHQIYQC 1295 

II : ' :l I :l : I h III I : 

Db 1176 -lnrgstpredtyds vsdgafarvdvna---rptsrnrnl--- 1211 

Qy 1296 SSECSDHSRSSQSHKRQLQLEEHGSSARQRGGHHRRRAP 1334 

U I : I ::: I h= I : I 
Db 1212 ggrplkgkrdddsqrsslmmdddggsseadgensegdvp 1250 



RESULT 7 
Y08403 

ID Y08403 standard; Protein; 1297 AA.
XX 

AC Y08403; 
XX 

DT 24-JUL-1999 (first entry)
XX 

DE C. elegans ROBO protein. 
XX 

KW ROBO1; ROBO2; roundabout; nerve guidance; human; murine; cell function;

KW cell morphology; screening assay. 

XX 

OS Caenorhabditis elegans.
XX 

PN WO9920764-A1. 
XX 
PF 20-OCT-1998; 98WO-US22164.
XX
PR 14-NOV-1997; 97US-0971172.
PR 20-OCT-1997; 97US-0062921.
XX
PA (REGC ) UNIV CALIFORNIA.
XX
PI Goodman CS, Kidd T, Mitchell KJ, Tear G;
XX
DR WPI; 1999-312615/26.
DR N-PSDB; X57252.
XX
PT Robo polypeptides, a new immunoglobulin superfamily member
XX
PS Claim 1; Page 59-63; 80pp; English.
XX
CC This invention describes novel Robo (roundabout) polypeptides, involved
CC in nerve guidance which have been isolated from Drosophila sp.,
CC C. elegans, human and murine samples. The products of the invention can
CC be used to raise anti-Robo antibodies, which can be used to modulate cell
CC function or morphology. The Robo polynucleotides and fragments are useful
CC as probes and primers and for production of the Robo polypeptides. The
CC probes and primers are also useful in screening assays.
XX
SQ Sequence 1297 AA;



Query Match 21.4%; Score 1588; DB 20; Length 1297;

Best Local Similarity 31.0%; Pred. No. 3.8e-83;

Matches 421; Conservative 195; Mismatches 527; Indels 216; Gaps 41; 

Qy 55 SPRIIEHPTDLWKKNEPATLNCKVEGKPEPT- IEWFKDGEPVSTNEKK- -SHRVQFKDG 111 

:| Mill hi : Hill II I |:|||:|| ||::: III: I 
Db 29 apviiehpidwvsrgspatlnc--gakpstakitwykdgqpvitnkeqvnshrivldtg 86 

Qy 112 ALFFYRTMQGK--KEQDGGEYWCVAKNRVGQAVSRHASLQIAVLRDDFRVEPKDTRVAKG 169

:M : Mhll hill I |: I 1 1 : : 1 : 1 1 : 1 1 1 1 |: : I 
Db 87 slfUkvnsgkngkdsdagayycvasnehgevksnegslklamlredfrvrprtvqalgg 146 

Qy 170 ETALLECGPPKGIPEPTLIWIKDGVPLDDLKAMSFGASSRVRIVDGGNLLISNVEPIDEG 229 

I hill Ihl 111:111 I : I : IN |: I I 
Db 147 emavlecspprgfpepvvswrkdd kelriqdmprytlhsdgnliidpvdrsdsg 200 

Qy 230 NYKCIAQNLVGTRESSYAKLIVQVKPYFMKEPKDQVMLYGQTATFHCSVGGDPPPKVLWK 289 

hhl hll I I: hi I II I Mil : I I I I III h: II 
Db 201 tyqcvannmvgervsnparlsvfekpkfeqepkdmtvdvgaavlfdcrvtgdpqpqitwk 260 

Qy 290 KEEGNIPVSRARILHDEKSLEISNITPTDEGTYVCEAHNNVGQISARASLIVHAPPNFTK 349

:: Mhll II : I I : hill III I I I : II I I llhl 
Db 261 rknepmpvtrayiakdnrglriervqpsdegeyvcyarnpagtleasahlrvqappsfqt 320

Qy 350 RPSNKKVGLNGWQLPCMASGNPPPSVFWTKEGVSTLMFPN--SSHGRQYVAADGTLQIT 407 

:|::: II III: Ihlll hlh h II h III I 

Db 321 kpadqsvpaggtatfectlvgqpspayfwskegqqdllfpsyvsadgrtkvsptgtltie 380 

Qy 408 DVRQEDEGYYVCSAFSWDSSTVRVFLQVS 437 

:|ll III III: : II : I: : 
Db 381 evrqvdegayvcagmnsagsslskaalkatfetkgrvqkkkskmgkqkqknvqsiikyli 440 

Qy 438 SVDERPPPIIQIGPANQTLPKGSVATLPCRATGNPSPRIKWFHDGHAVQ-AGNRY 491 

: :lll h I Mil II I II I : I : I hi MM: :l 
Db 441 savtgntpakppptiehghqnqtlmvgssailpcqasgkptpgiswlrdglpiditdsri 500 

Qy 492 SIIQGSSLRVDDLQLSDSGTYTCTASGERGETSWAATLTVEKPGSTS-LHRAADPSTYPA 550 

I II : lh hi III I I 1 1 : : 1 : 1 : 1 1 1 1 I : I III :|: 
Db 501 sqhstgslhiadlkkpdtgvytciaknedgestwsasltvedhtsnaqfvrmpdpsnfps 560 

Qy 551 PPGTPKVLNVSRTSISLRWAKSQEKPGAVGPIIGYTVEYFSPDLQTGWIVAAHRVGDTQV 610 

I I ::lh I : I I : III II ::hllll I I h 

Db 561 sptqpiivnvtdtevelhw-napstsgagpitgyiiqyyspdlgqtwfnipdyvastey 618 



Qy 611 T ISGLTPGTSYVFLVRAENTQG ISVPSGLSNVIKT IEADFDAASAN DLSAARTLLT 666 

I II I lhh:|||| :M II I :: I : I :: h: I II 

Db 619 rikglkpshsymfviraenekgigtpsvssalvttskpaaqvalsdknkmdmaiaekrlt 678 

Qy 667 GKS-VELIDASAINASAVRLEWMLHVSADEKYVEGLRIHYK-DASVPSAQYHSITVMDAS 724 

: ::| : : IhMIII I h ::| I :: II : I I 

Db 679 seqlikleevktinstavrlfwkkrkl--eelidgyyikwrgpprtndnqy--vnvtsps 734 

Qy 725 AESFWGNLKKYTKYEFFLTPF— FETIEGQPSNSKTALTYEDVPSAPPDNIQIGMYNQ 781 

::l || :| II!: |: :| I III I I I lh:::| I I 
Db 735 tenywsnlmpftnyeffvipyhsgvhsihgapsnsmdvltaeappslppedvrirmlnl 794 

Qy 782 TAGWVRWTPPPSQHHNGNLYGYKIEVSAGNTMKVLANMTLNATTTSVLLNNLTTGAVYSV 841 

I : I I : III I:: II hi I II I :| II I : 

Db 795 ttlriswkapkadgingilkgfqi-vivgqapnnnrnittneraasvtlfhlvtgmtyki 853 

Qy 842 RLNSFTKAGDGPYSKPISLFMD-PTHHVHPPRAHPSGTHDGRHEGQDLTYH--NNGNIPP 898 

I: : : I ! : h I I : I : I I ::l 

Db 854 rvaarsnggvgvshgtsevimnqdtlekhla aqqenesflyglinkshvp- 903 

Qy 899 GDINPTTHKKTTDYLSGPWLMVLVCIVLLVLVISAAISMVYFKRKHQMTKELGHLSWSD 958 

i |:| : :|:: : |: |:: : ::| 

Db 904 vivivailiifvviiiaycywrnsrnsdgkdrsfikind 942 

Qy 959 NEITALNINSKESLW IDHHRGWRTADTDKDSGLSESKLLSHVNSSQSNYNNSD-- 1011 

: : I II : : : II : : I I ::| :|| I 
Db 943 gsvhmasnn-"--lwdvaqnpnqnpmyntagrmtmnnrngqalysltpnaqdffnncddy 998 

Qy 1012 GGT DYAEV--DTRNLTTFYNCRKSPDNPTPYATTMIIGTSSSETCTKTT 1058 

II -I lh: "III : hhlllll :: 

Db 999 sgtmhrpgsehhyhyaqltggpgnamstfyg-nqyhddpspyatttlv 1045 

Qy 1059 SISADKDSGTHSPYSDAFAGQVPAVPWKSNYLQYPVEPINWSEFLPPPPEHPPPSSTYG 1118 

: I II : I: I I I I III lh 
Db 1046 J Isnqqpa- -wlndkralrapamptn pvppe- -pparyad 1079 

Qy 1119 YAQGSPESSRKSSKSAGSGISTNQSILNASIHSSSSGGFSAWGVSPQYAVACPPENVYSN 1178 

: I 1.1.1 III II :| :H : II I 

Db 1080 htag-rrsrssrasdgrg tlngglhhrtsgsqrs dspphtdvsy 1122 

Qy 1179 PLSAVAGGTQNRYQITPTNQHPPQLPAY - FATTGPGGAVPP -NHL - PFATQRHAASEYQA 1235 

: II : : I : II | | I h Ihl 
Db 1123 vqlhssdgtgsskertgerrtppnktlmdfippppsnppppgghvydtatrrq 1175 

Qy 1236 GLNAARCAQSRACNSCDALATPSPMQPPPPVPVPEGWYQPVHPNSHPMHPTSSNHQIYQC 1295 

II : : | :| : | |: III | : 

Db 1176 -lnrgstpredtyds vsdgafarvdvna---rptsrnrnl--- 1211 

Qy 1296 SSECSDHSRSSQSBKRQLQLEEHGSSARQRGGHHRRRAP 1334 

I'- I : I ::: I h: I : I 
Db 1212 ggrplkgkrdddsqrsslmntdddggsseadgensegdvp 1250 



RESULT 8 
Y08404 

ID Y08404 standard; Protein; 1649 AA. 
XX

AC Y08404; 
XX 

DT 24-JUL-1999 (first entry) 
XX 

DE Human ROBO1 protein.
XX 

KW ROBO1; ROBO2; roundabout; nerve guidance; human; murine; cell function;
KW cell morphology; screening assay. 

XX 

OS Homo sapiens. 
XX 

PN WO9920764-A1.
XX 

PD 29-APR-1999.
XX 

PF 20-OCT-1998; 98WO-US22164.
XX
PR 14-NOV-1997; 97US-0971172.
PR 20-OCT-1997; 97US-0062921.
XX
PA (REGC ) UNIV CALIFORNIA.
XX
PI Goodman CS, Kidd T, Mitchell KJ, Tear G;
XX
DR WPI; 1999-312615/26.
DR N-PSDB; X08404.
XX
PT Robo polypeptides, a new immunoglobulin superfamily member
XX
PS Claim 1; Page 65-71; 80pp; English.
XX
CC This invention describes novel Robo (roundabout) polypeptides, involved
CC in nerve guidance which have been isolated from Drosophila sp.,
CC C. elegans, human and murine samples. The products of the invention can
CC be used to raise anti-Robo antibodies, which can be used to modulate cell
CC function or morphology. The Robo polynucleotides and fragments are useful
CC as probes and primers and for production of the Robo polypeptides. The
CC probes and primers are also useful in screening assays.
XX
SQ Sequence 1649 AA;



Query Match 21.3%; Score 1584; DB 20; Length 1649;

Best Local Similarity 30.1%; Pred. No. 9.1e-8-
Matches 418; Conservative 188; Mismatches 493; Indels 288; Gaps 39;

Qy 56 priiehptdlwkknepatlnckvegkpeptiewfkdgepvst--ne|rshrvqfkdgal 113 

I llllllll Ihl HIM II I I ::|:|||: |:| 
Db 68 privehpsdlivskgepatlnckaegrptptiewykggervetdkddprshrmllpsgsl 127 


Qy 114 FFYRTMQGRREQ-DGGEYWCVAKMGQAVSRHASLQIAVLRDDFRYEPRDTRVAKGETA 172 

II I : 1:1 : I I I MM MM :|||::|:!lllll j I II II 1 
Db 128 fflrivhgrksrpdegvyvcvarnylgeavshnaslevailrddfrqppsdvmvavgepa 187 

Qy 173 LLECGPPKGIPEPTLIWIKDGVPLDDLKAMSFGASSRVRIVDGGNLLISNVEPIDEGNYK 232

Ml Ihl MM: I III MM' |: I II |:1; I I I 

Db 188 vmecqpprghpeptiswkkdgspldd kderiti-rggklmitytrksdagkyv 239 

Qy 233 CIAQNLVGTRESSYAKLIVQVKPYFMKEPKDQVMLYGQTATFHCSVGGDPPPKVLWKKEE 292

I: 1:11 III hi I :| |:| I : : Ml jM |:|:: 
Db 240 cvgtnmvgeresevaeltvlerpsfvkrpsnlavtvddsaefkcearidpvptvrwrkdd 299 

Qy 293 GNIPVSRARILHDEKSLEISNITPTDEGTYVCEAHNNVGQISARASLIVHAPPNFTKRPS 352
MM I I: Ml :| I M I I I I ||: I |:|| II: :| 
Db 300 gelpksryei-rddhtlkirkvtagdmgsytcvaenmvgkaeasatllvqepphfwkpr 358 

Qy 353 NKKVGLNGVVQLPCMASGNPPPSVFWTKEGVSTLMF---PNSSHGRQYVAADGTLQITDV 409

:: I I I I hill |::M M M I MM I ||:| 
Db 359 dqvvalgrtvtfqceatgnpqpaifwrregsqnllfsyqppqsssrfsvsqtgdltitnv 418 

Qy 410 RQEDEGYYVCSAFSWDSSTVRVFLQVSSV-DERPPPIIQIGPANQTLPKGSVATLPCRA 468

:: I 111:1 _:| I : Ml: I MUM II III; I I I 
Db 419 qrsdvgyyicqflnvagsiitkaylevtdviadrpppvirqgpvnqtvavdgtfvlscva 478 

Qy 469 TGNPSPRIKWFHDGHAVQA-GNRYSIIQGSSLRVDDLQLSDSGTYTCTASGERGETSWAA 527 

Ihl IMIII :| :: h: :| hi III II II Ml 
Db 479 tgspvptilwrkdgvlvstqdsrikqlengvlqiryaklgdtgrytciastpsgeatwsa 538 

Qy 528 TLTVEKPG-STSLHRAADPSTYPAPPGTPKVLNVSRTSISLRWAKSQERPGAVGPIIGYT 586 

: h: I I II: h I hi Ml :::| II: I 
Db 539 yievqef gvpvqpprptdpnlipsapskpevtdvsrntvtlsw- - -qpnlnsgatptsyi 595 

Qy 587 VEYFSPDLQTGWIVAAHRVGDTQVTISGLTPGTSYVFLVRAENTQGISVPSGLSNVIKTI 646 

:| II MM I II I hlllll I III II M :|| 
Db 596 ieafshasgsswqtvaenvktetsaikglkpnaiylflvraanaygisdpsqisdpvktq 655 

Qy 647 EADFDAASANDLSAARTLLTGKSVELIDASAINASAVRLEWMLHVSADEKYVEGLRIHYK 706 

: : I : I : I : : :::|:: : I I :|::| :| h 
Db 656 dv-lptsqgvdhkqvqrelgnavlhlhnptvlssssievhwt--vdqqsqyiqgykilyr 712 



Qy 707 DASVPSAQYHS ITVMDASAESFWGNLKKYTRYEFFLTPFFETIEGQPSNSRTA 760 

III I : I h Ml II III :| I I I 

Db 713 ----psganhgesdwlvfevrtpaknsvvipdlrkgvnyeikarpffnefqgadseikfa 768 

Qy 761 LTYEDVPSAPPDNIQIGMY-NQTAGWVRWTPPPSQHHNGNLYGYKIEVSAGNTMKVLAN 818 

I h Mil : : Mill III II : lh II : I 

Db 769 ktleeapsappqgvtvskndgngtailvswqpppedtqngmvqeykvwclgnetryhin 827 

Qy 819 MTLNATTTSVLLNNLTTGAVYSVRLNSFTRAGDGPYSKPISLFMDPTHHVHPPRAHPSGT 878 

M : lh: I I III : : I II I hi : :| II 

Db 828 ktvdgstfsvvipflvpgirysvevaastgagsgvksepqfiqld ah---- 874 



Qy 879 HDGRHEGQDLTYHNNGN-IPPGDINPTT HKKTTDYLSGP WLMVLVCIVLL 927

 IM I I : :: :l : I |::::| : I

Db 875 ----gnpvsped-qvslaqqisdwkqpafiagigaacwiilravfsiwl 918

Qy 928 VLVISAAISMVYFKRKHQ MTKELGHLSWSDNEITALNI 966

 J II : MM :l I III

Db 919 yrhrkkrngltstyagirkvpsftftptvtyqrggeavssggrpgllni 967

Qy 967 NSKESL-WI- DHHRGWRTADTDKDSGLSESKLLSHVNSSQ - - SNYNNS 1010

 : : I : j ::| : M hi I :: : Mil

Db 968 sepaaqpwladtwpntgnnhndcsiscctagngnsdsnlttysrpadcianynnqldnkq 1027



Qy 1011 DGGTDYAEVDTRNLTTFYNCRRSPD NPTPYATTMIIGTS 1049

I I'M I lh llllllh I 

Db 1028 tnMpestvygdvdlsnkinemktfnspnlkdgrfvnpsgqptpyattiqsnlsnnmnn 1087 

Qy 1050 SSETCTKTTSISADKDSGTHSP 1071

:::| I : I I 

Db 1088 gsgdsgekhwkplgqqkqevapvqyniveqnklnkdyrandtvpptipynqsydqntggs 1147 

Qy 1072 YSDAFAGQVPAVPWKSNYLQYPVEP — INWSEFLPPPPEHPPPSSTYGYAQGSPESS 1127 

h : I •) Ml :lh: Mil Ml I 

Db 1148 ynssdrgsstsgsqghkkgartpkvpkqggmnwadllppppahppphs 1195 



Qy 1128 RKSSRSAGSGISTNQSILNASIHSSSSGGFSAWGVSPQYAVACPPENVY-- 1176
 II: : || :|

Db 1196                                                     1233

Qy 1177 SNPLSAVAGGTQNRYQITPTKQH- -PPQLPAYFATTGPGGAVPPNHLP 1222

 hi M: I: M: III II h
Db 1234 ergptppvrgaassp-aavsyshqstatltpspqeelqpmlqdcpeetg hmq 1284



Qy 1223 FATQRHAASEYQAGLNAARCAQSRACNSCDALATPSPMQPPPPVPVPEGWYQPVHPNSHP 1282 

I ■ I: Mil II II: 

Db 1285 hqpdrrr-—? qpvspppp-prp---ispphtygyi 1312 

Qy 1283 MHPTSSN 1289 

I h 

Db 1313 sgplvsd 1319 



RESULT 9 
W83927 

ID W83927 standard; Protein; 753 AA.
XX 

AC W83927; 

DT 01-MAR-1999 (first entry)
XX 

DE Human T85 protein. 



XX
KW T85; FMHB-6D4; FMHB-SD4; human; neurological disorder; therapy;
KW diagnosis.
XX
OS Homo sapiens.
XX
XX 
FH Key             Location/Qualifiers
FT Peptide         1..20
FT                 /label= Sig_peptide
FT Protein         21..753
FT                 /label= Mat_protein
FT Region          525..610
FT                 /note= "has homology to a fibronectin type III
FT                 domain"
FT Region          638..727
FT                 /note= "has homology to a fibronectin type III
FT                 domain"
FT Region          43..101
FT                 /note= "has homology to a Ig superfamily domain"
FT Region          145..203
FT                 /note= "has homology to a Ig superfamily domain"
FT Region          237..298
FT                 /note= "has homology to a Ig superfamily domain"
FT Region          329..394
FT                 /note= "has homology to a Ig superfamily domain"
FT Region          433..491
FT                 /note= "has homology to a Ig superfamily domain"
FT Peptide         247..249
FT                 /note= "RGD motif"
FT Domain          516..600
FT                 /note= "cytokine receptor homology N-terminal
FT                 domain"

PD 29-OCT-1998.
XX
PF 17-APR-1998; 98WO-US07714.
XX
PR 10-OCT-1997; 97US-0062017.
PR 18-APR-1997; 97US-0044746.
XX
PA (MILL-) MILLENNIUM BIOTHERAPEUTICS INC.
XX
PI Holtzman D, McCarthy SA;
XX
DR WPI; 1999-024021/02.
DR N-PSDB; V69278.
XX
PT New isolated human FTHMA-070 and T85 proteins - used to develop
PT products for the diagnosis and therapy of disorders involving
PT cellular processes, e.g. neuronal development.
XX
PS Claim 31; Fig 3; 127pp; English.
XX
CC This is the amino acid sequence of a novel human protein designated
CC T85, and also referred to as FMHB-6D4 and FMHB-SD4. T85 cDNA (see
CC V69278) was identified in a human foetal brain cDNA library using a
CC screen designed to identify genes encoding proteins having a
CC functional signal sequence. T85 nucleic acids and polypeptides of
CC the invention are useful as modulating agents in regulating a
CC variety of cellular processes. They can be used for identifying
CC compounds which bind to or modulate the activity of the polypeptides
CC (claimed). They can also be used in screening assays, detection
CC assays (e.g. chromosomal mapping, tissue typing, forensic biology),
CC predictive medicine (e.g. diagnostic assays, prognostic assays,
CC monitoring clinical trials, and pharmacogenomics), and methods of
CC treatment (e.g. therapeutic and prophylactic) e.g. for neurological
CC disorders.
XX
SQ Sequence 753 AA;



Query Match 17.7%; Score 1317.5; DB 20; Length 753;

Best Local Similarity 39.0%; Pred. No. 6.5e-68; 

Matches 284; Conservative 119; Mismatches 288; Indels 37; Gaps 15; 

Qy 56 PRIIEHPTDLWRKNEPATLNCKVEGKPEPTIEWFRDGEPVST-NEKRSHRVQFRDGAL 113 

111:111:11:1 I llllllll hi HUM II I I :: hi 
Db 29 privehpsdlivskgepatlnckaegrptptiewykggervetdkddprshrmllpsgsl 88 

Qy 114 FFYRTMQGKKEQ-DGGEYWCVAKNRVGQAVSRHASLQIAVLRDDFRVEPKDTRVAKGETA 172

II I : hi : I II llhl :hlll :llh:hllllll II II II I 



Db 89 fflrivhgrksrpdegvyvcvarnylgeavshnaslevailrddfrqnpsdvmvavgepa 148

Qy 173 LLECGPPRGIPEPTLIWIRDGVPLDDLKAMSFGASSRVRIVDGGNLLISNVEPIDEGNYK 232

::|| Ihl Mlh I III MM h I II hh I I I 

Db 149 vmecqpprghpeptiswkkdgspldd kderiti-rggklmitytrksdagkyv 200

Qy 233 CIAQNLVGTRESSYAKLIVQVKPYFMKEPKDQVMLYGQTATFHCSVGGDPPPKVLWKKEE 292

h I : I f III hi I :| hi I : : :| I I III I I hh: 
Db 201 cvgtnmvgeresevaeltvlerpsfvkrpsnlavtvddsaefkceargdpvptvrwrkdd 260

Qy 293 GNIPVSRARILHDERSLEISNITPTDEGTYVCEAHNNVGQISARASLIVHA---PPNFTR 349

I :| II I h :hl :| I hi I I I lh I hi I Ihl 
Db 261 gelpksryei-rddhtlkirkvtagdmgsytcvaenmvgkaeasatltvqvgsepphfvv 319

Qy 350 RPSNKKVGLNGVVQLPCMASGNPPPSVFWTKEGVSTLMF---PNSSHGRQYVAADGTLQI 406

:| :: I I I I hill hHI =11 hi I I I h I I I 
Db 320 kprdqvvalgrtvtfqceatgnpqpaifwrregsqnllfsyqppqsssrfsvsqtgdlti 379

Qy 407 TDVRQEDEGYYVCSAFSWDSSTVRVFLQVSSV-DERPPPIIQIGPANQTLPKGSVATLP 465

hh: | |||:| : I : :|:h I : 1 1 1 1 : 1 : II III: I 
Db 380 tnvqrsdvgyyicqtlnvagsiitkaylevtdviadrpppvirqgpvnqtvavdgtfvls 439

Qy 466 CRATGNPSPRIKWFHDGHAVQA-GNRYSIIQGSSLRVDDLQLSDSGTYTCTASGERGETS 524

I llhl Mllll :| :: h: :| hi III II II : 

Db 440 cvatgspvptilwrkdgvlvstqdsrikqlengvlqiryaklgdtgrytciastpsgeat 499

Qy 525 WAATLTVERPG-STSLHRAADPSTYPAPPGTPKVLNVSRTSISLRWAKSQEKPGAVGPII 583

hi : h: i I lh h I hi :lll :::| I I : 
Db 500 wsayievqefgvpvqpprptdpnlipsapskpevtdvsrntvtlsw---qpnlnsgatpt 556

Qy 584 GYTVEYFSPDLQTGWIVAAHRVGDTQVTISGLTPGTSYVFLVRAENTQGISVPSGLSNVI 643

I :| II : I II I II I hlllll I III II :h : 
Db 557 syiieafshasgsswqtvaenvktetsaikglkpnaiylflvraanaygisdpsqisdpv 616

Qy 644 RTIEADFDAASANDLSAARTLLTGKSVELIDASAINASAVRLEWMLHVSADEKYVEGLRI 703

II : : ; I : I : I : : :::|:: : I I :|::| :| 

Db 617 ktqdv-lptsqgvdhkqvqrelgnavlhlhnptvlssssievhwt--vdqqsqyiqgyki 673

Qy 704 HYRDASVPSAQYHS IT VMDASAESFWGNLKKYTKYEFFLT PFFET IEGQPSNS 757

hill I : I I: :hl II III :l I 

Db 674 lyr----psganhgesdwlvfevrtpaknsvvipdlrkgvnyeikarpffnefqgadsei 729

Qy 758 KTALTYED 765

III: 
Db 730 kfaktlee 737



RESULT 10 
W42087 

ID W42087 standard; Protein; 1571 AA.
XX 

AC W42087; 
XX 

DT 28-SEP-1998 (first entry) 
XX 

DE Human Down syndrome-cell adhesion molecule DS-CAM2. 
XX 

KW DS-CAM2; Down syndrome-cell adhesion molecule; neural cell;
KW signal transduction; trisomy 21; mental retardation;
KW holoprosencephaly; corpus callosum agenesis;
KW schizencephaly; diagnosis; assay; human.
XX 

OS Homo sapiens.
XX
PN WO9817795-A1.
XX
PD 30-APR-1998.
XX
PF 23-OCT-1997; 97WO-US19547.
XX 

PR 25-OCT-1996; 96US-0029322. 
XX 

PA (CEDA-) CEDARS SINAI MEDICAL CENT. 
XX
PI Korenberg JR;
XX
DR WPI; 1998-271791/24.
DR N-PSDB; V31988.
XX
PT New isolated Down's Syndrome-cell adhesion molecule - used to
PT develop products for detection, diagnosis and therapy of
PT developmental and neurological abnormalities
XX
PS Claim 2; Page 90-95; 109pp; English.
XX
CC This polypeptide comprises Down syndrome-cell adhesion molecule
CC DS-CAM2, an extracellular soluble protein belonging to a novel
CC subclass of the Ig superfamily with highest homology to neural cell
CC adhesion molecules. Its amino acid sequence was deduced from cDNA
CC clones (see V31982) isolated from a trisomy 21 foetal brain library.
CC It is a splice variant of membrane-bound DS-CAM1 (see W42086), and
CC lacks the entire transmembrane domain of DS-CAM1. The invention
CC provides human and murine DS-CAM nucleic acid sequences (see also
CC V31981, V31985-87), expression vectors and host cells, transgenic
CC animals, antibodies, antisense oligonucleotides, and primers
CC derived from DS-CAM nucleic acids. DS-CAM polypeptides are associated
CC with developmental and neurological processes. They can be used in
CC e.g. neural prosthetic devices used in entubulation methods of
CC repairing (regenerating) damaged or severed peripheral nerves, and
CC also in bioassays to identify agonists and antagonists. The products
CC can also be used in detection, diagnosis and therapy of developmental
CC and neurological abnormalities such as Down syndrome, mental
CC retardation, holoprosencephaly, agenesis of the corpus callosum,
CC or schizencephaly.
XX
SQ Sequence 1571 AA;



Query Match 9.0%; Score 665.5; DB 19; Length 1571; 

Best Local Similarity 23.8%; Pred. No. 5.8e-30; 

Matches 322; Conservative 184; Mismatches 492; Indels 353; Gaps 65; 

Qy 55 SPRIIEHPTDLWKKNEPATLNCKVEGKPEPTIEWFKDGEPVSTNEKKSHRVQ---FKDG 111

Db 406 tpkiisafsekvvspaepvslmcnvkgtplptitwtldddpil--kggshrlsqmitseg 463

Qy 112 ALFFYRTMQGKKEQDGGEYWCVAKNRVGQAVSRHASLQIAVLRDDFRVEP-KDTRVAKGE 170
: I : : :IM I I I I I I I : : I : I |: | 

Db 464 nvlsylnisssqvrdggvyrctannsag-wlyqarinv---rgpasirpmknitaiagr 519

Qy 171 TALLECGPPKGIPEPTLIWIKDG-VPLDDLKAMSFGASSRVRIVDGGNLLISNVE-PID 227
: I I I :: I I: :| : :| : I I :|:|: :| 

Db 520 dtyihc-rvigypyysikwyknsnllpfn hrqvafenngtlklsdvqkevd 569 

Qy 228 EGNYKC - • I AQNLVGTRESSYAKLIVQVKPYF - -MKEPKDQVMLYGQTATFHC • SVGGDP 282 

II I I : I : I :| : : hi h : h : II I I II 
Db 570 egeytcnvlvqpqlstsqsvh--vtvkvppfiqpfefprfsi---gqrvfipcvwsgdl 624 

Qy 283 PPKVLWKK EEGNI PVSRARILHD — EKSLE ISNITPTDEGTYVCEAHNNVGQ I SARASL 339 

I : hi: III:: II III:: I I I I I : :: I 

Db 625 pititwqkdgrpipgslgvtidnidftsslrisnlslmhngnytciarneaaavehqsql 684 

Qy 340 I VHAPPNFTKRPSNKKVGLNG • WQLPCMASGNPPPSVFWT -KEGVSTLMF - PNSSHGRQ 396 

II IN :| :: I: I I I I I I I I:: I ;| | | : :|| 

Db 685 ivrvppkfvvqprdqd-giygkavilncsaegypvptivwkfskgagvpqfqpialngri 743 

Qy 397 YVAADGTLQITDVRQEDEGYYVCSAFSWDSSTVR-VFLQVSSVDERPPPIIQIGPANQT 455 

I ::hl I I :|l Ml: : I : : ::| I : | :| | 
Db 744 qvlsngsllikhweedsgyylckvsndvgadvsksmyltv kipamitsypnttl 798 

Qy 456 LPKGSVATLPCRATGNPSPRIKWFBDGHAVQAG-NRYSIIQG SSLRVDDLQLS 507 

:| : I I I ::| : : || : |:|:: 
Db 799 atqgqkkemsctahgekpiivrwekedriinpemarylvstkevgeevistlqilptvre 858 

Qy 508 DSGTYTCTASGERGETSWAATLTVEKPGSTSLHRAADPSTYPAPPGTPKVLNVSRTSISL 567 

III -I I II lll::l I II :: :| :|:| 



Db 859 dsgffschainsygedrgiiqltvqep--------------pdppei-eikdvkartitl 903

Qy 568 RWAKSQEKPGAVGPIIGYTVEYFSPDLQTGWIVAAHRVGD TQVTISGLTPGTSYV 622

II : It II :| : I :M I II : I ::l 

Db 904 rwtmgfd---gnspitgydie--cknksdsw-dsaqrtkdvspqlnsatiidihpsstys 957 

Qy 623 FLVRAENTQGISVPSGLSNVIRTIEADFDAASANDLSAARTLLTGKSVELIDASAINASA 682 

: hi I 11 II : II II :M II: !:: : 

Db 958 irmyaknrigksep---snel-titad-eaap dgppqe-vhlepissqs 1000 

Qy 683 VRLEWMLHVSADEKY VEGLRIHYKDASVPSAQYHS ITVMDASAES - -FWGNLKK 735 

:|: I l-:h : I :| |:: I :| :| I :| : : II I 

Db 1001 irvtw— -kapkkhlqngiirgyqigyreystggnfqfniisvdtsgdsevytldnlnk 1056 

Qy 736 YTKYEFFLTPFFETIEGQPSNSKTALTYEDVPSAPPDNIQIGMYNQTAGWVRWTPPPSQH 795 

:hl : Ml Mil ||:|:| : : : |; ; 

Db 1057 ftqyglwqacnragtgpssqeiitttledvpsyppenvqaiatspesisiswstlskea 1116 

Qy 796 HNGNLYGYKIEVSAGNTMKVLANMTLNATTT - - SVLLNNLTTGAVYSVRLNSFT KAGDG P 853 

II I !::: •: II I III |: |: I ||::: :||:|||| 

Db 1117 lngilqgfrv-iywanlmdgelgeiknitttqpsleldglekytnysiqvlaftragdgv 1175 

Qy 854 YSKPISLFMDETHHVHPPRAHPSGTHDGRHEGQDLTYHNNGNIPPGDINPTTHKKTTDYL 913 

I: I I I I 1:1 
Db 1176 rseqi--ftrtkedvpgp---pag 1194 

Qy 914 SGPWLMVLVCIVLLVLVISAAISMVYFR RKHQMTKELGHLSWSDNE-- 960 

■ I :h III: ||: : : :|:|: I 

Db 1195 vkaaaasasmvfvswlpplklngiirkytvfcshpyptvisefeas 1240 

Qy 961 ITALNINSKESLWIDHHRGWRTADTDKDSGLSESKLLSHVNSSQSNYNNSDGG 1013 

I hi: 1:1: I I I |:|: 
Db 1241 pdsfsyripnlsrnrqysvwv vavtsagrg nsseii 1276 

Qy 1014 T--DYAEVDTRNLT TFYNCRKSPDNPTPYATTM--IIGTSSSETCTKT 1057 

I h : l II I I: :|:| I II II 

Db 1277 tveplakapariltfsgtvttpwmkdivlpc-kavgdpspavkwmkdsngtpslvtidgr 1335 



Qy 1058 TSISAD-KDSGTHSPYSDAFAGQVPAVPWKSNYLQYPVEPINWSE 1102

 II :: :MI :| • : :| II

Db 1336 ciann nwgsdeiil 1372

Qy 1103 --FLPPPPEHP PPSSTYGYAQGSPESSRRSSKSAGSGISTNQSILNASIHSS 1152 

: 11:1 11:1 M II I :| 

Db 1373 nlqvqvppdqprltvskttsssitlswlpgd nggssirgyilqysedns 1421 

Qy 1153 SSGGFSAWGVSPQYAVACPPENVYSNPLSAVAGGTQNRYQITPTNQHPP 1201 

II 'I III I : II :::| I I 
Db 1422 eq wgs f p — ispsersyr - - lenlkcgtwykf tltaqngvgpgr iseiieakt 1470 

! 

Qy 1202 QLPAYFATT GPGG — AVPPNHLPFAT QR-— 1227 

III II : : III II 

Db 1471 lgkepqfskeqelfasinttrvrlnligwndggcpitsftleyrpfgttvwttaqrtsls 1530 

Qy 1228 HAASEYQAGL- - -NAARCAQSRA 1247 

I h I: : 1:1 II: :| 
Db 1531 ksyilydlheitwyelqmrvcnsagcaekqa 1561 



RESULT 11 
W42086 

ID W42086 standard; Protein; 1910 AA.
XX
AC W42086;
XX
DT 28-SEP-1998 (first entry)
XX
DE Human Down syndrome-cell adhesion molecule DS-CAM1.
XX
KW DS-CAM1; Down syndrome-cell adhesion molecule; neural cell;
KW signal transduction; trisomy 21; mental retardation;
KW holoprosencephaly; corpus callosum agenesis;
KW schizencephaly; diagnosis; assay; human.
XX

OS Homo sapiens. 

XX 

FH Key             Location/Qualifiers
FT Peptide         1..23
FT                 /label= Sig_peptide
FT Protein         24..1910
FT                 /label= Mat_protein
FT Domain          24..887
FT                 /label= IG
FT                 /note= "immunoglobulin type-C2 domain"
FT Domain          888..1594
FT                 /label= FbN
FT                 /note= "fibronectin type III domain"
FT Domain          1595..1616
FT                 /label= Transmembrane
FT Domain          1617..1910
FT                 /label= Cytoplasmic
FT Region          24..125
FT                 /label= Ig1
FT Region          127..225
FT                 /label= Ig2
FT Region          226..316
FT                 /label= Ig3
FT Region          317..409
FT                 /label= Ig4
FT Region          410..506
FT                 /label= Ig5
FT Region          507..603
FT                 /label= Ig6
FT Region          6$..697
FT                 /label= Ig7
FT Region          698..792
FT                 /label= Ig8
FT Region          793..887
FT                 /label= Ig9
FT Disulfide-bond  46..102
FT Disulfide-bond  145..197
FT Disulfide-bond  246..293
FT Disulfide-bond  335..385
FT Disulfide-bond  428..484
FT Disulfide-bond  525..575
FT Disulfide-bond  617..669
FT Disulfide-bond  711..766
FT Disulfide-bond  809..865
FT Disulfide-bond  1307..1359
FT Modified-site   78..80
FT                 /note= "Asn is N-glycosylated"
FT Modified-site   106..108
FT                 /note= "Asn is N-glycosylated"
FT Modified-site   470..472
FT                 /note= "Asn is N-glycosylated"
FT Modified-site   487..489
FT                 /note= "Asn is N-glycosylated"
FT Modified-site   658..660
FT                 /note= "Asn is N-glycosylated"
FT Modified-site   666..668
FT                 /note= "Asn is N-glycosylated"
FT Modified-site   710..712
FT                 /note= "Asn is N-glycosylated"
FT Modified-site   748..750
FT                 /note= "Asn is N-glycosylated"
FT Modified-site   795..797
FT                 /note= "Asn is N-glycosylated"
FT Modified-site   924..926
FT                 /note= "Asn is N-glycosylated"
FT Modified-site   1142..1144
FT                 /note= "Asn is N-glycosylated"
FT Modified-site   1160..1162
FT                 /note= "Asn is N-glycosylated"
FT Modified-site   1250..1252
FT                 /note= "Asn is N-glycosylated"
FT Modified-site   1271..1273
FT                 /note= "Asn is N-glycosylated"
FT Modified-site   1324..1326
FT                 /note= "Asn is N-glycosylated"
FT Modified-site   1341..1343
FT                 /note= "Asn is N-glycosylated"
FT Modified-site   1488..1490
FT                 /note= "Asn is N-glycosylated"

XX
PN WO9817795-A1.
XX
PD 30-APR-1998.
XX
PF 23-OCT-1997; 97WO-US19547.
XX
PR 25-OCT-1996; 96US-0029322.
XX
PA (CEDA-) CEDARS SINAI MEDICAL CENT.
XX
PI Korenberg JR;
XX
DR WPI; 1998-271791/24.
DR N-PSDB; V31981.
XX
PT New isolated Down's Syndrome-cell adhesion molecule - used to
PT develop products for detection, diagnosis and therapy of
PT developmental and neurological abnormalities
XX
PS Claim 2; Page 73-78; 109pp; English.
XX
CC This polypeptide comprises Down syndrome-cell adhesion molecule
CC DS-CAM1, a cell surface glycoprotein belonging to a novel subclass
CC of the Ig superfamily with highest homology to neural cell adhesion
CC molecules. Its amino acid sequence was deduced from cDNA clones
CC (see V31981) isolated from a trisomy 21 foetal brain library. A
CC splice variant, DS-CAM2 (see W42087), which is non-membrane bound
CC was also identified. The invention also provides human and murine
CC DS-CAM nucleic acid sequences (see also V31985-88), expression
CC vectors and host cells, transgenic animals, antibodies, antisense
CC oligonucleotides, and primers derived from DS-CAM nucleic acid.
CC DS-CAM polypeptides are associated with developmental and
CC neurological processes. They can be used in e.g. neural prosthetic
CC devices used in entubulation methods of repairing (regenerating)
CC damaged or severed peripheral nerves, and also in bioassays to
CC identify agonists and antagonists. The products can also be
CC used in detection, diagnosis and therapy of developmental and
CC neurological abnormalities such as Down syndrome, mental
CC retardation, holoprosencephaly, agenesis of the corpus callosum,
CC or schizencephaly.
XX
SQ Sequence 1910 AA;



Query Match 8.9%; Score 661; DB 19; Length 1910; 

Best Local Similarity 24.24; Pred. No. 1.4e-29; 

Matches 305; Conservative 175; Mismatches 473; Indels 308; Gaps 

Qy 55 SPRIIEHPTDLWKKNEPATLNCKVEGKPEPTIEWFKDGEPVSTNEKKSHRVQ - - - FKDG 111 

:|:H II II :| I hi I III I I :|: : III: :l 
Db 406 tpkiisafsekvvspaepvslmcnvkgtplptitwtldddpil--kggshrisqmitseg 463 

Qy 112 ALFFYRTMQGRKEQDGGEYWCVAKNRVGQAVSRHASLQIAVLRDDFRVEP-KDTRVAKGE 170 

: I : ■: :|ll II I I I I I : : I : I h I 
Db 464 nwsylnisssqvrdggvyrctannsag-wlyqarinv---rgpasirpmknitaiagr 519 

Qy 171 TALLECGPPKGIPEPTLIWIKDG-VPLDDLKAMSFGASSRVRIVDGGNLLISNVE-PID 227 

: I I I :: I I: :| : :| : I I :|:|: :| 

Db 520 dtyihc-rvigypyysikwyknsnllpfn hrqvafenngtlklsdvqkevd 569 

Qy 228 EGNYKC- • IAQNLVGTRESSYAKLIVQVKPYF - -MKEPKDQVMLYGQTATFHC - SVGGDP 282 

I M : I : I :| : : !:| I: : |: : II II II 
Db 570 egeytcnvlvqpqlstsqsvh- -vtvkvppf iqpf ef prf si- - -gqrvf ipcvwsgdl 624 

Qy 283 PPKVLWKKEEC-NIPVSRARILHD— EKSLEISNITPTDEGTrVCEAHNNVGQISARASL 339 






Db 



I : hi: III:: II III:: II II I : :: I 

625 pititwqkdgrpipgslgvtidnidftsslrisnlslrahngnytclarneaaavehqsql 684 

340 IVHAPPNFTRRPSNRRVGLNG - WQLPCMASGNPPPSVFWT -REGVSTLMF - PNSSHGRQ 396 

II III :| :: hi I I I I I I I:: L :| I I : :ll 

685 ivrvppkfvvqprdqd-giygkavilncsaegypvptiwkfskgagvpqfqpialngri 743 



Qy 397 YVAADGTLQITDVRQEDEGYYVCSAFSWDSSTVR-VFLQVSSVDERPPPIIQIGPANQT 455 

I -hi I I Ml l!|:| : I : : ::| I : I :| I 

Db 744 qvlsngsllikhweedsgyylckvsndvgadvsksmyltv kipamitsypnttl 798 

Qy 456 LPKGSVAT LPCRATGNPS PRIKWFHDGHAVQAG - NRYSI IQG SSLRVDDLQLS 507 

:| : I I I ::| : : II : |:|:: 
Db 799 atqgqkkemsctahgekpiivrwekedriinpemarylvstkevgeevistlqilptvre 858 

Qy 508 DSGTYTCTASGERGETSWAATLTVEKPGSTSLHRAADPSTYPAPPGTPRVLNVSRTSISL 567 
III ::| I II IN::! I II :: :| :|:| 

•859 dsgffschainsygedrgiiqltvqep ---pdppei-eikdvkartitl 903 
568 RWAKSQERPGAVGPIIGYTVEYFSPDLQTGWIVAAHRVGD TQVTISGLTPGTSYV 622 

II : II II :| : I :| I I II : I ::| 

Db 904 rwtmgfd---gnspitgydie--cknksdsw-dsaqrtkdvspqlnsatiidihpsstys 957 

Qy 623 FLVRAENTQGISVPSGLSNVIRTIEADFDAASANDLSAMLLTGRSVELIDASAINASA 682 

: 1:1 I I I II : II II HI II: I:: : 

Db 958 irmyaknrigksep- - -snel - titad-eaap dgppqe-vhlepissqs 1000 

Qy 683 VRLEWMLHVSADEKY VEGLRIHYKDASVPSAQYHS ITVMDASAES - - FWGNLRK 735 

, :|: I I :|: : I :| h: I :| :| I :| : : II I 
Db 1001 lrvtw----kapkkhlqngiirgyqigyreystggnfqfniisvdtsgdsevytldnlnk 1056 

Qy 736 YTKYEFFLTPFFETIEGQPSNSRTALTYEDVPSAPPDNIQIGMYNQTAGWVRWTPPPSQH 795 

:|:| : III Mill 1 1 : 1 : ! : : : |: : 
Db 1057 ftqyglvvqacnragtgpssqeiitttledvpsyppenvqaiatspesisiswstlskea 1116 

Qy 796 HNGNLYGYKIEVSAGNTMKVLANMTLNATTT--SVLLNNLTTGAVYSVRLNSFTKAGDGP 853 

II I I::: : I I I III I: I: I II::: :ll:llll 

Db 1117 lngilqgf rv - iywanlmdgelgeiknitttqpsleldglekytnysiqvlaf tragdgv 1175 

Qy 854 YSKPISLFMDPTHHVHPPRAHPSGTHDGRHEGQDLTYHNNGNIPPGDINPTTHRKTTDYL 913 

I: I I I I 1:1 
Db 1176 rseqi - - f trtkedvpgp- - -pag 1194 

Qy 914 SGPWLMVLVCIVLLVLVISAAISMVYFK RRHQMTRELGHLSWSDNE- - 960 

I :|: III: II: : : :|:|: I 

1195 vkaaaasasmvfvswlpplklngiirkytvfcshpyptvisefeas 1240 

™ 961 ITALNINSRESLWIDHHRGWRTADTDRDSGLSESKLLSHVNSSQSNYNNSDGG 1013 

I I: I : hi: I I I |:|: 

Db 1241 pdsfsyripnlsmrqysvwv vavtsagrg nsseii 1276 

Qy 1014 T--DYAEVDTRNLT TFYNCRRSPDNPTPYATTM--IIGTSSSETCTRT 1057 

I I: I II II: :hl I II I I 

Db 1277 tveplakapariltfsgtvttpwmkdivlpc-kavgdpspavkwinkdsngtpslvtidgr 1335 

Qy 1058 TS1SAD KDSGTHSPYSDAFAGQVPAVPWRSNYLQYPVEPINWSE 1102 

II :: :IH :| : :| II 

Db 1336 rsifsngsf iirtvkaedsgyys ciann nwgsdeiil 1372 

Qy 1103 --FLPPPPEHP PPSSTYGYAQGSPESSRRSSRSAGSGISTNQSILNASIHSS 1152 

: II: I I I : I Mill: 

Db 1373 nlqvqvppdqprltvskttsssitlswlpgd nggssirgyilqysedns 1421 

» 

Qy 1153 SSGGFSAWGVSPQYAVACPPENVYSNPLSAVAGGTQNRYQITPTNQHPPQLPAYFATTGP 1212 

II f I I I I : II :: :| I II 

Db 1422 eq wgsfp — ispsersyr--lenlkcgtwykftltaqn gvgp 1459 

Qy 1213 G 1213 
I 

Db 1460 g 1460 



RESULT 12
R13144
ID   R13144 standard; Protein; 1728 AA.
XX
AC   R13144;
XX
DT   04-OCT-1991 (first entry)
XX
DE   Deleted in Colorectal Carcinomas.
XX
KW   DCC gene; cancer; diagnosis; antibodies; tumorigenesis; neoplasm.
XX
OS   Homo sapiens.
XX
FH   Key             Location/Qualifiers
FT   Peptide         202..1648
FT                   /label= DCC
FT                   202..227
FT                   /label= sig_peptide
FT                   228..1648
FT                   /label= mat_protein
XX
PN   WO9109964-A.
XX
PD   11-JUL-1991.
XX
PF   19-DEC-1990; 90WO-US07314.
XX
PR   04-JAN-1990; 90US-0460981.
XX
PA   (UYJO ) JOHNS HOPKINS UNIV.
XX
PI   Vogelstein B;
XX
DR   WPI; 1991-222913/30.
DR   N-PSDB; Q12752.
XX
PT   Human DCC gene - deleted in colorectal carcinoma - and diagnosis
PT   or prognosis of neoplasms by detecting loss of gene function or
PT   expression prods., or mutation(s)
XX
PS   Claim 44; Page 31; 51pp; English.
XX
CC   Cells transformed with the wild-type DCC gene can be used as model
CC   systems to study cancer remission and drug therapy. DCC polypeptide
CC   expression prods. may be used to reverse the neoplastic state.
CC   X1615 represents an amino acid illegible in the specification; all
CC   other Xs are encoded by stop codons.
CC   See also Q12752-55.
XX
SQ   Sequence 1728 AA;

Query Match 8.6%; Score 639; DB 12; Length 1728;
Best Local Similarity 23.1%; Pred. No. 2.2e-28;
Matches 361; Conservative 170; Mismatches 565; Indels 464; Gaps 70;

Qy 57 RI IEHPTDLV VRKNEPATLNCRVEG ■ KPEPT IEWFKDG • • EPVSTNERKSHRVQFRDGAL 113 

I : hi I' : |:| I : I hi III : :|:| I :hl 
Db 242 rf lsepsdavtmrggnvlldcsaesdrgvpvikwkkdgihlalgmderkq- - -qlsngsl 298 

Qy 114 FFYRTMQGRREQ-DGGEYWCVAR-NRVGQAVSRHASLQIA-VLRDDFRVEPKDTRVAKGE 170 

: : •: I I I I I I : 1 1 I : : I 1 1 I : : h 
Db 299 liqnilhsrhhkpdeglyqceaslgdsgsiisrtakvavagplr--flsqtesvtafmgd 356 

Qy 171 TALLECGPPKGIPEPTLIWIKDGVPLDDLKAMSFGASSRVRIVDGGNLLISNVEPIDEGN 230 

I M: <| 111:11: I : III :: I I II ::l I I 

Db '357 tvllkc-evi'gepmptihwqknqqdltpip gdsrvwlpsgalqisrlqpgdigi 410 

Qy 231 YRCIAQNLVGTRESSYARLIVQVRP YFMREPKDQVMLYGQTATFHCSVGGDPPP 284 

|:| |:| ?| : |:: : I II:: I : I : I: I I I I III 
Db 411 yrcsarnpassrtgneaevrilsdpglhrqlyflqrpsnvvaiegkdavleccvsgyppp 470 

Qy 285 KVLWKREEGNIPV-SRARILHDERSLEISNITPTDEGTYVCEAHNNVGQISARASLIVHA 343 
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I : I I : I; :| |||:| MM MM 
471 sftwlrgeeviqlrskkysllggsnlUsnvtdddsgmytcwtyknenisasaeltvlv 530 

344 PPNFTKRPSNKKVGLNGWQLPCMASGNPPPSVFWTKEGVSTLMFPNSSHGRQYVAADGT 403 

II I III ::: I II I 1:1 I I I :M: : 
531 ppwflnhpsnlyayesmdiefectvsgkpvptvnwmkng--dvvipsdyf---qivggsn 585 

404 LQITDVRQEDEGYYVCSAFSWDSSTVRVFLQVSSVDERPPPIIQIGPMQTLPKGSVAT 463 

1:1 I : llhl I I : :: I I M MM 

586 lrilgvvksdegfyqcvaeneagnaqtsaqliv pkpaipsssvlpsaprdwpv 639 

464 L PCRATGNPSPRIKWFHDGHAVQAGNRYS HQ — GS - SLRVDDLQL 506 

I I I II I: I : : II ': II I I :|: 
640 lvssrfvrlswrppaeakgn- -- -iqtftvf fsregdnreralnttqpgslqltvgnlkp 695 

507 SDSGT YTCTASGERGETSWAATLTVEKPGSTS - • LHRAADPS • TYPAPPGTPKVLNVSRT 563 

1:111 II :| : I I II : M I 

696 eamytfrwaynewg pgessqpikvatqpelqvpgpvenlqavstspt 743 

564 SISLRWAKSQEKPG-AVGPIIGYTVEYFSPDLQTG 597 

11:1 I I I II: II : I :: II 
744 silitw— -eppayangpvqgyrl ■ -f ctevstgkeqnievdglsykleglkkf teysl 797 

598 WIVAAHRVGD TQVTIS 613 

Ml | ||:| 

Db 798 rflaynrygpgvstdditwtlsdvpsappqnvslevvnsrsikvswlpppsgtqngflt 857 

Qy 614 GLTPGTSYVFLVRAENTQGISVPSGLSNVIKT 645 

II I: I I I I I II :| 
Db 858 gykirhrkttrrgemetlepnnlwylftglekgsqysfqvsamtvngtgppsnwyta-et 916 

Qy 646 IEADFDAASANDLSAA RTLLTGKSV ELID 674 

I I I : I :: I : M :| 

Db 917 pendldesqvpdqpsslhvrpqtnciimswtpplnpniwrgyiigygvgspyaetvrvd 976 

Qy 675 ASAINASAVRLEWMLHV SADEKYV 698 

: I III I II : : 

Db 977 skqryyslerlessshyvislkafnnagegvplyesattrsitdptdpvdyypllddfpt 1036 

Qy 699 EGLRIHYKDASVP SAQY 715 

::|::MII M 
Db 1037 wvpdlstpralppvgvqavalthdavrvswadnsvpknqktsevrlytvrwrtsfsasaky 1096 

' Qy 716 HSITVMDASAESFWGNLKKYTKYEFFLTPFFETIEGQPSNSKIALTYEDVP-SAPPDNI 774 

I I - I: II I III = III III I III I 

Db 1097 ks— edttslsytatglkpntrayefsvmvtkDrrsstwsrasahattyeaaptsapkdft 1153 

Qy 775 QIGMYNQ-TAGWVRWTPPPSQHHNGNLYGYKIEVSAGNTMK VLANMTLNATTTSVL 829 

I : I I I II II : I : : : :: :: : I :: 

1. 1154 vitregkpravivswqpp-leangkitayilfytldknipiddwimetisgdrlthqim 1211 
830 LNNLTTGAVYSVRLNSFTKAGDGPYSKPISLFMDPTHHV-HPPR-AHPSGTH-DGRHEGQ 886 

II I :| I: : I II I II II I M h I I II 

Db 1212 <31nldt--myyfriqamskgvgplsdpi-lfr--tlkvehpdkmandqgrhgdggywpv 1266 

Qy 887 D—LTYHNNGNIPP-GDINPTTHKKTTDYLSGPWLMVLV — CIVLLVLVISAAI — 935 

II : III I ::| I I : |:::| I MM I I 

Db 1267 dtnlidrstlneppigqmhp-phgsvtpqknsnllviiwtvgvltvlvwivavictrr 1325 

Qy 936 SMVYFKRK— HQMTKELGHLSWSDNEITALNINSKESLWIDHHRGWRTADTDKDSG- 990 

Mill I :: III || : :| I 

Db 1326 ssaqqrkkrathsagkrkg sqkdl rppdlwi-hheememkniekpsgtd 1373 

Qy 991 LSESKLLSHVNSSQSNYNNSDGGTDYAEVDTRNLTTFYNCRKSPDNPTPYATT 1043 

: MM: III | :: 
Db 1374 pagrdspiqscqdltpvshsqsetqlgskstshsgqdtee 1413 

Qy 1044 MI IGTSSSETCTKTTSISADK DSGTHSPYSDAFAGQVPAVPWKSNYLQY 1093 

: I I h:l : I: :::| I |:|| II 

Db 1^414 agssmstlerslaarraprrklmipmdaqsnnp awsaipvptlesaqy 1462 



1094 PVEPINWSEFLPPPP-EHPPPSSTY- • 

I IIIMII 



--GYAQGSPESSRKSSKSAGSGIST 1140 
I: I hi I :l 



Db 1463 t 



■-gilpsptcgyphpqftlrpvpfptlsvdrgfgag rsqsvsegptt 1508 



Qy 1141 NQSIL-- -NASIHSSSSGGFSAWGVSPQYAVAC — PPENVYSNPLSAVAGGTQNRYQI 1193 

I : : llll I II I MM 

Db 1509 qqppralppsqpehsssee apsrtiptacvrpthplrsfanpllp 1552 

Qy 1194 TPTNQHPPQLPAYFATTGPGGAVPPNHLPFATQRHAASEYQAGLNAARCAQSRACNSCDA 1253 

I : I:: 1 . : II :| |: III hi 

Db 1553 ppmsalepkvpytpllsqpgptlpkthvk taslglagkars 1593 

Qy 1254 LATPSPMQPPPPVPVPEGWYQPVHPNSHPMHPTSSNHQIYQCSSECSDHSRSSQSHKRQL 1313 

I II II I II II : M : |: I : M 

Db 1594 plipvsvpta"pevseesh--kptxdsanvye-qddlseqmasleglmkql 1640 



RESULT 13
R68553
ID   R68553 standard; Protein; 1447 AA.
XX
AC   R68553;
XX
DT   05-JUL-1995 (first entry)
XX
DE   Deleted in colorectal carcinoma (DCC).
XX
KW   Tumour suppressor; deleted in colorectal carcinoma; antibody;
KW   cancer diagnosis.
XX
FH   Key             Location/Qualifiers
FT   Misc-difference 1..1063
FT                   /note= "DCC epitope"
FT   Misc-difference 3369..4341
FT                   /note= "DCC epitope"
FT   Misc-difference 26..1126
FT                   /note= "DCC epitope on extracellular domain"
FT   Misc-difference 1123..1447
FT                   /note= "DCC epitope on intracellular domain"
XX
PF   18-MAY-1994; 94WO-US05277.
XX
PR   26-MAY-1993; 93US-0068950.
XX
PA   (UYJO ) UNIV JOHNS HOPKINS.
XX
PI   Bruskin A, Jarosz DE, Johnson K, Kinzler KW, Vogelstein B;
PI   Zabrecky JR;
XX
DR   WPI; 1995-022830/03.
DR   N-PSDB; Q80196.
XX
PT   Antibodies specific for tumour suppressor gene product, DCC -
PT   useful for detecting expression of DCC gene, for cancer diagnosis.
XX
PS   Claim 4; Page 24-28; 39pp; English.
XX
CC   The protein represents the DCC tumor suppressor, and epitopes are
CC   identified which are used in the generation of polyclonal or
CC   monoclonal antibodies against DCC. The antibodies can detect
CC   DCC protein in biological samples (including tumour tissue,
CC   peripheral blood mononuclear cells or a tumour biopsy lysate)
CC   despite low levels of DCC expression, and are therefore useful in
CC   cancer diagnosis, especially for colorectal carcinoma.
XX
SQ   Sequence 1447 AA;

Query Match 8.6%; Score 637.5; DB 16; Length 1447;
Best Local Similarity 22.9%; Pred. No. 2.1e-28;
Matches 271; Conservative 150; Mismatches 444; Indels 317; Gaps 43;

Qy 57 RII^PTDLWRKHEPATLNCKVEG-KPEPTIEWFKDG-EPVSTraSHRVQFKDGAL 113 

I '< hi I : 1:1 I : I hi III : :|:| I :|:| 
Db 41 rf lsfepsdavtmrggnvlldcsaesdrgvpvikwkkdgihlalgmderkq- - -qlsngsl 97 

Qy 114 FFYRTMQGKKEQ-DGGEYWCVAK-NRVGQAVSRHASLQIA-VLRDDFRVEPKDTRVAKGE 170 

: : : I I I I I I : 1 1 I : : I 1 1 I : : |: 
Db 98 1 iqnilhsrhhkpdeglyqceas lgdsgs iisrtakvavagplr - - f lsqtesvtaf mgd 155 

Qy 171 TALLECGPPKGIPEPTLIWIKDGVPLDDLKAMSFGASSRVRIVDGGNLLISNVEPIDEGN 230 

I M: I Ml: I I: I : III :: II II ::| I I 

Db 156 tvllkc-evigepmptihwqknqqdltpip gdsrmlpsgalqisrlqpgdigi 209 



Qy 



t 



Db 



231 YKCIAQNLVGTRESSYAKLIVQVKP YFMKEPKDQVMLYGQTATFHCSVGGDPPP 284 

; 1:1 1:1 :l : h: : I IM I : I : 1= I I I I III 
210 yrcsarnpassrtgneaevrilsdpglhrqlyflqrpsnvvaiegkdavleccvsgyppp 269 

285 KVLWKKEEGNIPV-SRARILHDEKSLEISNITPTDEGTYVCEAHNNVGQISARASLIVHA 343 

I : I I : I: I :l MM I I I I III M 

270 sftwlrgeeviqlrskkysllggsnllisnvtdddsgmytcwtyknenisasaeltvlv 329 

344 PPNFTKRPSNKKVGLNGWQLPCMASGNPPPSVFWTKEGVSTLMFPNSSHGRQYVAADGT 403 

II I III : I : 
330 ppwflnhpsn lyayesi 346 

404 LQITDVRQEDEGYYVCSAFSWDSSTVRVFLQVSSVDERPPPIIQIGPANQTLPKGSVAT 463 

I: I 

347 — -diefe 351 



Qy 464 LPCRATGNPSPRIKWFHDGHAVQAGNRYSIIQGSSLRVDDLQLSDSGTYTCTASGERGET 523 

, I :| I I : I :| I : : |: l|:||: : II I I I I II 
Db 352 "CtvsgkpvptvnwmkngdvvipsdyfqivggsnlrilgwksdegfyqcvaeDeagna 409 

Qy 524 SWAATLTVEKPGSTSLHRAADPSTYPAPPGTPKVLNVSRTSISLRWAKSQERPGAVGPII 583 

:| I I II I I: I : II : I I I I I I 

Db 410 qtsaqlivpkpaips ssvlpsaprdwpvlvssrfvrlswrppae---akgniq 460 

Qy 584 GYTVEYFSPD LQTGWI VAAHRVGDTQVT I SGLT PGTS YVFLVRAENTQG I SVP 636 

:M :H : II : I hi: II Mill I I 
Db 461 tftv-ffsregdnreralnt tqpgslqltvgnlkpeamytfrwaynewg—p 510 

Qy 637 SGLSNVIKTIEADFDAASANDLSAARTLLTGKSVELIDASAINASAVRLEWMLHVSADEK 696 
II :: I II : I :::::: I |: 



511 gessqpik-- 



■ -vatqpelqvpgpvenlqavstsptsilitweppayangp 557 



697 YVEGLRIHYKDASVPSAQYHSITVMDASAESFWGNLKKYTKYEFFLTPFFETIEGQPSN 756 

1:1 I: : I I :: I: : IIIMI : I :: 
558 -vqgyrlfctevstgkeqn ievdglsykleglkkfteyslrflaynrygpgvstd 611 

757 SKTALTYEDVPSAPPDNIQIGMYNQTAGWVRWTPPPSQHHNGNLYGYKI— EVSAGNTM 813 

I :l lllllll I: : M : II Ml II : Ml : : I 
612 ditvvtlsdvpsappqnvslewnsrsikvswlpppsgtqngfitgykirhrkttrrgem 671 

814 KVLANMTLNATTTSVLLNNLTTGAVYSVRLNSFTKAGDGPYSKPISLFMDPTHHVHPPRA 873 

: II I I I: II :::: I I II 
672 e tl eptinlwy 1 f tglekg sqysf qvsamtvngtgp 706 

874 HPSGTHDGRHEGQDLTYHNNGNIP--PGD--INPTTHKKTTDYLSGPWLMVLVCIVLLVL 929 

II : II : :| I : I |: II:: 
707 -psnwytaetpendl- - -desqvpdqpsslhvrpqtn dim— 743 



Qy .930 VISAAISMVYFKRKHQMTKELGHLSWSDSEITALNINS--KESLWIDHHRGWRTADTDR 987 

I I : ::| I : I |:: :| : : : : : 
Db 744 swtppl - npniwrgy i igygvgspyaetvrvdskqryys ierle 787 

Qy 988 DSGLSESKLLSHVNSSQSNYNNSDGGTD-YAEVDTRNLTTFYNCRKSPDNPTPYATTMII 1046 
I II I :lh I I l|::| I :| I 

Db 788 ss shyvislkafnnagegvplyesattrsit dptdpvdy 826 



Qy 1047 GTSSSETCTKTTSISADKDSGTHSPYSDAFAGQVPAV-- 

III IM 



--PV-VKSNYLQYPVEPIN 1099 
II h: I : :: 



Db 827 • 



■ -ypllddf ptsvpdlstpralppvgvqavalthdavrvs i 



Qy 1100 WSEFLPPPPEHPPPSSTYGYAQGSPESSRKSSKSAGSGISTNQSILNASIHSSSSGGFSA 1159 

I:: I :: I : |: II ::| ::| 

Db 864 wadnsvpknqktsevrlytvrwrtsfsasakyks edttslsyta 907 

Qy 1160 WGVSP — QYAVACPPENVYSNPLSAVAGGTQNRYQITPTN 1197 

I: I ::::| ill: M I I: 11 = 
Db 908 tglkpntmye.fsvmv-tkarrsstwsmtahat--tyeaapts 946 



RESULT 14
Y33498
ID   Y33498 standard; Protein; 1447 AA.
XX
AC   Y33498;
XX
DT   19-JAN-2000 (first entry)
XX
DE   Human DCC protein.
XX
KW   Proapoptotic; dependence domain; p75NTR; androgen receptor; DCC;
KW   huntingtin polypeptide; Machado-Joseph disease; SCA1; SCA2; SCA6;
KW   atrophin-1; cell death; apoptosis; Huntington's disease; head trauma;
KW   Alzheimer's disease; Kennedy's disease; spinocerebellar ataxia; stroke;
KW   dentatorubropallidoluysian atrophy; cell proliferation; cell survival;
KW   neoplastic; malignant; autoimmune; fibrotic.
XX
PN   WO9945944-A1.
XX
PD   16-SEP-1999.
XX
PF   11-MAR-1999; 99WO-US05250.
XX
PR   12-MAR-1998; 98US-0041886.
XX
PA   (BURN-) BURNHAM INST.
XX
PI   Bredesen DE, Rabizadeh S;
XX
DR   WPI; 1999-561617/47.
DR   N-PSDB; Z23431.
XX
PT   New proapoptotic dependence peptides, used to develop products for
PT   treating, e.g. Alzheimer's disease -
XX
PS   Disclosure; Page 164-168; 199pp; English.
XX
CC   This invention describes novel pure proapoptotic dependence peptides
CC   which comprise a sequence of an active dependence domain selected from
CC   dependence polypeptides consisting of p75NTR, androgen receptor, DCC,
CC   huntingtin polypeptide, Machado-Joseph disease gene product, SCA1, SCA2,
CC   SCA6 and atrophin-1 polypeptide. The proapoptotic peptides are capable
CC   of inducing cell death and can be used to develop products to mediate or
CC   inhibit apoptosis. The methods can be used for reducing the severity of
CC   a proapoptotic dependence domain mediated pathological condition, e.g.
CC   Huntington's disease, Alzheimer's disease, Kennedy's disease,
CC   spinocerebellar ataxias, dentatorubropallidoluysian atrophy,
CC   Machado-Joseph disease, stroke or head trauma. They can also be used for
CC   reducing the severity of a pathological condition mediated by upregulated
CC   cell proliferation or cell survival, e.g. neoplastic, malignant,
CC   autoimmune or fibrotic conditions. This sequence represents the human
CC   DCC (deleted in colonic cancer) polypeptide described in the method of
CC   the invention.
XX
SQ   Sequence 1447 AA;

Query Match 8.6%; Score 637.5; DB 20; Length 1447;
Best Local Similarity 22.9%; Pred. No. 2.1e-28;
Matches 271; Conservative 150; Mismatches 444; Indels 317;






Qy 57 RIIEHPTDLWKKNEPATLNCKVEG-KPEPTIEWFKDG--EPVSTNEKKSHRVQFKDGAL 113 

I : 1:1 I : |:| I : I hi III : :|:| I :|:| 

Db 41 rflsepsdavtmrggnvlldcsaesdrgvpvikvkkdgihlalgmderkq---qlsngsl 97 

Qy 114 FFYRTMQGKKEQ-XGEYWCVAK-NSVGQAVSRHASLQIA-VLRDDFRVEPKDTRVAKGE 170 

: : : I I I I I I : 1 1 I : : I 1 1 I : : h 

Db 98 liqnilhsrhhkpdeglyqceaslgdsgsiisrtakvavagplr--flsqtesvtafmgd 155 

Qy 171 TALLECGPPKGIPEPTLIWIKI1GVPLDDLKAMSFGASSRVRIVDGGNLLISNVEPIDEGN 230 

I 11:1 I I II: I I: I : III :: I I II ::| I I 

Db 156 tvllkc-evigepmptihwqknqqdltpip gdsrvwlpsgalqisrlqpgdigi 209 

Qy 231 YKC I AQNLVGTRESSY AKLIVQVKP YFMKEPKDQVMLYGQTATFHCSVGGDPPP 284 

hi hi :| : h: : I lh: I : I : h I I I I III 

Db 210 yrcsarnpassrtgneaevrilsdpglhrqlyflqrpsnwaiegkdavleccvsgyppp 269 

Qy 285 KVLWKKEEGNIPV-SRARILHDEKSLEISNITPTDEGTYVCEAHNNVGQISARASLIVHA 343 

I : I I : h I :| llhl I I I I III I I I 

Db 270 sftwlrgeeviqlrskkysllggsnllisnvtdddsgraytcwtyknenisasaeltvlv 329 

1 344 PPNFTKRPSNKKVGLNGWQLPCMASGNPPPSVFWTKEGVSTLMFPNSSHGRQYVAADGT 403 
\ II I III ! : I : 

338 ppwflnhpsn j-lyayesni 346 

Qy 404 LQITDVRQEDEGYYVCSAFSWDSSTVRVFLQVSSVDERPPPIIQIGPANQTLPKGSVAT 463 

I: I 

Db 347 — diefe 351 

Qy 464 LPCRATGNPSPRIKWFHDGHAVQAGMSIIQGSSLRVDDLQLSDSGTYTCTASGERGET 523 

I :l I I : I :| I : : h 1 1 : 1 1 : : II I I I I II 

Db 352 --ctvsgkpvptvnwmkngdvvipsdyfqivggsnlrilgvvksdegfyqcvaeneagna 409 

Qy 524 SWAATLTVEKPGSTSLHRAADPSTYPAPPGTPKVLNVSRTSISLRWAKSQEKPGAVGPII 583 

:| I I II I I h I : II : I I I I I I 

Db 410 qtsaqllvpkpaips ssvlpsaprdwpvlvssrfvrlswrppae---akgniq 460 

Qy *584 GYTVEYFSPD LQTGWIVAAHRVGDTQVTISGLTPGTSYVFLVRAENTQGISVP 636 

:ll :M : - II : I hh II I I I I I I I 

Db 461 tftv-ffsregdnreralnt tqpgslqltvgnlkpeamytfrwaynewg---p 510 

Qy ■ 637 SGLSNVIKTIEADFDAASANDLSAARTLLTGKSVELIDASAINASAVRLEWMLHVSADEK 696 

III :: I II : I :::::: I h 

Db 511 gessqpik vatqpelqvpgpvenlqavstsptsilitweppayangp 557 

Qy 697 YVEGLRIHYKDASVPSAQYHSITVMDASAESFWGNLKKYTKYEFFLTPFFETIEGQPSN 756 

hi h : I I :: I: : llhhl : I :: 

Db 558 -vqgyrlfctevstgkeqn ievdglsyk leg lkk f teys lr f lay nrygpgvstd 611 

• 757 SKTALTYEDVPSAPPDNIQIGMYNQTAGWVRWTPPPSQHHNGNLYGYKI — EVSAGNTM 813 
I :l lllllll h : : I : II llll II : Nil : : I 

■ 612 ditwtlsdvpsappqnvslewnsrsikvswlpppsgtqngfitgykirhrkttrrgem 671 

Qy 814 KVLANMTLNATTTSVLLNNLTTGAVYSVRLNSFTKAGDGPYSKPISLFMDPTHHVHPPRA 873 

: II 111:11 :::: I I II 

Db 672 e tlepnnlwylftglekgsqysfqvsamtvngtgp- 706 

Qy 874 HPSGTHDGRHEGQDLTYHNNGNIP--PGD- -INP1THKKTTDYLSGPWLMVLVCIVLLVL 929 

II : II : :| I : I h ||:: 

Db 707 -psnwy taetpendl - - -desqvpdqpsslhvrpqtn ciim— 743 

Qy 930 VISAAISMVYFKRKHQMTRELGHLSWSDNEITALNINS--KESLWIDHHRGWRTADTDK 987 

I I : ::| I : I |:: :| : : : : : 

Db 744 swtppl-npniwrgyilgygvgspyaetvrvdskqryyslerle 787 

Qy 988 DSGLSESKLLSHVNSSQSNYNNSDGGTD-YAEVDTRNLTTFYNCRKSPDNPTPYATTMII 1046 

I II I :lh I I II:: I :| I 

Db 788 ss shyvislkafnnagegvplyesattrsit dptdpvdy 826 

Qy 1047 GTSSSETCTKTTSISADKDSGTHSPYSDAFAGQVPAV PV-VKSNYLQYPVEPIN 1099 

I I I II : II h: I : :: 

Db 827 ypllddfptsvpdlstpmlppvgvqavalthdavrvs 863 



Qy 1100 WSEFLPPPPEHPPPSSTYGYAQGSPESSRKSSKSAGSGISTNQSILNASIHSSSSGGFSA 1159 

h: I :■ | ; |: || ::| ::| 

Db 864 wadnsvpknqktsevrlytvrwrtsfsasakyks edttslsyta 907 

Qy 1160 WGVSP — QYAVACPPENVYSNPLSAVAGGTQNRYQITPTN 1197 

I: I :::l :| h I I I h lh 
Db 908 tglkpntmyef svmv - tknrrsstwsmtahat - - tyeaapts 946 



RESULT 15
W74152
ID   W74152 standard; Protein; 1257 AA.
XX
AC   W74152;
XX
DT   05-MAY-1999 (first entry)
XX
DE   Human L1 cell adhesion molecule.
XX
KW   Human L1 cell adhesion molecule; L1CAM; neurite growth;
KW   nervous system development; nerve regeneration;
KW   neuronal cell cohesive interaction.
XX
OS   Homo sapiens.
XX
PN   US5872225-A.
XX
PD   16-FEB-1999.
XX
PF   18-NOV-1994; 94US-0341843.
XX
PR   26-JUN-1992; 92US-0904991.
PR   18-NOV-1994; 94US-0341843.
XX
PA   (UYCA-) UNIV CASE WESTERN RESERVE.
XX
PI   Lemmon V;
XX
DR   WPI; 1999-166719/14.
DR   N-PSDB; X01598.
XX
PT   Human L1 cell adhesion molecule - supports neurite outgrowth and is
PT   involved in nervous system development and repair
XX
PS   Claim 1; Fig 3; 45pp; English.
XX
CC   This sequence is the human L1 cell adhesion molecule (L1CAM) of the
CC   invention. L1CAM supports growth of neurites in vitro and is involved in
CC   development of the human nervous system and in nerve regeneration. It is
CC   useful in in vivo and in vitro experiments on nerve growth and
CC   regeneration. L1CAM mediates cohesive interactions of neuronal cells to
CC   each other and to extracellular matrix.
XX
SQ   Sequence 1257 AA;

Query Match 8.5%; Score 633.5; DB 20; Length 1257;
Best Local Similarity 23.7%; Pred. No. 3e-28;
Matches 243; Conservative 142; Mismatches 395; Indels 245; Gaps 36;

Qy 35 WLLLVLVASNGLPAVRGQYQSPRIIE HPTDLWKKNEPATLNCKVEGKPEPTI 87 

III I : 1 : :|: ::| I III : :| h llll 

Db 9 wpll-lcspclliqipeeyeghhvmeppviteqsprrlvvfptddislkceasgkpevqf 67 

Qy 88 EWFKDGEPVSTNEKKSHRVQFKDGALFFYRTMQGKK — EQDGGEYWCVARNRVGQAVS 143 

Ml I: I : II: I :: I I I I h = l hi 
Db 68 rwtrdgvhfkpkeelgvtvyqsphsgsf-titgnnsnfaqrfqgiyrcfasnklgtams 125 

Qy 144 RHASLQIAVLRDDFRVEPKDT RVAKGETALLECGPPKGIPEPTLIWIKDGVPLDDL 199 

:| :: .: Ihl I :|h :l I II : h 

Db 126 h — eirlmaegapkwpketvkpveveegesvvlpcnpppsaeplriywmns 174 



Qy 200 KAMSFGASSRVRIVDGGNLLISNV- 
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l • II • Ml :|l III 
Db 175 kilhikqdervtmgqngnlyfanvltsdnhsdyichahfpgtrtiiqkepidlrvkatns 234 

Qy 227 226 

Db 235 raidrkprllfptnssshlvalqgqplvleciaegfptptikwlrpsgpmpadrvtyqnhn 294 

Qy 227 DEGNYKCIAQNLVGTRESSYAKLIVQVKPYFMKEPKDQVMLYGQTAIFH 275 

1:1 1:1:1:1 :|: :| : |: II:: :|: : |:|| 
Db 295 ktlqllkvgeeddgeyrclaenslgsarhayyvtveaapywlhkpqshlygpgetarld 353 

Qy 276 CSVGGDPPPKVLWKKEEGNIPVSRARILHDEK SLEISNITPTDEGTYVCEAHNN 329 

I I I I 1:1 I: Ml : hi :| :||: |:| I 
Db 354 cqvqgrpqpevtwr--ingipve--elakdqkyriqrgalilsnvqpsdtmvtqcearnr 409 

Qy 330 VGQISARASL-IVHAPPNFTRRPSNKKVGLNG-WQLPCMASGNPPPSVFWTKEGVSILM 387 
: I I : :M : : : I I I | I I III I I :|:: 

• 410 hglllanayiyvvqlpakiltadnqtpavqgstayllckafgapvpsvqwldedgttvl 469 
388 FPNSSHGRQYVAADGTLQITDVRQEDEGYYVCSAFSWDSSTVRVFLQVSSVDERPPPII 447 
I : MM I I:: II I I I : :: |: hi : 
Db 470 — qderffpyangtlgirdlqandtgryfclaandqnnvtiraanlkvkdatq i 520 

Qy 448 QIGPANQTLPKGSVATLPCRATGNPS - -PRIKWFHDGHAVQA- - "GNRYSIIQGSSLRVD 502 

II : III I hi: :ll III II :| ::| |: | : 
Db 521 tqgprstiekkgsrvtftcqasfdpslqpsitwrgdgrdlqelgdsdky-fiedgrlvih 579 

Qy 503 DLQLSDSGTYTCTASGERGET-SWAATLTVEKPGSTSLHRAADPSTYPAPPGTPKVLN-V 560 

Ml I hi II I I I | | II II I: : 

Db 580 sldysdqgnyscvasteldwesraqllwgspg pvprlvlsdlhll 626 

Qy 561 SRTSISLRWAKSQEKPGAVGPIIGYTVEYFSPDL-QTGWIVAAHRVGDTQVTISGLTPGT 619 

::: : : h ::: II I :|: :: I hi |:| 

Db 627 tqsqvrvswspaedhn—apiekydiefedkemapekwyslgkvpgnqtsttlklspyv 683 

Qy 620 SYVFLVRAENTQGISVPSGLSNVIRTIEADFDAASANDL SAARTLLTGKSVEL 672 

I I I I I I II :| : I II I I : ::| | : 

Db 684 hytfrvtainkygpgepspvsetwtpea— apeknpvdvkgegnettnmvitwkplrw 740 

Qy 673 IDASAINASAVRLEWMLHVSADEKYVEGLRIHYKDASVPSAQYHSITVMDASAESFWGN 732 

=1 =1 l::l :| I ::: I I II I 

Db 741 mdwnapqvq-yrvgwr pqgtrgpwqeqlv v — sdpflwsn 777 

& 

Qy 733 LKKYT KYEFFLT PFFET IEGQPSNSKT ALT Y EDVPSAPPDNIQ IGMYNQT AGWVRW 788 

: II : : : I : :| II I I h I : I :| hi 
Jb 778 tstfvpyeikv— qavnsqgkgpepqvtigysgedypqaipelegieilnssavlvkw 833 

^/ 789 TPPPSQHHNGNLYGYRIEVSAGNTMKVLA NMTLNATTTSVLLNNLTTGAVYSV 841 

I 1:1 II : : : : :: : I 1 1 1 1 : 1 : I : I : 
Db 834 rpvdlaqvkghlrgynvtyvregsqrkhskrhihkdhvwpanttsvllsglrpyssyhl 893 

Qy 842 RLNSFTKAGDGPYSKPISLFMDPTH-HVHPPRAH PSGTHDGRHEGQ 886 

: =1 I II I: I I III I :|:| I 

Db 894 evqafngrgsgpase—ftfstpegvpghpealhlecqsntslllrwqpplshngvltgy 951 

Qy 887 DLTYH 891 
1:11 

Db 952 vlsyh 956 

Search completed: January 22, 2001, 12:17:03 
Job time: 1580 sec 
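
The percentages printed in each result header above can be reproduced from the other numbers in the same header. The sketch below is an inference from the figures in this report, not taken from the search program's documentation: it assumes "Query Match" is the hit score divided by the query's perfect score (7427 for query US-09-540-245A-15), and that "Best Local Similarity" is the number of matches divided by the total number of alignment columns (matches, conservative substitutions, mismatches and indels). The function names are hypothetical.

```python
def query_match_percent(score, perfect_score):
    """Hit score as a percentage of the query's perfect (self-match) score."""
    return 100.0 * score / perfect_score


def best_local_similarity_percent(matches, conservative, mismatches, indels):
    """Identical positions as a percentage of all alignment columns (assumed definition)."""
    columns = matches + conservative + mismatches + indels
    return 100.0 * matches / columns


PERFECT_SCORE = 7427  # perfect score reported for query US-09-540-245A-15

# (score, matches, conservative, mismatches, indels) copied from the result headers
hits = {
    "R13144": (639.0, 361, 170, 565, 464),
    "R68553": (637.5, 271, 150, 444, 317),
    "W74152": (633.5, 243, 142, 395, 245),
}

for hit_id, (score, m, c, mm, ind) in hits.items():
    qm = query_match_percent(score, PERFECT_SCORE)
    bls = best_local_similarity_percent(m, c, mm, ind)
    print(f"{hit_id}: Query Match {qm:.1f}%, Best Local Similarity {bls:.1f}%")
```

Under these assumptions the three hits come out at 8.6%/23.1%, 8.6%/22.9% and 8.5%/23.7%, which agrees with the values printed in their headers.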
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OM protein • protein search, using sw model 
Run on:          January 22, 2001, 12:23:24 ; Search time 325.28 Seconds
                                              (without alignments)
                                              291.200 Million cell updates/sec

Title:           US-09-540-245A-15
Perfect score:   7427
                 1 MHPMHPENHAIARSTSTTNN..........SCLYAEAGEPAPRQMTAKNT 1395

Scoring table:   BLOSUM62
                 Gapop 10.0 , Gapext 0.5

Searched:        195891 seqs, 67900655 residues

Total number of hits satisfying chosen parameters:

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries

Database :       PIRJ6:*
                 pir1:*
                 pir2:*
                 pir3:*
                 pir4:*
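
The run parameters above name a Smith-Waterman ("sw") local alignment with a gap-opening penalty of 10 and a gap-extension penalty of 0.5, which is why hit scores can end in .5. The following is a minimal sketch of how such a score can be computed, not the search program's implementation. It makes two assumptions: a toy substitution function (+5 for identity, -4 otherwise) stands in for the BLOSUM62 table, and a gap of length k is charged gap_open + (k - 1) * gap_ext; the program's exact gap convention is not stated in this report.

```python
def smith_waterman_affine(a, b, score, gap_open=10.0, gap_ext=0.5):
    """Return the best local alignment score of a vs b with affine gap penalties."""
    neg = float("-inf")
    n, m = len(a), len(b)
    # H: best score of an alignment ending at (i, j)
    # E: best score ending with a gap in a (b residue aligned to nothing)
    # F: best score ending with a gap in b (a residue aligned to nothing)
    H = [[0.0] * (m + 1) for _ in range(n + 1)]
    E = [[neg] * (m + 1) for _ in range(n + 1)]
    F = [[neg] * (m + 1) for _ in range(n + 1)]
    best = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            E[i][j] = max(E[i][j - 1] - gap_ext, H[i][j - 1] - gap_open)
            F[i][j] = max(F[i - 1][j] - gap_ext, H[i - 1][j] - gap_open)
            diag = H[i - 1][j - 1] + score(a[i - 1], b[j - 1])
            # Local alignment: never let a cell drop below zero
            H[i][j] = max(0.0, diag, E[i][j], F[i][j])
            best = max(best, H[i][j])
    return best


def toy_score(x, y):
    """Hypothetical stand-in for a BLOSUM62 lookup."""
    return 5.0 if x == y else -4.0


# Twelve identities (60) minus one gap of length two (10 + 0.5)
print(smith_waterman_affine("AAAAAACCCCCC", "AAAAAAGGCCCCCC", toy_score))
```

Replacing toy_score with a real BLOSUM62 lookup would be needed before comparing any result against the scores in this report.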



Pred. No. is the number of results predicted by chance to have a
score greater than or equal to the score of the result being printed,
and is derived by analysis of the total score distribution.

                      Query
 No.     Score        Match  Length  DB  ID        Description
  1     1609.5        21.7    1273    2  T42405    sax-3 protein - Ca
  2     1607.5        21.6    1612    2  T30805    dutt1 protein - mo
  3     1585          21.3    1651    2  T14160    transmembrane rece
  4     1395.5        18.8    1344    2  T14316    rig-1 protein - mo
  5      826.5        11.1     423    2  T29549    hypothetical prote
  6      790          10.6     874    2  T29548    hypothetical prote
  7      686           9.2    1277    2  T30532    neural cell adhesi
  8      677           9.1    1443    2  I50600    neogenin - chicken
  9      661           8.9    1896    2  T08851    Down syndrome cell
 10      658           8.9    1040    2  A34695    axonal glycoprotei
 11      647           8.7    1028    2  I58164    BIG-1 protein - ra
 12      645           8.7    1028    2  A53449    plasmacytoma-assoc
 13      644.5         8.7    1040    2  A49356    transient axonal g
 14      644           8.7    1272    2  S26180    neurofascin - chic
 15      641.5         8.6    1260    1  S05479    neural cell adhesi
 16      637.5         8.6    1447    2  A54100    tumor suppressor p
 17      636           8.6    2222    2  T13924    sdk protein - frui
 18      633.5         8.5    1257    1  A41060    neural cell adhesi
 19      632           8.5    1259    2  A43425    Bravo/Nr-CAM cell
 20      627           8.4    1259    2  S36126    neural cell adhesi
 21      626.5         8.4    1036    2  S22383    axonin 1 precursor
 22      624           8.4    1268    1  A39640    neural cell adhesi
 23      622.5         8.4    1427    2  I51669    tumor suppressor -
 24      619.5         8.3    1018    2  JC4211    neural adhesion pr
 25      614.5         8.3    1018    2  A54744    contactin 1 precur
 26      610.5         8.2    1375    2  T13822    frazzled gene prot
 27      604           8.1    1239    1  A32579    neuroglian - fruit
 28      602.5         8.1    1020    2  S05944    neuronal cell surf
 29      600.5         8.1    1021    2  A57112    contactin precurso
 30      595.5         8.0    1010    ?  JUUlm     F11 protein precur
 31      595.5         8.0       ?    ?  cm qqp    contactin precurso
 32      ?             ?         ?    ?  ?         neural cell adhesi
 33      583           7.8    1907    ?  SSUorJ    protein-tyrosine-p
 34      568           7.6    1197    ?  TJU581    neural cell adhesi
 35      ?             7.6       ?    ?  S46216    leukocyte antigen-
 36      561           7.6    4391    ?  ?         perlecan precursor
 37      559           7.5    1209    ?  T42718    probable neural ce
 38      555.5         7.5    1526    ?  T13823    frazzled gene prot
 39      547           7.4    1863    ?  S46217    protein-tyrosine-p
 40      546           7.4    1897    ?  TDHCJLK   leukocyte antigen-
 41      543           7.3    1912    ?  A56178    protein-tyrosine-p
 42      534           7.2    7962    ?  I38346    elastic titin - hu
 43      531.5         7.2    1501    ?  I58148    protein-tyrosine-p
 44      530.5         7.1    1256    ?  T03096    CDO protein - rat
 45      530.5         7.1    2029    ?  TDFFLK    protein-tyrosine-p


RESULT 1
T42405
sax-3 protein - Caenorhabditis elegans
C;Species: Caenorhabditis elegans
C;Date: 03-Dec-1999 #sequence_revision 03-Dec-1999 #text_change 2Wul-2000
C;Accession: T42405
R;Zallen, J.A.; Yi, B.A.; Bargmann, C.I.
Cell 92, 217-227, 1998
A;Title: The conserved immunoglobulin superfamily member SAX-3/Robo directs multiple
A;Reference number: Z22160; MUID:98117250
A;Accession: T42405
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: mRNA
A;Residues: 1-1273 <ZAL>
A;Cross-references: EMBL:AF041053; NID:g2804779; PIDN:AAC38848.1; PID:g28C478C
C;Genetics:
A;Note: sax-3
C;Function:
A;Description: sax-3 function is required at the time of axon guidance

Query Match 21.7%; Score 1609.5; DB 2; Length 1273;
Best Local Similarity 31.6%; Pred. No. 7.2e-74;
Matches 422; Conservative 195; Mismatches 526; Indels 191; Gaps

Qy 55 SPRI IEHPTDLWKKNEPATLNCKVEGKPEPT - IEWFKDGEPVSTNEKK ■ -SHRVQFKDG 111 

: Mill |:|| : HIM II I hllhll II::: III: I 
Db 30 APVIIEHPIDVWSRGSPATLNC--GAKPSTAKITWYKDGQPVITNKEQVNSHRIVLDTG 87 

Qy 112 ALFFYRTMQGX--KEQDGGEYWCVAKNRVGQAVSRHASLQIAVLRDDFRVEPKDTRVAKG 169 

:|| : 111:11 hill I h I 1 1 : : 1 : 1 1 : 1 1 1 1 I: : I 
Db 88 SLFLLKVNSGKNGKDSDAGAYYCVASNEHGEVKSNEGSLKLAMLREDFRVRPRTVQALGG 147 

Qy 170 ETALLECGPPKGIPEPTLIWIKDGVPLDDLKAMSFGASSRVRIVDGGNLLISNVEPIDEG 229 

I 1:111 11:1 III : I II I : I : III:! I: I I 
Db 148 EMAVLECSPPRGFPEPWSWRKDD KELRIQDMPRYTLHSDGNLIIDPVDRSDSG 201 

Qy 230 NYKCIAQNLVGTRESSYAKLIVQVKPYFMKEPKDQVMLYGQTATFHCSVGGDPPPKVLWK 289 

1:1: |:|| I |: |:| I It I :|||| : I I I I III |:: 
Db 202 TYQCVANNMVGERVSNPARLSVFEKPKFEQEPKDMTVDVGAAVLFDCRVTGDPQPQITWK 261 

Qy 290 KEEGNIPVSRARILHDERSLEISNITPTDEGTYVCEAHNNVGQISARASLIVHAPPNFTK 349 

:: :|l:ll I I : I I : hill III I I I : I I I I llhl 
Db 262 RKNEPMPVTRAYIAKDNRGLRIERVQPSDEGEYVCYARNPAGTLEASAHLRVQAPPSFQT 321 

Qy 350 RPSNKKVGLNGWQLPCMASGNPPPSVFWTKEGVSTLMFPN--SSHGRQYVAADGTLQIT 407 

:|::: II I I I h Ihlll hlh h II h III I 
Db 322 KPADQSVPAGGTATFECTLVGQPSPAYFWSKEGQQDLLFPSYVSADGRTKVSPTGTLTIE 381 

Qy 408 DVRQEDEGYYVCSAFSWDSSTVRVFLQVS SVDERPPPIIQIGPANQTLPKGSV 461 

:IM III III: : il : hi: : :||| h I llll II 
Db 382 EVRQVDEGAYVCAGMNSAGSSLSKAALKVTTKAVTGNTPAKPPPTIEHGHQNQTLMVGSS 441 
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Qy 462 ATLPCRATGNPSPRIKWFHDGHAVQ-AGNEYSIIQGSSLRVDDLQLSDSGTYTCTASGER 520 

t I 1 1 1 : 1 : 1 hi III!: :| I II : l|: hi III I I 
Db 442 AILPCQASGRPTPGISWLRDGLPIDITDSRISQHSTGSLHIADLRRPDTGVYTCIARNED 501

Qy 521 GETSWMTLTVEKPGSTS-LHRAADPSTYPAPPGTPKVLNVSRTSISLRWAKSQEKPGAV 579 

|::|:|:MI I : I III :|: I I ::!h ! : I I : 
Db 502 GESTWSASLTVEDHTSNAQFVRMPDPSNFPSSPTQPI IVNVTDTEVELHW- -NAPSTSGA 559 

Qy 580 GPIIGYTVEYFSPDLQTGWIVAAHRVGDTQVTISGLTPGTSYVFLVRAENTQGISVPSGL 639 

Ml II -Mill I I I: I II I ll:!::IHI :ll II 
Db 560 GPITGYIIQYYSPDLGQTWFNIPDYVASTEYRIKGLKPSHSYMFVIRAENEKGIGTPSVS 619 

Qy 640 SNVIKTI EADFDAAS AN — DLSAARTLLTGKS -VELIDASAINASAVRLEWMLHVSAD 694 

I :: I : I :: |:: I II : ::| : l|::|||| I 
Db 620 SALVTTSKPAAQVALSDKNKMDMAIAEKRLTSEQLIKLEEVKTINSTAVRLFWKKRKL-- 677 



Qy 695 EKYVEGLRIHYK-DASVPSAQYHSITVMDASAESFWGNLKKYTKYEFFLTPF--FETI 750

I: "I I II : I I l::|l II :| III: |: :| 

Db 678 EELIDGYYIKWRGPPRTNDNQY--VNVTSPSTENYWSNLMPFTNYEFFVIPYHSGVHSI 735

Qy 751 EGQPSNSKTALTYEDVPSAPPDNIQIGMYNQTAGWVRWTPPPSQHHNGNLYGYKIEVSAG 810

I llll II I II I:::: III : I I : III |::| I I 

Db 736 HGAPSNSMDVLTAEAPPSLPPEDVRIRMLNLTTLRISWKAPKADGINGILKGFQI-VIVG 794

Qy 811 NTHKVLANMTLHATTTSVLLNHLTTGAVYSVRLNSFTKAGDGPYSKPISLFMD-PTHHVH 869

Ml II I :l II I :h : : I I : |: | | 

Db 795 QAPNNNRNITTNERAASVTLFHLVTGMTYKIRVAARSNGGVGVSHGTSEVIMNQDTLEKH 854

Qy 870 PPRAHPSGTHDGRHEGQDLTYH--NNGNIPPGDINPTTHRKTTDYLSGPWLMVLVCIVLL 927

: I : I I ::| , |:| : :| 

Db 855 LA AQQENESFLYGLINRSHVP VIVIVAIL 883

Qy 928 VLVISAAISMVYFKRKHQMTKELGHLSWSDNEIIALNINSKESLW IDHHRGWRT 982

:: : I: I:: : ::| : : | || : : : | 

Db 884 IIFVVIHAYCYWRNSRNSDGKDRSFIKINDGSVHMASNN -- LWDVAQNPNQNPMYNT 939



Qy 983 ADTDKDSGLSESKLLSHVNSSQSNYNNSD-GGT DYAEV— DTRNLTTF 1027 

I : : I I ::| :|| | I |:: :;| 

Db 940 AGRMTMNNRNGQALYSLTPNAQDFFNNCDDYSGTMHRPGSEHHYHYAQLTGGPGNAMSTF 999 



Qy 1028 YNPRKSPDNPTPYATTMIIGTSSSEICTKTTSISADKDSGTHSPYSDAFAGQVPAVPWK 1087

I ; : MM! :: : I l| : 

Db 1000 YG-NQYHDDPSPYATTTLV LSNQQPA--WLN 1027

Qy 1088 SNYLQYPVEPINWSEFLPPPPEHPPPSSTYGYAQGSPESSRKSSKSAGSGISTNQSILNA 1147

1:11 I III II: : I I! I I I I || 
Db 1028 DKMLRAPAMPTN PVPPE - - PPARYADHTAG - - RRSRSSRASDGRG TLNG 1072

Qy 1148 SIHSSSSGGFSAWGVSPQYAVACPPENVYSNPLSAVAGGTQNRYQITPTNQHPPQL 1203

:| :ll : III : II : : I : II 

Db 1073 GLHHRTSGSQRS DSPPHTDVSYVQLHSSDGTGSSKERTGERRTPPNRTLMD 1123

Qy 1204 ---PAYFATTGPGGAVPPNHLPFATQRHAASEYQAGLNAARCAQSRACNSCDALATPSPM 1260

I III I : 11:1 II : : 
Db 1124 FIPPPPSNPPPPGGHVYDDIFQTATRRQ LNRGSTPREDTYDS 1165



I :| : I I: UN: I | : | ::: | 

■ -VSDGAFARVDVNA- -RPTSRNRNL- - -GGRPLKGKRDDDSQRSSLMMDDDGG 1212 



Qy 1321 SAKQRGGHHRRRAP 1334 

I:: I : I 
Db 1213 SSEADGENSEGDVP 1226 



RESULT 2



T30805 

dutt1 protein - mouse

N;Alternate names: transmembrane receptor protein Robo1 homolog
C;Species: Mus musculus (house mouse) 

C;Date: 22-Oct-1999 #sequence_revision 22-Oct-1999 #text_change 22-Oct-1999
C;Accession: T30805 



R;Wu, M.C.; Lowe, N.; Fordham, R.; Rabbitts, P.
submitted to the EMBL Data Library, July 1998

A;Description: The mouse homologue of human DUTT1/H-robo1 gene: protein sequence and
A;Reference number: Z20879
A;Accession: T30805

A; Status: preliminary; translated from GB/EMBL/DDBJ 
A; Molecule type: mRNA 
A;Residues: 1-1612 <WUM>

A;Cross-references: EMBL:Y17793; NID:e1329712; PID:e1329713; PIDN:CAA76850.1
A; Experimental source: brain 
C; Genetics: 
A;Gene: dutt1
A; Map position: 16 



Query Match 21.6%; Score 1607.5; DB 2; Length 1612; 

Best Local Similarity 30.9%; Pred. No. 1.2e-73; 

Matches 426; Conservative 189; Mismatches 485; Indels 277; Gaps

Qy 56 PRIIEHPTDLWKKNEPATLNCKVEGKPEPTIEWFKDGEPVST--NEKKSHRVQFKDGAL 113

ll:IM:!l: I 1 1 1 1 1 1 1 1 iMllllM Mil :: :|||: |:| 

Db 29 PRIVEHPSDLIVSRGEPATLNCRAEGRPTPTIEWYRGGERVETDRDDPRSHRMLLPSGSL 88

Qy 114 FFYRTMQGKKEQ-DGGEYWCVAKNRVGQAVSRHASLQIAVLRDDFRVEPRDTRVAKGETA 172 

II I : hi': I I I llhl :hlll :|lh:|:|lllll II II III 
Db 89 FFLRIVHGRKSRPDEGVYICVARNYLGEAVSHNASLEVAILRDDFRQNPSDVMVAVGEPA 148 

Qy 173 LLECGPPKGIPEPTLIWIKDGVPLDDLKAMSFGASSRVRIVDGGNLLISNVEPIDEGNYK 232 

"II II: llh I III MM |: I I |:|: I I I 

Db 149 VMECQPPRGHPEPTISWKRDGSPLDD KDERIT I - RGGKLMITYTRKSDAGKYV 200 

Qy 233 CIAQNLVGIRESSYAKLIVQVKPYFMKEPRDQVMLYGQTATFHCSVGGDPPPKVLWKKEE 292 

h hll III hi I :| |:| I ; : :| I I Mill hh: 
Db 201 CVGTNMVGERESEVAELTVLERPSFVRRPSNLAVTVDDSAEFRCEARGDPVPTVRWRRDD 260 

Qy 293 GNIPVSRARILHDEKSLEISNITPTDEGTYVCEAHNNVGQISARASLIVHAPPNFTRRPS 352 

I :l II I I: :hl =1 I hi I II lh I hi I Ihl :| 
Db 261 GELPKSRYEI-RDDHTLRIRRVTAGDMGSYTCVAENMVGRAEASATLTVQEPPHFWRPR 319 

Qy 353 NKKVGLNGWQLPCMASGNPPPSVFWTREGVSTLMF— PNSSHGRQYVAADGTLQITDV 409 

:: I I I I hill ::M :|| hi I I lh I I Ihl 
Db 320 DQWALGRTVTFQCEATGNPQPAIFWRREGSQNLLFSYQPPQSSSRFSVSQTGDLTITNV 379 

Qy 410 RQEDEGYYVCSAFSWDSSTVRVFLQVSSV-DERPPPIIQIGPANQTLPRGSVAILPCRA 468 

:: I M:l :| I : :|:|: I :||||:|: I |||: | | 
Db 380 QRSDVGYYICQTLPAGSIITRAYLEVTDVIADRPPPVIRQGPVNQTVAVDGTLILSCVA 439 

Qy 469 TGNPSPRIKWFHDGHAVQA-GNRYSIIQGSSLRVDDLQLSDSGTYTCTASGERGETSWAA 527 

hi: I I II I : :: |:: ;| hi |||||| I :h! 
Db 440 TGSPAPTILWRRDGVLVSTQDSRIKQLESGVLQIRYAKLGDTGRYTCTASTPSGEATWSA 499 

Qy 528 TLTVERPG-STSLHRAADPSTYPAPPGTPRVLNVSRTSISLRWARSQERPGAVGPIIGYT 586 

: h: I... I lh h I hi :|h :::| II: I 
Db 500 YIEVQEFGVP,VQPPRPTDPNLIPSAPSRPEVTDVSRNTVTLSW- - -QPNLNSGATPTSYI 556 

Qy 587 VEYFSPDLQTGWIVAAHRVGDTQVTISGLTPGTSYVFLVRAENTQGISVPSGLSNVIKTI 646 

:| II : I II I llll Mllll I III II :l: :|| 
Db 557 IEAFSHASGSSWQTAAENVRTETFAIKGLRPNAIYLFLVRAANAYGISDPSQISDPVRTQ 616 

Qy 647 EADFDAASANDLSAARTLLTGRSV-ELIDASAINASAVRLEWMLHVSADEKYVEGLRIHY 705 

: : : v Mill:: :;:|:| : | | ;|;:| :| | 
Db 617 DVPPTSQGVDHRQVQREL- -GNWLHLHNPTILSSSSVEVHWT- -VDQQSQYIQGYKILY 672 

Qy 706 R--DASVPSAQYHSITVMDASAESFWGNLRRYTRYEFFLTPFFETIEGQPSNSKTALTY 763 

: II ::: I : I h Ml II III :| I I II 
Db 673 RPSGASHGESEWLVFEVSTPTKNSWIPDLRKGVNYEIRARPFFNEFQGADSEIRFAKTL 732 

Qy 764 EDVPSAPPDNIQIGMY--NQTAGWVRWTPPPSQHHNGNLYGYKIEVSAGNTMRVLANMTL 821 

h Hill :: : Ml I I III II : lh III I h 
Db 733 EEAPSAPPRSHVSKNDGNGTAILVTWQPPPEDTQNGMVQEYKV-WCLGNETKYHINKTV 791 

Qy 822 NATTTSVLLNNLTTGAVYSVRLNSFTRAGDGPYSKPISLFMDPTHHVHPPRAHPSGTHDG 881 

: :! lh: :| I II : M III hi : :| 
Db 792 DGSTFSWIPSLVPGIRYSVEVAASTGAGPGVKSEPQFIQLD--



■ 833 



Qy 882 RHEGQDLTYHNNGN-IPPGDINPTTHRRTTDYLSGP WLMVLVCIVLLVLV 930 

"II : I I : :| : I |::::| : I 

Db 834 SHGNPVSPED-QVSLAQQISDWRQPAFIAGIGMCWIILMVPSIWL— 879 

Qy 931 ISAAISMVYFKRKHQ MTRELGHLSWSDNEITALNINSR 969 

Ml: :| : I :| I III: 

Db 880 YRHRKKRNGLTSTYAGIRRVPSFTFTPTVTYQRGGEAVSSGGRPGLLNISEP 931 

Qy 970 ESL-WI DHHRGWRTADTDKDSGLSESKLLSHVNSSQ * - SNYNNS 1010 

: I: ::| :| hi I :: : :|||| 

Db 932 ATQPWLADTWPNTGNNHNDCSINCCTAGNGNSDSNLTTYSRPADCIANYNNQLDNKQTNL 991 

Qy 1011 - -DGGTDYAEVDTRNLTTFYNCRKSPD NPTPYATTMII 1046 

I I :M I II: Nlllt :| 

Db 992 MLPESIVYGDVDLSNKINEMKTFNSPNLKDGRFVNPSGQPTPYMTQLIQANLSNNMNNG 1051 

Qy 1047 -GTSS SETCTKTTSISADKDSGTHSPY 1072
Ml 

Db 1052 AGDSSEKHWKPPGQQKPEVAPIQYNIMEQNKiNRDYRANDTIPPTIPYNQSYDflNTGGSY 1111 

Qy 1073 SDAFAGQVPAVPWKSNYLQYPVEP INWSEFLPPPPEHPPPSSTYGYAQGSPESSR 1128 

- I : : I I :||:: MM Ml I 
Db 1112 NSSDRGSSTSGSQGHRRGARTPKAPKQGGMNWADLLPPPPAHPPPHS 1158 

Qy 1129 RSSRSAGSGISTNQSILNASIHSSSSGGFSAWGVSPQYAVACPPENVYSNPLSAVAGGTQ 1188 

I lh I : II :| I 
Db 1159 NSEEYNMSVDES YDQEMPCPVPPAPMYLQ Q 1188 

Qy 1189 NRYQITPTNQHPPQLPAYFATTGPGGAVPPNHLPFATQRHAASEYQAGLNAARCAQSRAC 1248 
. : I :: I I I : I II :| II 

Db 1189 DELQ - EEEDERG PTPPVRGAASSP • AAVSYSHQSTAI 1223 


Qy 1249 NSCDALATPSP---MQP PPPVPVPEGWYQPVHPNSHPMHPTSSNH 1290 

llll Ml I I I: III I | | | | 

Db 1224 LTPSPQEELQPMLQDCPEDLGHMPHP- ■ -PDRRRQPVSP-PPPPRPISPPH 1270 



RESULT 3 
T14160 

transmembrane receptor protein Robo1 - rat
C; Species: Rattus norvegicus (Norway rat) 

C;Date: 20-Sep-1999 #sequence_revision 20-Sep-1999 #text_change 20-Sep-1999
C;Accession: T14160 

R;Kidd, T.; Brose, K.; Mitchell, K.J.; Fetter, R.D.; Tessier-Lavigne, M.; Goodman, C.S.;
Cell 92, 205-215, 1998
A;Title: Roundabout controls axon crossing of the CNS midline and defines a novel subfan
A;Reference number: Z17897; MUID: 98117249
A;Accession: T14160

A; Status: preliminary; translated from GB/EMBL/DDBJ 
A; Molecule type: mRNA 
A; Residues: 1-1651 <KID> 

A;Cross-references: EMBL:AF041082; NID:g2811215; PID:g2811216; PIDN:AAC39960.1 

C; Function: 

A; Description: appears to function as the gatekeeper controlling midline crossing 
C; Keywords: transmembrane protein 



Query Match 21.3%; Score 1585; DB 2; Length 1651; 

Best Local Similarity 29.0%; Pred. No. Ue-72;

Matches 456; Conservative 207; Mismatches 538; Indels 374; Gaps 49; 

Qy 44 NGLPA VRGQYQSPRIIEHPTDLWRRNEPATLNCRVEGKPEPTI 87 

II II O : IIMMIMIM I llllllll Ml III 

Db 40 NGTPAPTSDNDDNSLGYTGSRLRQEDFPPRIVEHPSDLIVSKGEPATLNCKAEGRPTPII 99 

Qy 88 EWFKDGEPVST--NEKKSHRVQFKDGALFFYRTMQGKKEQ-DGGEYWCVAKNRVGQAVSR 144

MM || | | :: Mil: |:||| I : |:| : | I | MM MMI 
Db 100 EWYKGGERVETDKDDPRSHRMLLPSGSLFFLRIVHGRKSRPDEGVYICVARNYLGEAVSH 159 

Qy 145 HASLQIAVLRDDFRVEPKDTRVAKGETALLECGPPKGIPEPTLIWIKDGVPLDDLKAMSF 204 

:MI::|:MIIII II II II MM ||:| IMM I I llll 



Db 160 NASLEVAILRDDFRQNPSDVMVAVGEPAVMECQPPRGHPEPTISWKKDGSPLDD-- 



Qy 205 GASSRVRIVDGGNLLISNVEPIDEGNYKCIAQNLVGTRESSYARLIVQVRPYFMKEPKDQ 264

I: I M |:|: III: Ml III I : I M |:| | : 
Db 214 -KDERITI-RGGKLMITYTRKSDAGKYVCVGTNMVGERESKVADVTVLERPSFVKRPSNL 271

Qy 265 VMLYGQTATFHCSVGGDPPPKVLWKKEEGNIPVSRARILHDEKSLEISNIIPTDEGTYVC 324 

: M |,| HI | |:|::| :| II I |: :|:| :| I Ml I 
Db 272 AVTVDDSAEFKCEARGDPVPTFGWRKDDGELPKSRYEI-RDDHTLKIRKVTAGDMGSYTC 330 

Qy 325 EAHNNVGQISARASLIVHAPPNFTKRPSNKKVGLNGWQLPCMASGNPPPSVFWTKEGVS 384 

I I lh I hi I MM :| :: I I I I IMM MM M 
Db 331 VAENMVGRAEASATLTVQEPPHFWKPRDQWALGRTVTFQCEATGNPQPAIFWRREGSQ 390 

Qy 385 TLMF—PNSSHGRQYVAADGTLQITDVRQEDEGYYVCSAFSWDSSTVRVFLQVSSV-D 440 

hi I 1 I h I I MM:: | 1 1 1 : | :| | : :|:|: | 
Db 391 NLLFSYQPPQSSSRFSVSQTGDLTVTNVQRSDVGYYICQTLNVAGSIITKAYLEVTDVIA 450 

Qy 441 ERPPPIIQIGPANQTLPKGSVATLPCRATGNPSPRIKWFHDGHAVQA-GNRYSIIQGSSL 499 

MMMM hi III: II I MM MINI :| :: I 
Db 451 DRPPPVIRQGPVNQTVAVDGTLTLSCVATGSPVPTILWRKDGVLVSTQDSRIKQLESGVL 510 

Qy 500 RVDDLQLSDSGTYTCTASGERGETSWAATLTVEKPG-STSLHRAADPSTYPAPPGTPKVL 558 

M |:|' Mill I MM : h: I I ||: |: I Ml 
Db 511 QIRYARLGDTGRYTCTASTPSGEATWSAYIEVQEFGVPVQPPRPTDPNLIPSAPSKPEVT 570

Qy 559 NVSRTSISLRWAKSQEKPGAVGPIIGYTVEYFSPDLQTGWIVAAHRVGDTQVTISGLTPG 618

MM :::| I I I :| II : I I I I II I 

Db 571 DVSKNTVTLLW— QPNLNSGATPTSYIIEAFSHASGSSWQTVAENVKTETFAIKGLKPN 627 

Qy 619 TSYVFLVRAENTQGISVPSGLSNVIKTIEADFDAASANDLSAARTLLTGKSV-ELIDASA 677 

hlllll I III II :|: :M : : IMM:: 

Db 628 AIYLFLVRAANAYGISDPSQISDPVKIQDVPPTTQGVDHKQVQREL-'GNWLHLHNPTI 685 

Qy 678 INASAVRLEWMLHVSADEKYVEGLRIHYK-DASVPSAQYHSITVMDASAESFWGNLKR 735 

:::|:| : I" I :|::| :| h II ::: I M h :|:| 
Db 686 LSSSSVEVHWT--VDQQSQYIQGYKILYRPSGASHGESEWLVFEVRTPTKNSWIPDLRK 743 

Qy 736 YTKYEFFLTPFFETIEGQPSNSKTALTYEDVPSAPPDNIQIGMY--NQTAGWVRWTPPPS 793 

II III M I MM Mill :: : I III I III 
Db 744 GVNYEIKMPFFNEFQGADSEIRFARTLEERPSAPPRSVTVSKNDGNGTAILVTWQPPPE 803 

Qy 794 QHHNGNLYGYRIEVSAGNTMKVLANMTLNATTTSVLLNNLTTGAVYSVRLNSFTKAGDGP 853 

II : lh II : I h: M ||:: I I III :: I II I 
Db 804 DTQNGMVQEYRV-WCLGNETRYHINKTVDGSTFSWIPFLVPGIRYSVEVAASTGAGPGV 862 

Qy 854 YSKPISLFMDPTHHVHPPRAHPSGTHDGRHEGQDLTYHNNGN-IPPGDINPTTHRRTTDY 912 

hi .:. :h I : :: :| 

Db 863 KSEPQFIQLD- : SHGNPVSPED-QVSLAQQISDV 893 

Qy 913 LSGP — WLMVLVCIVLLVLVISAAISMVYFRRRHQ 945 

: I ; I::::| : I Ml : 

Db 894 VKQPAFIAGIGAACWIILMVFSIWL — YRHRKKRNGLSSTYAGIRKVPSFT 942 

Qy 946 --MTRELGHLSWSDNEITALNINSKESL-WIDHHRGW-RTADTDKD SG 990

:| : I. :l I llh M: II =M :| 
Db 943 FTPTVTYQRGGEAVSSGGRPGLLNISEPATQPWLAD--TWPNTGNSHNDCSINCCTASNG 1000 

Qy 991 LSESKLLSHVSSSQ- -SNYNNS DGGTDYAEVDTRNLTTFYNCRKSPD— 1035 

hi I :: ■ : MM I I Ml I M 

Db 1001 NSDSNLTTYSRPADCIANYNNQLDNRQTNLMLPESTVYGDVDLSNKINEMKTFNSPNLKD 1060 

Qy 1036 NPTPYATTMII GTSSSE 1052 

mini n i in 

Db 1061 GRFVNPSGQPTPYATTQLIQANLINNMNNGGGDSSERHWRPPGQQRQEVAPIQYNIMEQN 1120 

Qy 1053 TCTKTTSISADKDSGTHSPYSDAFAGQVPAVPWKSNYLQYPVEP — I 1098 

I I : II h : I : : I I : 

Db 1121 RLNRDYRANDIILPTIPYNHSYDQNTGGSYNSSDRGSSTSGSQGHKKGARTPRAPKQGGM 1180 

Qy 1099 NWSEFLPPPPEHPPPSST YGYAQGSP ESSRKSSK 1132 

lh: Mill llll I III I : : 

Db 1181 NWADLLPPPPAHPPPHSNSEEYSMSVDESYDQEMPCPVPPARMYLQQDELEEEEAERGPT 1240 
Qy 1133 SAGSGISTNQSILNASIHSSSSGGFSAWGVSPQYAVA CPPENVYSNPLSAVAGGT 1187

I :::::: I I I: : III : II : I : 
Db 1241 PPVRGAASSPAAVSYS ■ HQST — ATLTPSPQEELQPMLQDCPED LGHMPHPP 1289 

Qy 1188 QNRYQITPTNQHPPQLP AYFATTGP 1212 

I I I : II I I :M 
Db 1290 DRRRQ--PVSPPPPPEPISPPHTYGnSGPLVSDMDTDAPEEEBDEADMEVAKMQTRRLL 1347 

Qy 1213 GGAVPPNHL PFATQR 1227 

I I ::: II 
Db 1348 LRGLEQTPASSVGDLESSVTGSMINGWGSASEEDNISSGRSSVSSSDGSFFTDADFAQAV 1407 

Qy 1228 HAASEYQAGLNAARCAQSRACNSCDALAT PS PMQPPPPVPVPEGWYQPVHPNSHPMHPT S 1287 
I hl;l III II I hi : || I 

Db 1408 AAAAEY - AGLKVARRQMQDAAGRRHFHASQCP -RPTSPV STD 1447 

i 

Qy 1288 SNHQHQCSSECSDHSRSSQSHKRQLQLEEHGSSAKQRGGHHRRRA PWQPC 1339 

II | I: :| :: I I II III III 

Db 1448 SN- - - - -MSAAVIQKARPTKXQKHQ PGHLRREAYTDDLPPPPVPPPA 1489 



Qy 1340 MESE^ENMLAEYEQR 1354

::l =; Ml I 
Db 1490 IKSPSYQSKAQLEAR 1504



RESULT 4 
T14316 

rig-1 protein - mouse 

C;Species: Mus musculus (house mouse)

C;Date: 20-Sep-1999 #sequence_revision 20-Sep-1999 #text_change 20-Sep-1999
C;Accession: T14316

R;Yuan, S.S.F.; Cox, L.A.; Dasika, G.K.; Lee, E.T.H.P.
submitted to the EMBL Data Library, April 1998 
A; Reference number: Z17975 
A; Accession: T14316 

A; Status: preliminary; translated from GB/EMBL/DDBJ 
A;Molecule type: mRNA 
A; Residues: 1-1344 <YUA> 

A;Cross-references: EMBL:AF060570; NID:g4206385; PID:g4206386; PIDN:AAD11628.1



Query Match 18.8%; Score 1395.5; DB 2; Length 1344;

Best Local Similarity 29.1%; Pred. No. 5.3e-63;

Matches 415; Conservative 178; Mismatches 560; Indels 273; Gaps



Qy 38 LVLVASNGLP- *AVRGQYQ SPRIIEBPTDLWKKNEPATLNCKVEG 81 

11:1111:1: llhl I Mil : Mill h II 

Db 8 LTLQSKPGLPPVALPGYLELPSSPGSRVGPEDAMPRIVEQPPDLWSRGEPATLPCRAEG 67 

Qy 82 KPEPTIEWFKDGEPVST--NEKKSHRVQFKDGALFFYRTMQGKREQ-DGGEYWCVAKNRV 138
:| I MhMI |:| : ::||: Ml! I : |:: : I I I llhl : 
Db 68 RPRPNIEWYKNGARVATAREDPRAHRLLLPSGALFFPRIVHGRRSRPDEGVYTCVARNYL 127 

Qy 139 GQAVSRHASLQIAVLRDDFRVEPKDTRVAKGETALLECGPPKGIPEPTLIWIKDGVPLDD 198 

I I Ihllhhhllhl I : || || |::|| MM III : I I : I : 
Db 128 GAAASRNASLEVAVLRDDFRQSPGNVWAVGEPAVMECVPPKGHPEPLVTWKKGKIRLKE 187 

Qy 199 LKAMSFGASSRVRIVDGGNLLISNVEPIDEGNYKCIAQNLVGTRESSYAKLIVQVKPYFM 258 

I: I II h:h I I I hi M I Ml MMI :| I: 
Db 188 EEGRITI-RGGKLMMSHTFKSDAGMYMCVASNMAGERESGAAELWLERPSFL 239 

Qy 259 KEPKDQVMLYGQTATFHCSVGGDPPPKVLWRKEEGNIPVSRARILHDEKSLEISNITPTD 318 

: I MMI | | | III | : |:|::| :| I I I II I :: | 
Db 240 RRP INQWLADAPVNFLCEVQGDPQPNLHWRKDDGELPAGRYEIRSDH- SLWIDQVSSED 298 

Qy 319 EGTYVCEAHNNVGQISARASLIVHAPPHFTKRPSNKKVGLNGWQLPCMASGNPPPSVFW 378 

MM I I MM: I II II II I h h I I I 1 1 1 1 1 :: 1 1 
Db 299 EGTYTCVAENSVGRAEASGSLSVHVPPQFVTRPQNQTVAPGANVSFQCETKGNPPPAIFW 358 

Qy 379 TKEGVSTLMFPNSS— HGRQYVAADGTLQITDVRQEDEGYYVCSAFSWDSSTVRVFLQ 435 

III Mil: I II I: I I lhl: | ||||| I II I : |: 
Db 359 QKEGSQVLLFPSQSLQPMGRLLVSPRGQLNITEVKIGDGGYYVCQAVSVAGSILAKALLE 418 



Qy 436 V* * SSVDERPPPI IQ IGPANQT LPKGSVAT LPCRATGNPS PRI KWFHDGHAVQA - GNRYS 492 

: :hl li Ml IMMM II MM Ml I Ml I M :::: 

Db 419 IKGASIDGLPPIILQ-GPANQTLVLGSSVWLPCRVIGNPQPNIQWKKDERWLQGDDSQFN 477 

Qy 493 IIQGSSLRVDDLQLSDSGTYTCTASGERGETSWAATLTVEKPGSTSLHRAADPSTYPAPP 552 

:: :| : h I I Ml I II :| : I :: I I II I II 

Db 478 LMDNGTLHIASIQEMDMGFYSCVAKSSIGEATWNSWLRKQEDWGASPGPATGPSNPPGPP 537 

Qy 553 GTPKVLNVSRTSISLRWAKSQEKPGAVGPIIGYTVEYFSPDLQTGWIVAAHRVGDTQVTI 612 

I I M lh 11:11 I :| II III II 

Db 538 SQPIVTEVTAHSITLTW-KPNPQSGATA-TSYVIEAFSQAAGNTWRTVADGVQLETYTI 594 

Qy 613 SGLTPGTSYVFLVRAENTQGISVPSGLSNVIKTIEADFDAASANDLSAARTLLTGKSVEL 672

III I I Mill Ml II :| ::!::: I I : I :| : 

Db 595 SGLQPNTIYLFLVRAVGAWGLSEPSPVSEPVQTQDSSL-SRPAEDPWKGQRGLAEVAVRM 653 

Qy 673 IDASAINASAVRLEWMLHVSADEKYVEGLRIHYKDASVPSAQYHSITVMDASAESFVVGN 732

: : : '::: | | : |:| |: :: | : : : : :| h 

Db 654 QEPTVLGPRTLQVSWT ■ -VDGPVQLVQGFRVSWRIAGLDQGSWTMLDLQSPHKQSTVLRG 711 

Qy 733 LKKYTKYEFFLTPFFETIEGQPSNSKTALTYEDVPSAPPDNIQI - -GMYNQTAGWVRWTP 790 

I : : : I I I: II II : : I :: II I 

Db 712 LPPGAQIQIKVQVQGQEGLGAESPFVTRSIPEEAPSGPPQGVAVALGGDRNSSVTVSWEP 771 

Qy 791 PPSQHHNGNLYGYKIEVSAGNTMKVLANMTLNATTTSVLLNNLTTGAVYSVRLNSFTKAG 850 

I II : Ml II : I : II : I I :| : : I II 

Db 772 PLPSQRNGVITEYQI-WCLGNESRFHLNRSAAGWARSVTFSGLLPGQIYRALVAAATSAG 830 

Qy 851 DGPYSKPISLFMDPTHHVHPPRAHPSGTHDGRHEGQDLTYHNNGNIPPGDINPTTHKKTT 910 

MM:: II ::: :: 

Db 831 VGVASAPVLVQLP FPPAAEPG PEVSEGLAERLA 863 

Qy 911 DYLSGPWLMV--LVCIVLLVLVISAAISMVYFKRKHQMTKELGHLSWSDNEITALNIN 967

II: III:: |: : ||| | : | :: 

Db 864 KVLRKPAFLAGSSAACGALLLGFCAA LYRRQKQRKELSHYT ■ ASFAYTPAVSFP 916 

Qy 968 SKESL HI DHHRGW— RTADTDKDSGLSES 994 

II' M II III : I : 

Db 917 HSEGLSGSSSRPPMGLGPAAYPWLADSWPHPPRSPSAQEPRGSCCPSNPDPD-DRYYNEA 975 

Qy 995 KLLSHVNSSQSNYNNSDGGTDYAEVDT--RNLTTFYNCRKSPDNPTPYATTMIIGTSSSE 1052 

: :: : I I I M :| MM : I : II 

Db 976 GISLYLAQTARGANASGEGPVYSTIDPVGEELQTFHGGFPQHSSGDPSTWSQYAPPEWSE 1035 

Qy 1053 TCTKTTSISADKDSGTHSPYSDAFAGQVPAVPWKSNYLQYPVE-PINWSEFLPPPPEH 1110 

Ml I II I IM M | Ml 

Db 1036 GDSG ARGGQ GKLLGKPVQMPSLSWPEALPP 1065 

Qy 1111 PPPSSTYGYAQGSPESSRKSSRSAGSGIS-TNQSILNASIHSSSSGGFSAWGVSPQYAV 1168 

Mil M II I I M I : Mill 

Db 1066 PPPSCELSCPEG - PEEELKGSSDLEEWC PPVPEKSHL - - - VGSSSSGA CM 1111 

Qy 1169 ACPPENVYSNPLSAVAGGTQNRYQITPTNQHPPQLPAYF ATTGPGGAVPPN 1219 

I :'! M I M MM III I M 
Db 1112 VAPAPRDTPSPTSSY-GQQSTATLTPSPPDPPQPPTDIPHLHQMPRRVPLGPSS 1164 

Qy 1220 HLPFATQRHAASEYQ- ■ - AGLNAARCAQ SRAC NSCDALAT PS PMQ 1261 

I : : I M III MUM 

Db 1165 - - PLSYSQPALSSHDGRPVGLGAGPVLSYH ASPSPVPSTASSAPGRTRQVTG 1214 

Qy 1262 ■' '— PPPRVPVPE GWYQPVHPNS 1280 

llhl II Mil: 

Db 1215 EMTPPLHGHRARIRKKPKALPYRREHSPGDLPPPPLPPPELRDKLALGSAGSRQHVFPRA 1274 



Qy 1281 
Db 1275 RA- 



II II 
■•QWGEESGAGSAS- 



1328 
I :: III I 
RGPTSSQRGPH 1299 



RESULT 5 
T29549 

hypothetical protein ZK377.3 - Caenorhabditis elegans
C;Species: Caenorhabditis elegans

C;Date: 15-Oct-1999 #sequence_revision 15-Oct-1999 #text_change 18-Feb-2000
C;Accession: T29549
R;Nhan, M.; Hawkins, J.

submitted to the EMBL Data Library, February 1997 
A; Description: The sequence of C. elegans cosmid ZK377. 
A; Reference number: Z20639 
A;Accession: T29549

A;Status: preliminary; translated from GB/EMBL/DDBJ
A; Molecule type: DNA 
A;Residues: 1-423 <NHA> 

A;Cross-references: EMBL:U88183; PIDN:AAB52658.1; GSPDB:GN00028; CESP:ZK377.3
A; Experimental source: strain Bristol N2; clone ZK377 
C; Genetics: 
A;Gene: CESP:ZK377.3
A; Map position: X 

A;Introns: 24/1; 142/3; 229/3; 284/2; 408/3 



Query Match 11.1%; Score 826.5; DB 2; Length 423;

Best Local Similarity 42.8%; Pred. No. 7.4e-35;
Matches 167; Conservative 61; Mismatches 147; Indels 15; Gaps 



Qy 55 SPRIIEHPTDLWKKNEPATLNCKVEGKPEPT-IEWFKDGEPVSTNEKK-SHRVQFKDG 111

: lllil 1:11 : INI! II I hllhll II::: III: I 
Db 29 APVIIEHPIDVWSRGSPATLNC-GAKPSTAKITWYKDGQPVITNKEQVNSHRIVLDTG 86

Qy 112 ALFFYRTMQGR--REQDGGEYWCVAKNRVGQAVSRHASLQIAVLRDDFRVEPKDTRVARG 169

:|| : 111:11 hill I h I 1 1 : : 1 : 1 1 : 1 1 1 1 h : I 
Db 87 SLFLLRVNSGKNGRDSDAGAYYCVASNEHGEVKSNEGSLRLAMLREDFRVRPRTVQALGG 146

Qy 170 ETALLECGPPRGIPEPTLIWIKDGVPLDDLKAMSFGASSRVRIVDGGNLLISNVEPIDEG 229

I hill II: |||:IH It I : III: h I I 

Db 147 EMAVLECS PPRGFPEPWSWRKDD KELRIQDMPRYTLHSDGNLIIDPVDRSDSG 200

Qy 230 NYRCIAQNLVGTRESSYARLIVQVRPYFMREPRDQVMLYGQTATFHCSVGGDPPPRVLWK 289

I ■' I : I 1:1 I h hi I II I :|||| I I II III h: II 
Db 201 TYQCVANNMVGERVSNPARLSVFEKPKFEQEPKDMTVDVGAAVLFDCRVTGDPQPQITWK 260

Qy 290 KEEGNIPVSRARILHDEKSLEISNITPTDEGTYVCEAHNNVGQISARASLIVHAPPNFTK 349

:: :ihll I I : I I : hill III I I I : I I I I llhl 
Db 261 RKNEPMPVTRAYIAKDNRGLRIERVQPSDEGEYVCYARNPAGTLEASAHLRVQAPPSFQT 320

Qy 350 RPSNKKVGLNGWQLPCMASGNPPPSVFWTKEGVSTLMFPN--SSHGRQYVAADGTLQIT 407

:|::: II III: Ihlll hlh h II h III I 

Db 321 KPADQSVPAGGTATFECTLVGQPSPAYFWSREGQQDLLFPSYVSADGRTRVSPTGTLTIE 380

Qy 408 DVRQEDEGYYVCSAFSVVDSSTVRVFLQVS 437

:||| III III: : II : h : 
Db 381 EVRQVDEGAYVCAGMNSAGSSLSRAALKAT 410



RESULT 6 
T29548 

hypothetical protein ZK377.2 - Caenorhabditis elegans
C; Species: Caenorhabditis elegans 

C;Date: 15-Oct-1999 #sequence_revision 15-Oct-1999 #text_change 18-Feb-2000

C;Accession: T29548 
R;Nhan, M.; Hawkins, J.

submitted to the EMBL Data Library, February 1997 
A; Description: The sequence of C. elegans cosmid ZK377. 
A;Reference number: Z20639
A; Accession: T29548 

A; Status: preliminary; translated from GB/EMBL/DDBJ 
A;Molecule type: DNA
A;Residues: 1-874 <NHA> 

A;Cross-references: EMBL:U88183; PIDN:AAB52657.1; GSPDB:GN00028; CESP:ZK377.2
A;Experimental source: strain Bristol N2; clone ZK377
C;Genetics:
A;Gene: CESP:ZK377.2 
A; Map position: X 

A;Introns: 91/2; 356/1; 452/1; 701/3; 746/3; 850/1 



Query Match 10.6%; Score 790; DB 2; Length 874;

Best Local Similarity 27.3%; Pred. No. 1.3e-32;

Matches 254; Conservative 133; Mismatches 377; Indels 166; Gaps 34; 

Qy 442 RPPPIIQIGPANQTLPRGSVATLPCRATGNPSPRIRWFHDGHAVQ-AGNRYSIIQGSSLR 500 

:||| I: I : Mil III llhhl hi I I II : :M II 
Db 27 RPPPTIEHGHQNQTLMVGSSAILPCQASGRPTPGISWLRDGLPIDITDSRISQHSTGSLH 86 

Qy 501 VDDLQLSDSGTYTCTASGERGETSWAATLTVEKPGSTS-LHRAADPSTYPAPPGTPKVLN 559

: lh I : I 111 I I lh:hhllll I : I III :h I I M 
Db 87 IADLKKPDTGVYTCIARNEDGESTWSASLTVEDHTSNAQFVRMPDPSNFPSSPTQPIIVN 146 

Qy 560 VSRTSISLRWARSQERPGAVGPIIGYTVEYFSPDLQTGWIVAAHRVGDTQVTISGLTPGT 619 

h I : I I" : III II ::|:|'li I I h Ml I 
Db 147 VTDTEVELHW--NAPSTSGAGPITGYIIQYYSPDLGQTWFNIPDYVASTEYRIKGLKPSH 204 

Qy 620 SYVFLVRAENTQGISVPSGLSNVIRTIEADFDAASAN---DLSAARTLLTGKS-VELID 674

lhh':|ll|.':|| II I :: I : I :: |:: I II : ::| : 
Db 205 SYMFVIRAENERGIGTPSVSSALVTTSKPAAQVALSDKNKMDMAIAEKRLTSEQLIKLEE 264 

Qy 675 ASAINASAVRLEWMLHVSADERYVEGLRIHYK-DASVPSAQYHSITVMDASAESFWGNL 733

lh:MII I h ::| I :: II : I I Mil II 

Db 265 VKTINSTAVRLFWRRRRL- -EELIDGYYIRWRGPPRTNDNQY- -VNVTSPSTENYWSNL 320 

Qy 734 RKYTKYEFFLTPF--FETIEGQPSNSKTALTYEDVPSAPPDNIQIGMYNQTAGWVRWTP 790 

:| Hlh h :| I llll II I II lh:::| Ml : I 
Db 321 MPFTNYEFFVIPYHSGVHSIHGAPSNSMDVLTAEAPPSLPPEDVRIRMLNLTTLRISWRA 380 

Qy 791 PPSQHHNGNLYGYRIEVSAGNTMKVLANMTLNATTTSVLLNNLTTGAVYSVRLNSFTKAG 850 

I : III I:: II Ml II I :| II I :|: : : I 
Db 381 PKADGINGILRGFQI-VIVGQAPNNNRNITTNERAASVTLFHLVTGMTYKIRVAARSNGG 439 

Qy 851 DGPYSRPISLFMD-PTHHVHPPRAHPSGTHDGRHEGQDLTYH--NNGNIPPGDINPTTHR 907 

I Mill M : I I M 
Db 440 VGVSHGTSEVTMNQDTLEKHLA AQQENESFLYGLINKSHVP 480 

Qy 908 KTTDYLSGPWLMVLVCIVLLVLVISAAISMVYFKRKHQMTKELGHLSWSDNEITALNIN 967 

|:| : :|:: : |: |:: : ::| : :| 
Db 481 ■•VIVIVAILIIFWIIIAYCYWRNSRNSDGKDRSFIKINDGSVHMASNN 528 

Qy 968 SKESLW IDHHRGWRT ADTDKDSGLSESRLLSHVNSSQSNYNNSD - -GGT 1014 

II . : : Ml : : I I ::! M I II 
Db 529 --LWDVAQNPNQNPMYNTAGRMTMNNRNGQALYSLTPNAQDFFNNCDDYSGTMHRPGS 584

Qy 1015 ----DYAEVvDTRNLTTFYNCRKSPDNPTPYATTMIIGTSSSETCTKTTSISADKDSG 1067

lh: ::IH = 1 = 1 = 11111 
Db 585 EHHYHYAQLTGGPGNAMSTF YG - NQYHDDPSPYATTTLV 622 

Qy 1068 THSPYSDAFAGQVPAVPWRSNYLQYPVEPINWSEFLPPPPEHPPPSSTYGYAQGSPESS 1127 

: || : hi II llll lh : I I 
Db 623 LSNQQPA- -WLNDKMLRAPAMPTN PVPPE ■ • PPARYADHTAG - - RRS 663 

Qy 1128 RKSSKSAGSGISTNQSILNASIHSSSSGGFSAWGVSPQYAVACPPENVYSNPLSAVAGGT 1187 

I I II I; II :l :ll : II I Ml 

Db 664 RSSRASDGRG TLNGGLHHRTSGSQRS DSPPHTDVSYVQLHSSDGT 708 

Qy 1188 QNRYQITPTNQHPPQLPAY • FATTGPGGAVPP -NHL- PFATQRHAASEYQAGLNAARCAQ 1244

: M 'Ml lllh Ihl II : 

Db 709 GSSKERTGERSTPPNKTLMDFIPPPPSNPPPPGGHVYDTATRRQ LNRGSTPR 760 

Qy 1245 SRACNSCDALATPSPMQPPPPVPVPEGWYQPVHPNSHPMHPTSSNHQIYQCSSECSDHSR 1304 

: I :| : I h llll : I 

Db 761 EDTYDS-— ; VSDGAFARVDVNA- - -RPTSRNRNL- - -GGRPLKGKR 797 



Qy 1305 SSQSHKRQLQLEEHGSSAKQRGGHHRRRAP 1334

I : I '!::: I h: I : I 
Db 798 DDDSQRSSLMMDDDGGSSEADGENSEGDVP 827
C;Species: Fugu rubripes

C;Date: 02-Sep-2000 #sequence_revision 02-Sep-2000 #text_change 02-Sep-2000
C; Accession: T30532 

R;Riboldi Tunnicliffe, G.R.; Platzer, M.; Nyakatura, G.; Elgar, G.S.; Brenner, S.; Roser 
submitted to the EMBL Data Library, September 1997 
A; Description; Analysis of the genomic loci of Fugu rubripes homologs of the human dise£ 
A;Reference number: Z20848 
A;Accession: T30532 

A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: DNA
A; Residues: 1-1277 <RIB> 

A;Cross-references: EMBL:AF026198; NID:g3098263; PID:g3098264; PIDN:AAC15580.1
C; Genetics: 

A;Introns: 42/1; 47/1; 81/2; 149/1; 190/1; 247/1; 285/2; 347/1; 391/1; 440/1; 477/2; 531

A;Note: L1-CAM



Query Match 9.2%; Score 686; DB 2; Length 1277; 

Best Local Similarity 23.4%; Pred. No. 4e-27; 

Matches 324; Conservative 171; Mismatches 544; Indels 346; Gaps 51; 

Qy 15 TSTTNNPSRSRSSRMWLLPAHLLLVLVASNGLPAVR* - -GQYQS PRIIEHPTDLV 66

^ I II : II II II! I I I I:: II III: 

Db 4 TQRQQGGSRGQWSRCLLL"LLLLPLAAQPGRAAIQIPSSYYISDLKIPPAITTQPESVT 61 

Qy 67 VRRNEPATLNCKVEGRPEPTIEWFRDGEPVSTNERRSHRVQFRDGALFFY- - -RTMQGKK 123 

I I : I: I I II I Mil : :| : |: II II I 
Db 62 VFSVEDLVMRCEASGNPSPIFHWTKDGEEFDPSSDPEMKVTEEAGSSVFYTLSNTMDTLK 121 

Qy 124 EQDGGEYWCVAKNRVGQAVSRHASLQIAVLRDDFRVEPKDTRVAR GETALLECGPP 179 

: 1:1 I I I :l III I I I I I: I: :l : I : I I II 
Db 122 QYQ-GRYICYASNELGTAVSNAAVLMI — DAPPVQQKEKKVTEKAEAGHSIALSCNPP 176 

Qy 180 KGIPEPTLIWIKDGVPLDDLKAMSFGASSRVRIVDGGSLLISNVEPIDEGN-YKCIAQNL 238 

: :| : I: I : I 1 1 : 1 1 1 : 1 : I I I I I I 
Db 177 QSSMQPIIHWM DNRLRHIRLSDRVMVGRDGNLYFANLLTEDSRNDYTCNIQYL 229 

Qy 239 VGTRESSYAKLIVQVKPYFM- -REPRDQVM LYGQTATFHCSVGGDPPPR 285 

: : : | | : : : |:| I III I I I I II 

Db 230 ATRTILAKEPITLTVNPSNLVPRNRRPQMMRPTGSHSTYHALRGQTLELECIVQGLPTPK 289 

Qy 286 VLWKKEEGNIPVSRARILHD--ERSLEISNITPTDEGTYVCEAHNNVGQISARASLIVHA 343 

I I :l :ll I :: h =11: :| I I I I I |: :: I I 

Db 290 VSWLRRDGE--MSESRISKDMFDRRLQFINISESDGGEYQCTAENVQGRTFHTYTVTVEA 347 

Qy 344 PPNFTKRPSNKRVGLNGWQLPCMASGNPPPSVFWTREGVSTLMFPNSSHGRQYVAADGT 403 

Ml:: 1:1 I M I:: III : I |: : |: 
Db 348 SPYWTNAPVSQLYAPGETVRLDCQADGIPSPTIIWTVNGVP-LSATSLEPRRSLTESGS 405 

Qy 404 LQITDVRQEDEGYYVCSAF--SWDSSTVRVFLQVSSVDERPPPIIQIGPANQTLPKG 459
IMIIMI ::: :: I I MM: I :| 

Db 406 LILKDVIFGDTAIYQCQASNRHGTILANTNVYVI ELPPQILTENGNTYTFVEG 458 

Qy 460 SVATLPCRATGNPS PRIRWFHDG - HAVQAGNRYS IIQGSSLRVDDLQLSDSGT YTCTASG 518 

I I I 1:1 I:: I : I I ::: I : :: I I III I 
Db 459 QRALLECETFGSPRPKVTWESSSISLLLADPRVNLLTNGGLEIANVSHDDEGIYTCLVQG 518 

Qy 519 ERGETS WAAT 528 

: hi: I 
Db 519 SNISVNAEVEVLNRTVILSPPQALRLQPGKTAIFTCLYVTDPRLSSPLLQWRKNDQKIFE 578 

Qy 529 LTVERPG STSLHRAADPSTY PAPPGTPRVLN 559 

I : II II II III:! 

Db 579 SHSDRRYTFDGPGLIISNVEPGDEGVYTCQIITRLDMVEASSTLTLCDRPDPPVHLQVTN 638 

Qy 560 VSRTSISLRWARSQEKPGAVGPIIGYTVEYFSPDL-QTGWIVAAHRVGDTQVTISGLTPG 618 

::| I : II: I II: |: : II I : II 

Db 639 AKHRWTLNWTPGDDNN---SPILEYWEFEDQDMKENGWEELKRVAADKKHVNLPLWPY 695 

Qy 619 TSYVFLVRAENTQGISVPSGLSNVIRTIEADFDAASANDLSAARTLLTGKSVELIDASAI 678 

II I I I I II I I! II:: II II :: I: : I I : 

Db 696 MSYRFRVI AINDQGKSDPSRLSDLYRT - PADAPDSNPEDVRSEST DPDTL 744 



Qy 679 KASAVRLEWMLHVSADERYVEGLRIHYRDASVPSAQYHSITVMDASAESFWGNLKKYTR 738 

: :: ■: | ||: : :: :| : I |:| ::: 

Db 745 VITWEEMDRRHFNG PDFRYL VMWRRWGSGPDWHEEYT I • • • APPFIVTDVQNFSA 797 

Qy 739 YEFFLTPFFETIEGQPSNSR T ALT YEDVPSAPPDNIQ IGMYNQT AGWYRWT 789 

:| h III : Mil I |: : : I I II: 

Db 798 FE -IKVQAVKRRGLGPEPDPIIGYSGEDVPLEAPLNLGVLLENSTTIRVTWS 848 

Qy 790 PPPSQHHNGNIYGYRIEVSAGNTMRVLANMTLN ATTTSVLLNNLTTGAVYSVR 842 

: Ml Mil :: |: I I I : II I : 
Db 849 AVDKETVRGHLLGYRIYLTWGHHRNSRAQEPENIVMVQTGANEERKSITNLRPYCHYDLA 908 

Qy 843 LNSFTKAGDGPYSRPISLFMDPTHHVHPPRA — HPSGTHDGRHEGQDLTYHNNGNIPP 898 

:::| I : II |: I II I II : II : ::| I I 

Db 909 ISAFNSKGEGPLSEKTS-FMTPEGVPGPPMSMQMTSPSES EITLHWT— PP 956 

Qy 899 GDINPTTHRRTTDYLSGPWLMVLVCIVLLVLVISAAISMVYFKRRHQMTRELGHLSW-- 956 

I ! I : I : =1 : I II 

Db 957 SRPN • GILLGYSLQYRKMQSDDNPLQWDI 985 

Qy 957 SDNEITALNIUSKESLWIDHHRGWR- —TADTDRDSGLSESRL LSHVNSSQSN 1006 

: II! I :.; :l I :: I I III I I I I I 
Db 986 ASPEITHLTLKG LDRHSHYQFLLMARTAAGRGLSIEILGATTLEGLPPANISLS- 1039 

Qy 1007 YNNSDGGTDYAEVDTRNLT '- - - -TFYNCRKSPDNPTPYATTMIIGTS 1049 

. II : II: f : I : I :: :| 

Db 1040 AEERSVNLSWEARRRHRTVGFQIHYFSRNGTRNGGRWKRTEMVNSSLQFF 1089 

Qy 1050 - SSETCTKTTSISADKDSGTHSPYSDAFAGQVPAVPW 1086 

'; :| II II:: : I : I I I II :: 

Db 1090 QLOGLTPGSHYRLLFTYKNNTFWETEIQTRGTSVTEVQPSFATQGW— FIGWSAV7LL 1146 

Qy 1087 RSNYLQYPVEPINWSEFLPPPPE-HPPPSSTYGYAQGSPESSRRSSKSA 1134 

■ =1 Ml: I I I I 1:1 : I II : ::| 
Db 1147 LLVLLILCFIKRSRGGRYSVKD— REDGPMDSEARPMRDETFGEYR-SLESDLEERRTA 1202 

Qy 1135 i GSGISINQSILNASIHSSSSGGFSAWGVSPQYAVACPPEN 1174 

II I : :: I: I I : I I : I |:| 
Db 1203 SQPSLGEDSRLCSEDNLDFNGSSAVTTELNMDESLASQFSR- • HSEGPEPFHGV- - - PDN 1257 

Qy 1175 VYSNP 1179 

II 

Db 1258 SPLNP 1262 



RESULT 8
I50600

neogenin - chicken (fragment)
C;Species: Gallus gallus (chicken)

C;Date: 13-Sep-1996 #sequence_revision 13-Sep-1996 #text_change 21-M-2000
C;Accession: I50600

R;Vielmetter, J.; Kayyem, J.F.; Roman, J.M.; Dreyer, W.J.
J. Cell Biol. 127, 2009-2020, 1994 

A; Title: Neogenin, an avian cell surface protein expressed during terminal . n .e 
A;Reference number: A55193; MUID: 95105243
A;Accession: I50600

A; Status: preliminary; translated from GB/EMBL/DDBJ 
A; Molecule type: mRNA 
A;Residues: 1-1443 <VIE> 

A;Cross-references: EMBL:U07644; NID:g641965; PIDN:AAC59662.1; PID:g641966



Query Match 9.1%; Score 677; DB 2; Length 1443; 

Best Local Similarity 23.1%; Pred. No. 1.4e-26; 

Matches 304; Conservative 171; Mismatches 541; Indels 302; Gaps 

Qy 157 FRVEPRDTRVARGETALLECG PPRGIPEPTLIWIRDGVPLDDLRAMSFGASSRVR 211

I III I M : :: I III : I III :: : I : 

Db 21 FLVEPMDILSVRGASVIMNCSSYCETPPR IEWRRDGT LLNLVSDDRRQ 68 



212 IVDGGNLLISNV" 
:: IMII::| 



■ - EP IDEGNYKCI AQ - NLVGTRESSYARLI VQVRPY FMREPKDQ 264 
:| III hIM M: I III I II :|: 
Db 69 LLPDGSLLINSWHSKHNKP-DEGYYQCVATVESLGSIVSRTAKLTVAGLPRFTSQPELS 127

Qy 265 VMLYGQTATFHCSVGGDPPPKVLWKREEGNIPVSRARILHDEKSLEISNITPTDEGTYVC 324 

: I :| :| I I I |::: : : :| I I I II I I I 

Db 128 SVYKGNSAILNCEVNVDLAPFVRWEODRQPLSLDDRVFKLPSGALLIGHATDTDGGFYRC 187 

Qy 325 EAHN-NVGQISARASLIVHAPPN FTKRPSN- - KKVGLNGWQLPCMASGNPPPS 375

: : I I I : I I ::lh I I I I Ihl I I I 
Db 188 VIESGGTPRYSEEAELRILPDPEEPQSLVFV1QPSSLTKVTGQNAV--FPCVAGGFPTPY 245 

Qy 376 VFWTKEGVSTLMFPNSSHGRQYVAADGTLQITDVRQEDEGYYVCSAFSWDSSTVRVFLQ 435

I III I : I : I hi 1:11 :|| I I I : |: : I 
Db 246 VRWTRNGEELI- * -TEDSERFALRAGGSLLISDVTEEDVGTYTC IADNENETIEAQ 298 

Qy 436 VSSVDERPPPIIQIGPANQTLPKGSVATLPCRATGNPSPRIKWFHDGHAVQAGNRYSIIQ 495

: II III : I II hi :|| :| I : : h: 
Db 299 AELAVQVPPEFLK-RPANIYAHESMDIVFECEVTGKPTPTVKWVKNGDWIPSDYFKIVK 357 

•496 GSSLRVDDLQLSDSGTYTCTASGERGETSWAATL TVEKPGSTSLHRAAD — 544 
:hl 1111111:1 II . : III I: 
Db 358 EHNLQVLGLVKSDEGFYQCIAENDVGNAOAGAQLIILDLDVAIPTLPPTSLTSMNDHLA 417 

Qy 545 PSTYPAPPGTPK— VLNVSRTSISLRWAKSQEKPGAVGPIIGYTVEYFSPDLQTGWIVA 601 

hi I I: II I II I I : h: I : : 
Db 418 PATTGPLPTAPRDWATLVSTRFIRLTWRTPVSDP--QGDNLTYSIFYTKEGINRERVEN 475 

Qy 602 AHRVGDTQVTISGLTPGTSYVFLVRAENTQGISVPSGLSNVIKTIEADFDAASANDLSAA 661

I hill I I I I III I hi l[ I I I : I 
Db 476 TSRPGETQVMIONLMPETVYVFRWAQNKHlJjGESSAPLKVATQPEVQLPGPAPN 530 

Qy 662 RTLLTGKSVELIDASAINASAVRLEWMLHV^ADEKYVEGLRIHYKDASVPSAQYHSITVM 721 

I I : ::| : :!:::: :::| : I I : 
Db 531 IRAYAGSPTSVTVTWETPL33NGE-IQNYRLYYMEKGQDSEQ DV 573 

Qy 722 DASAESFWGNLKKYTKYEFFLTPFFETIEGQPSNSKTALTYEDVPSAPPDNIQIGMYNQ 781 

I : I: : llllhl I : : : I : I Mil | |: : I 
Db 574 DVAGLSYTITGLKKYTEYSFRWAYNKHGPGVSTQDVWRTLSDVPSAAPQNLTLEARNS 633 

Qy 782 TAGWVRWTPPPSQHHNGNLYGYKI- - "EVSAGNTMKVLANMTLNATTTSVLLNNLTTGAV 838 

: : I III: hi : III :ll I :: | |: I I 
Db 634 RSIMLHWQPPPAGTHSGQITGYRIRYRRVS RKSDVTESVGGTQLFQLI EGLERGTE 689 

Qy 839 YSVRLNSFTKAGDGPYSKPIS - • LFMDPTHHVHPPRAHPSGTHDGRHEGQDLTYHNNGNI 896 

I: I: : I I II : :| i I II I 
Db 690 YNFRIAAMTVNGTGPATDWVSAETFESDLDESRVPEV-PSSLH 731

Qy 897 PPGDINPTTHKKTTDYLSGPWLMVLVCIVLLVLVISAAISMVYFRRKHQMTK-ELGHLS 954
: I II I :| : :: : : :|: 

Db 732 — -VRP LVTSIWSWTPPENQNIWRGYAIGY-- 760 

Qy 955 WSDNEITALNINSKESLWIDHHRGWRTADTDKDSGLSESKLLSHVNSSQSNYNNSDGG- 1013 

: ::: :|: : : I : I II : :|| | 

Db 761 GIGSPHAQTIKVDYKQRYYTIENLDPS SHYVITLKAFNNVGEGI 804 

Qy 1014 TDYAEVDTRNLTTFYNCRKSPDNPTPYATTMI-'IGTSSSEICTKTTSI 1060 

:| :IM : :| I I : h :| :| II 

Db 805 PLYESAVTRPHSDTSEVDLFVI NAPYTPVPDPSPMMPPVGVQASILSHDTIRI 857 

Qy 1061 S-ADKDSGTHSPYSDA FAGQVPAVPWKS NYLQYPVEPINWSEF — 1103 

: II : :M : :|l h =11 ::| II 
Db 858 TWADNSLPKNQRITDARYYTVRWRTNIPANTRYRTANATTLSYLVTGLRPNTLYEFSVMV 917 

Qy 1104 LPPPPEHPPPSSTYGYAQGSPES SRKSSKSAGSGI 1138

I II I :| I : 1:111 
Db 918 IKGRRSSIWSMTAHGTTFELVPTSPPKDVTWSREGKPRTIIVNWQPPSEANGKITGYII 977 

Qy 1139 --STNQSILNASIHSSSSGGFSAWGVSPQYAVACPPENVYSNPLSAVAGGTQNRYQITPT 1196

lh :|| II I : I I I h '-II 
Db 978 YYSTD — VNAEIHD WVIEP WGNRLT HQIQEL 1007 

Qy 1197 NQHPPQLPAYFATTGPGGAVPPNHLPFAIQRHAASEYQAGLNAARCAQSRACNSCDALAT 1256 

I II : I I : :h : :| I : 

Db 1008 TLDTPYYFKIQARNSKGMGPMSEAVQFRTPKAESSD KMPNDQASGSAGKGSR 1059 



Qy 1257 PSPMQP--PPPVPVPEGKYQPVHPNSHPMHPTSSNHQIYQCSSE CS 1300 

I : I lh I: I I II : I h 

Db 1060 PVDVGPDYKPP.LSGSNS PHGSPTSPLDSNMLLVIIVSVGVITIVIWIVAVFCT 1113 

Qy 1301 DHSRS SQS HKRQLQLEEHGSSAKQRGG HHRR RAPWQPCM 1340 

: I I II :|| I :| II I ::| I I 

Db 1114 RRTTSHQKKRRAACKSVNGSH-RYKGNSRDVKPPDLWIHHERLELRPIDKSPDPNPIM 1170 



RESULT 9 
T08851 

Down syndrome cell adhesion protein 1 - human (fragment) 
N;Alternate names: Down syndrome cell adhesion molecule
C;Species: Homo sapiens (man)
C;Date: 11-Jun-1999 #sequence_revision 11-Jun-1999 #text_change 11-Jun-1999
C;Accession: T08851
R;Yamakawa, K.; Huot, Y.K.; Haendel, M.A.; Hubert, R.; Chen, X.N.; Lyons, G.E.; Korenb
submitted to the EMBL Data Library, September 1997
A;Description: DSCAM: A novel member of the immunoglobulin superfamily maps in a dowi,
A;Reference number: Z16495
A;Accession: T08851
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: mRNA
A;Residues: 1-1896 <YAM>
A;Cross-references: EMBL:AF023449; NID:g3169765; PID:g3169766
A;Experimental source: brain; developmental stage: 14 weeks; fetal
C;Genetics:
A;Gene: DSCAM
A;Map position: 21q22
A;Note: derived from alternately-spliced mRNA
C;Function:
A;Description: involved in nervous system development
C;Keywords: alternative splicing



Query Match 8.9%; Score 661; DB 2; Length 1896; 

Best Local Similarity 24.2%; Pred. No. 1.2e-25;

Matches 305; Conservative 175; Mismatches 473; Indels 308; Gaps 60; 

Qy 55 SPRII EHPTDLWKKNEPATLNCKVEGKPEPT IEWFKDGEPVSTNEKKS HRVQ - - "FRDG 111 

:hll ::.ll II :l I hi Mill I :h : III: :l 
Db 392 TPKIISAFSEKWSPAEPVSLMCNVKGTPLPTITWTLDDDPIL--KGGSHRISQMITSEG 449 

Qy 112 ALFFYRTMQGKKEQDGGEYWCVAKNRVGQAVSRHASLQIAVLRDDFRVEP - KDTRVAKGE 170 

: I : '■: :|ll I I I I II I : : I : I h I 
Db 450 NWSYLNISSSQVRDGGVYRCTANNSAG-WLYOARINV---RGPASIRPMKNITAIAGR 505 

Qy 171 TALLECGPPRGIPEPTLIMIRDG-VPLDDLRAMSFGASSRVRIVDGGNLLISNVE-PID 227 

: I M :: I I: :| : :| : I I :|:|: :| 

Db 506 DTYIHC-RVIGYPYYSIKWYKNSNLLPFN HRQVAFENNGI LKLSDVQKEVD 555 

Qy 228 EGNYKC - - I AQNLVGTRESS YAKLI VQVKP YF - - MKEPKDQVMLYGQTATFHC - SVGGDP 282 

II I I : I : I :| : : 1:1 I: : |: : || III 
Db 556 EGEYTCNVLVQPQLSTSQSVH- -VTVKVPPFIQPFEFPRFSI- - -GQRVFIPCVWSGDL 610 

Qy 283 PPRVLWRREEGNIPVSRARILHD— ERSLEISN1TPTDEGIYVCEAHNNVGQISARASL 339 

I : hh MM : : II III:: Mill : :: I 

Db 611 PITITWQKDGRPIPGSLGVTIDNIDFTSSLRISNLSLMHNGNYTCIARNEAAAVEHQSQL 670 

Qy 340 IVHAPPNFIKRPSNRRVGLNG - WQLPCMASGNPPPSVFWT - KEGVSTLMF - PNS SHGRQ 396 

II II I :'l :: h I I M I I I hM :l II : :ll 

Db 671 IVRVPPKFWOPRDQD-GIYGKAVILNCSAEGYPVPTIVWKFSRGAGVPQFQPIALNGRI 729 

Qy 397 YVAADGTLQITDVROEDEGYYVC SAFSWDSSTVR - VFLQVSS VDE RPPPI IQ IG PANQT 455 

I ::hl I ■ I =11 llhl : M : ::| I : Ml I 
Db 730 QVLSNGSLLRHWEEDSGYYLCRVSNDVGADVSKSMYLTV KIPAMITSYPNTTL 784 

Qy 456 LPKGSVATLPCRATGNPSPRIRWFHDGHAVQAG-NRYSIIQG SSLRVDDLQLS 507

:| MM ::| : : IM hh: 
Db 785 ATQGQKKEMSCTAHGEKPIIVRWEKEDRIINPEMARYLVSTKEVGEEVISTLQILPTVRE 844 

Qy 508 DSGTYTCIASGERGETSWAATLTVEKPGSTSLHRAADPSTYPAPPGTPKVLNVSRTSISL 567 



Mon Jan 22 13:04:18 2001 



us-09-540-245a-15.rpr 



Page 8 



III ::| I II III::! I II :: :| :|:|
Db 845 DSGFFSCHAINSYGEDRGIIQLTVQEP--------------PDPPEI-EIKDVKARTITL 889



Qy 568 RWAKSQEKPGAVGPIIGYTVEYFSPDLQTGWIVAAHRVGD TQVTISGLTPGTSYV 622

II : II II :| : I :| I I II : I 

Db 890 RWTMGFD- - -GNSPITGYDIE- -CKNKSDSW-DSAQSTKDVSPQLNSATIIDIHPSSTYS 943 

Qy 623 FLVRAENTQGISVPSGLSNVIKTIEADFDAASANDLSAARTLLTGKSVELIDASAINASA 682 

: hi I II II : II II :ll I I : I:: : 

Db 944 IRMYAKNRIGKSEP— SNEL-TITAD-EMP DGPPQE-VHLEPISSQS 986 

Qy 683 VRLEWMLHVSADEKY VEGLRIHYKDASVPSAQYHS ITVMDASAES--FWGNLKK 735

:|: I I :|: : I :| |:: I :| :| I :| : : II I 
Db 987 IRVTW — KAPKKHLQNGI I RGYQ IGYREYSTGGNFQFNI ISVDTSGDSEVYT LDNLNK 1042 

Qy 736 YTKYEFFLTPFFETIEGQPSNSKTALTYEDVPSAPPDNIQIGMYNQTAGWVRWTPPPSQH 795 

:|: : III lllll ||:|:| : : : |: : 

Db 1043 FTQYGLWQACNRAGTGPSSQEIITTTLEDVPSYPPENVQAIATSPESISISWSTLSKEA 1102 



Qy 796 HNGNLYGYKIEVSAGNTMKVLANMTLNATTT--SVLLNNLTTGAVYSVRLNSFTRAGDGP 853

II I : I I I III |: |: I II::: :ll:llll

Db 1103 LNGILQGFRV-IYWANLMDGELGEIKNITTTQPSLELDGLEKYTNYSIQVLAFTRAGDGV 1161

Qy 854 YSKPISLFMDPTHHVHPPRAHPSGTHDGRHEGQDLTYHNNGNIPPGDINPTTHKKTTDYL 913
I: I I I I hi 
Db 1162 RSEQI- -FTRTKEDVPGP — PAG 1180 

Qy 914 SGPWLMVLVCIVLLVLVISAAISMVYFK RKHQMTKELGHLSWSDNE • • 960 

I :|: III: II: : : :|:|: I 

Db 1181 VKAAAASASMVFVSWLPPLKLNG I IRKYTVFCSHPYPTVI SEFEAS 1226 

Qy 961 ITALNINSKESLWIDHHRGWRTADTDKDSGLSESKLLSHVNSSQSNYNNSDGG 1013

I I,: I : hi: I I I |:|: 
Db 1227 PDSFSYRIPNtSRNRQYSVWV VAVTSAGRG NSSEII 1262 

Qy 1014 T--DYAEVDTRNLT TFYNCRKSPDNPTPYATTM- • IIGTSSSETCTKT 1057 

I I: I II II: :|:l I II I I 

Db 1263 TVEPLAKAPARILTFSGTVTTPWMKDIVLPC-KAVGDPSPAVKWMKDSNGTPSLVTIDGR 1321 

Qy 1058 TSISAD KDSGTHSPYSDAFAGQVPAVPWKSNYLQYPVEPINWSE 1102 

II :: :MI :| : :| II 
Db 1322 RSIFSNGSFIIRTVKAEDSGYYS CIANN NWGSDEIIL 1358 

Qy 1103 --FLPPPPEHP PPSSTYGYAQGSPESSRKSSKSAGSGISTNQSILNASIHSS 1152 

: II: I I I : I Mill: 
Db 1359 NLQVQVPPDQPRLTVSKTTSSSITLSWLPGD NGGSSIRGYILQYSEDNS 1407 

Qy 1153 SSGGFSAWGVSPQYAVACPPENVYSNPLSAVAGGTQNRYQITPTNQHPPQLPAYFATTGP 1212
III I I I I : II :: :| I II

Db 1408 EQ WGSFP- - - -ISPSERSYR- - LENLKCGTWY KFTLT AQN GVGP 1445
Qy 1213 G 1213

I 

Db 1446 G 1446 



RESULT 10 
A34695 

axonal glycoprotein TAG-1 precursor - rat 
C;Species: Rattus norvegicus (Norway rat)
C;Date: 29-Jun-1990 #sequence_revision 29-Jun-1990 #text_change 21-Jan-2000
C;Accession: A34695
R;Furley, A.J.; Morton, S.B.; Manalo, D.; Karagogeos, D.; Dodd, J.; Jessell, T.M.
Cell 61, 157-170, 1990
A;Title: The axonal glycoprotein TAG-1 is an immunoglobulin superfamily member with neui
A;Reference number: A34695; MUID: 90199890
A;Accession: A34695
A;Status: preliminary
A;Molecule type: mRNA
A;Residues: 1-1040 <FDR>
A;Cross-references: GB:M31725; NID:g207148; PIDN:AAA42201.1; PID:g207149
C;Superfamily: contactin; fibronectin type III repeat homology; immunoglobulin homology
C;Keywords: glycoprotein



F;343-399/Domain: immunoglobulin homology <IMM> 



Query Match 8.9%; Score 658; DB 2; Length 1040; 

Best Local Similarity 24.1%; Pred. No. 8.1e-26;

Matches 251; Conservative 145; Mismatches 433; Indels 214; Gaps 36; 

Qy 22 SRSRSSRMWLLPAWLLLVLVA SNGLPAVRGQYQSPRIIEHPTDLWKK* - -NEPAT 74 

:| ::| : I: I : II : I II I I III:: : I 
Db 5 ARKKASLLLLVLATVALVSSPGWSFAQGTPATFG— -PIFEEQPIGLLFPEESAEDQVT 60 

Qy 75 LNCKVEGKPEPTIEWFKDGEPVSTNEKKSHRVQFRDGALFFYRTMQGKKEQDGGEYWCVA 134 

II: Ml: I : I I I I I I II I I hi 
Db 61 LACRARASPPATYRWKMNG- -TDMNLEPGSRHQLMGGNLVI- • -MSPTKTQDAGVYQCLA 115 

Qy 135 KNRVGQAVSRHASLQIAVLRDDFRVEPKDTRVAKGETALLECGPPKGIPEPTLIWI — 190 

I II 1 1 : I h h: : I : :| :| I II I : h 

Db 116 SNPVGTWSKEAVLRFGFLQEFSKEERDPVKTHEGWGVMLPCNPPAHYPGLSYRWLLNEF 175 

Qy 191 KDGVPLDDLKAMSFGASSRVRIVDGGNLLISNVEPIDEGNYKCIAQNLV--GTRE--SSY 246

: :l I ■:! Ill h I III hi : : h I : 

Db 176 PNFIPTDGRHFVS QTTGNLYIARTNASDLGNYSCLATSHMDFSTKSVFSKF 226 

Qy 247 AKLIV -— QVKPYFMKEPKDQVMLYGQTATFHCSVGGDPPPKVLWKKEEGNIP 296 

1:1 : : :l I I : I II I I hi h: hi :|:: 
Db 227 AQLNLAAEDPRLFAPSIKARF- ■ -PPETYALVGQQVTLECFAFGNPVPRIKWRKVDGSLS 283 

Qy 297 VSRARILHDEKSLEISNITPTDEGTYVCEAHNHVGQISARASLIVHAPPNFTKRPSNKKV 356 

I I :hl ::: lllll III h h : : :|| I I : I h : 
Db 284 PQWATA— EPTLQIPSVSFEDEGTYECEAENSKGRDTVQGRIIVQAQPEWLKVISDTEA 340 

Qy 357 GLNGWQLPOIASGNPPPSVFWTKEGVSTLMFPNSSHGRQYVAADGTLQITDVRQEDEGY 416 

: :: I hi I I I I : I I : I I I I I I : : : 1 1 I 
Db 341 DIGSNLRWGCAAAGKPRPMVRWLRNGE PLASQNRVEVLA-GDLRFSKLSLEDSGM 394 

Qy 417 YVCSAFSWDS--STVRVFLQVSSVDERPPPIIQIGPANQTLPKGSVATLPCRATGNPSP 474 

I I I : i :: : :| : I I I: :: II :| :: h I 
Db 395 YQCVAENKHGTIYASAELAVQALAPDFRQNPVRRLIPA — ARGGEISILCQPRAAPKA 450 

Qy 475 RIKWFHDGHAVQAGNRYSIIQGSSLRVDDLQLSDSGTYTCTASGERGETSWAATLTVE-- 532 

II : I :: :| : :: II I III I h : hi 

Db 451 TILWSKGTEIIjGNSTRVTVTSDGTLIIRNISRSDEGKYTCFAENFMGKANSTGILSVRDA 510 

Qy 533 KPGSTSL HRAADPS— TY 548 

I I ■: I : II: h 

Db 511 TRITLAPSSApINVGDHLTLQCHASHDPTMDLTFTWTLDDFPIDFDKPGGHYRRASAKET 570 

Qy 549 PAPPGTPKVLNVSRTSISLRW 569 

I III I :: h: I I 
Db 571 IGDLTILNAWRHGGKYTCMAQTWDGTSKEATVLVRGPPGPPGGVWRDIGDTTVQLSW 630 

Qy 570 AKSQEKPGAVGPIIGYTVEYFSPDLQTGWIVAAHRV — GDTOVT - ISGLTPGTSYVFLV 625 

:: : II lh: :l I : I h : : II I I I I 
Db 631 SRGFDNH- - -SPIAKYTLQARTPPSGKWKQVRTNPVNIEGNAETAQVLGLMPWMDYEFRV 687 

Qy 626 RAENTQGISVPSGLSNVIKTIEADFDAASANDLSAARTLLTGKSVELIDASAINASAVRL 685 

I I I III I: hi II : : : II I III : 
Db 688 SASNILGTGEPSGPSSKIRTKEA-VPSVAPSGLSGG — GGAPGELI I 731 

Qy 686 EWMLHYSADEKYVEGLRIHYKDASVPSAQYHSITVMDASAESFWGN-LKKYTKYEFFL 743 

I II : : :| I : : : | | |: 1 1 1 1 : : || :| : 

Db 732 NW-TPVSREYQNGDGFGYLLSFRRQGSSSWQTARVPGADAQYFVYGNDSIQPYTPFEVKI 790 

Qy 744 TPFFETIEGQPSNSKTALTY-EDVPSAPPDNIQIGMYNQTAGWVRWTPPPSQHHNGNLY 801 

: I lllll 1:1 -I : : : I I I I II I 
Db 791 RSY-NRRGDGPESLTALVYSAEEEPRVAPAKVWAKGSSSSEMNVSW-EPVLQDMNGILL 847 

Qy 802 GYKIEV-SAGRTMKVLANMTLNATTTSVLLNNLTTGAVYSVRLNSFTKAGDGPYSKPISL 860 

lh! lh' : 11:1 I I : :: :M III 
Db 848 GYEIRYWKAGDNEAAADRVRTAGLDTSARVTGLNPNTKYHVTVRAYNRAGTGPASPSADA 907 

Qy 861 FMDPTHHVHPPRAHPSGT- - -HDGRHEGQDLTYHNNGNIPPG 899
II! II :: I : I I
Db 908 MT VRPPPRRPPGNISWTFSSSSLSLKWDPVVPLRNESTVTGYKMLYQN-- 955

Qy 900 DINPTTHKKTTDYLSGPWLMVLV 922 

I I I: : I 
Db 956 DLHPT— PTLHLTSKNWIEIPV 975 



RESULT 11 
I58164

BIG-1 protein - rat
C;Species: Rattus norvegicus (Norway rat)
C;Date: 26-Jul-1996 #sequence_revision 26-Jul-1996 #text_change 24-Sep-1999
C;Accession: I58164
R;Yoshihara, Y.; Kawasaki, M.; Tani, A.; Tamada, A.; Nagata, S.; Kagamiyama, H.; Mori,
Neuron 13, 415-426, 1994
A;Title: BIG-1: a new TAG-1/F3-related member of the immunoglobulin superfamily with nei
A;Reference number: I58164; MUID: 94338697
A;Accession: I58164
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: mRNA
A;Residues: 1-1028 <RES>
A;Cross-references: EMBL:U11031; NID:g563132; PIDN:AAA63607.1; PID:g563133
C;Genetics:
A;Gene: BIG-1
C;Superfamily: contactin; fibronectin type III repeat homology; immunoglobulin homology



Query Match 8.7%; Score 647; DB 2; Length 1028;

Best Local Similarity 23.5%; Pred. No. 2.9e-25; 

Matches 230; Conservative 139; Mismatches 401; Indels 208; Gaps 28; 

Qy 31 LLPAWLLLVLVASNGLPAVRGQYQSPRIIEHPTDLW- - -RKNEPATLNCKVEGRPEPTI 87 

:: :| hM II I I :: |:: : ::: ill: I I I 
Db 1 MMLSWKQLILLSFIGCLAGELLLQGPVFVKEPSNSIFPVGSEDKKITLNCEARGNPSPHY 60 

Qy 88 EWFKDGEPVSTNEKKSHRVQFKDGALFFYRTMQGRKEQDGGEYWCVAKNRVGQAVSRHAS 147 

1:1:1: II : I I I I I I I I : I 1 1 1 I 

Db 61 RWQLNGSDIDTS - - LDHRYKLNGGNLI - - ■ VINPNRNWDTGSYQCFATNSLGTIVSREAK 115 

Qy 148 LQIAVLRDDFRVEPRD-TRVAKGETALLECGPPKGIPEPTLIWIKDGVPLDDLRAMSFGA 206 

II I I ::h : I :h :l III I : |: : I I 
Db 116 LQFAYL-ENFRSRMRSRVSVREGQGVVLLCGPPPHSGELSYAWVFNEYP SFVE 167 



Qy 207 SSRVRIV--DGGNLLISNVEPIDEGNYKCIAQNLVGTRESSYARLIVQVRPYFMR 259

II : 1:1 : III Mill:: : ::: I ::
Db 168 EDSRRFVSQETGHLYIARVEPSDVGNYTCWTSTV TNARVLGSPTPLVLRSDGVM 222

Qy 260 --EPKDQVML YGQTATFHCSVGGDPPPKVLWKKEEGNIPVSRARILHDERSL 309
III :: III hi |:: |:: :| :: :: I 

Db 223 GEYEPKIELQFPETLPAAKGSTVKLECFALGNPVPQINKRRSDGMPFPTKIKLRKFNGVL 282 

Qy 310 EISNITPTDEGTYVCEAHNNVGQISARASLIVHAPPNFIKRPSNKKVGLNGWQLPCMAS 369 

III I 1:1 I M: I: II I :| I :::::: Ml 
Db 283 EIPNFQQEDTGSYECIAENSRGKNVARGRLTYYAKPYWVQLLKDVETAVEDSLYWECRAS 342 

Qy 370 GNPPPSVFWTKEGVSTLM FPNSSHGRQYVAA — 400 

I I II I I M :: Ml Ml 

Db 343 GKPKPSYRWLKNGDALV1EERIQIENGALTIANLNVSDSGMFQCIAENKHGLIYSSAELK 402 

Qy 401 DG 402 

II 

Db 403 VLASAPDFSRNPMKKMIQVQVGSLVILDCKPSASPRALSFWKKGDTWREQARISLLNDG 462 

Qy 403 TLQITDVRQEDEGYYVCSA- - -FSVVDSSTVRVFLQVSSVDERPPPIIQIGPANQTLPRG 459 

M :M I I I I I I : :| I : I M hi : I 

Db 463 GLKIMNVTKADAGIYTCIAENQFGKANGTTQLWTE PTRIILAPSNMDVAVG 514 

Qy 460 SVATLPCRATGNPSPRI-KWFHDGHAV--QAGNRYSIIQGSS— LRVDDLQLSDSGT 511 

III: :| I h :| : h : : III I : ::|| II 
Db 515 ESIILPCQVQHDPLLDIMFAWYFNGTLTDFKKDGSHFEKVGGSSSGDLMIRNIQLRHSGK 574 

Qy 512 YTCTASGERGETSWAATLTVEKPGSTSLHRAADPSTYPAPPGTPKVLNVSRTSISLRWAK 571 

II MM I II III II :M: II : 



Db 575 YVCMVQTGVDSVSSAAELIVR- -GS PGPPENVKVDEITDTTAQLSWTE 620 

Qy 572 SQEKPGAVGPI IG YTVEYFSPDLQTGW IVAAHRVGDTQ - VT I SGLT PGTSYVFLVR 626 

: : hi I I: : 1 1 I I |: |: I I Ml 
Db 621 GTD— SHSPVISYAVQARTP-FSVGWQNVRTVPEAIDGRTRTATWELNPWVEYEFRW 676 

Qy 627 AENTQGISVPSGLSNVIKTIEADFDAA-— SANDLSAARTLLTGRSV--ELIDASAINA 680 

I I I 1:1 I ::| II : I I I : ::| MM 
Db 677 ASNKIGGGEPSLPSERVRTEEAAPEVAPSEVSGGGGSRSELVITWDPVPEELQNGGGF-- 734 

Qy 681 SAVRLEWMLHVSADERYVEGLRIHYRDASVPSAQYHSITVMDASAESFWGNLKRYTKYE 740 

I : :: M :| I I :: :: II 
Db 735 GYWAFRPLGVTTWIQTWTSPDNPRYVFRNESIVPFSPYE 775 

Qy 741 FFLTPFFETIEGQPSNSKTALTYEDVPSAPPDNIQIGMYNQTAGWVRWTPPPSQHHNGNL 800 

: : ' II I I : hh Ml : : II Mill 
Db 776 VKVGVYNNKGEGPFSPVTTVFSAEEEPTVAPSHISAHSLSSSEIEVSWNTIPWKSSNGRL 835 

Qy 801 YGYRI EVSAGNTMRVLAMLNATTTSVLLNNLTTGAVYSVRLNSFTRAGDGP 853 

lh: ■ I : : :|| I IMI M I : :: II II 
Db 836 LGYEVRYWNNGGEEESSSKVKVAGNQ TSAVLRGLKSNLAYYTAVRAYNTAGAGP 889 

Qy 854 YSKPISLFMDPTHHVHPP 871 

:| :: '• I II 
Db 890 FSATVNATTRRTPPSQPP 907 



RESULT 12 
A53449 

plasmacytoma-associated neuronal glycoprotein PANG - mouse
C;Species: Mus musculus (house mouse)
C;Date: 25-Aug-1995 #sequence_revision 25-Aug-1995 #text_change 24-Sep-1999
C;Accession: A53449
R;Connelly, M.A.; Grady, R.C.; Mushinski, J.F.; Marcu, K.B.
Proc. Natl. Acad. Sci. U.S.A. 91, 1337-1341, 1994
A;Title: PANG, a gene encoding a neuronal glycoprotein, is ectopically activated by i
A;Reference number: A53449; MUID: 94151325
A;Accession: A53449
A;Status: preliminary
A;Molecule type: mRNA
A;Residues: 1-1028 <CON>
A;Cross-references: GB:L01991; NID:g200056; PIDN:AAA17403.1; PID:g200057
C;Superfamily: contactin; fibronectin type III repeat homology; immunoglobulin homolo
C;Keywords: glycoprotein



Query Match 8.7%; Score 645; DB 2; Length 1028;

Best Local Similarity 23.4%; Pred. No. 3.6e-25; 

Matches 228; Conservative 142; Mismatches 403; Indels 202; Gaps 27; 

Qy 31 LLPAWLLLVIiVASNGLPAVRGQYQSPRIIEHPTDLW— KKNEPATLNCKVEGKPEPTI 87 

:: :| hh: | | I I |: |:: : ::: III!: | | I 
Db 1 MMLSWRQLIIiLSFIGCLAGELLLQGPVFIKEPSNSIFPVDSEDRKITLNCEARGNPSPHY 60 

Qy 88 EWFKDGEPVSTNERKSHRVQFRDGALFFYRTMQGRKEQDGGEYWCVAKNRVGQAVSRHAS 147 

I :l :M IM II : : I I I I I Ml MM 
Db 61 RWQLNGSDIDTS- -LDHRYRLNGGNLI---VINPNRNWDTGSYQCFATNSLGTIVSREAK 115 

Qy 148 LQIAVLRDDFRVEPKDT-RVAKGETALLECGPPKGIPEPTLIWIKDGVPLDDLKAMSFGA 205 

II I I ::h : I Mi: :| ill I : i: : II 

Db 116 LQFAYL-ENFKTRMRSTVSVREGQGWLLCGPPPHSGELSYAWVFNEYP SFVE 167 

Qy 207 SSRVRIV--D3GNLLISNVEPIDEGNYKCIAQNLVGTRESSYAKLIVQVKPYFMK 259 

1:1 :■ hi I: III Mill:: : ::: | :: 
Db 168 EDSRRLVSQEIGHLYIAKVEPSDVGNYTCWTSTV TNTRVLGSPTPLVLRSDGVM 222 

Qy 260 —EPKDQVML YGQTATFHCSVGGDPPPKVLWKKEEGNIPVSRARILHDEKSL 309 

III :| III hi h: h: :| :: :: I 

Db 223 GEYEPKIEVQFPETLPAAKGSTVRLECFALGNPVPQINWRRSDGMPFPNKIKLRRFNGML 282 

Qy 310 EISNITPTDEGTYVCEAHNNVGQISARASLIVHAPPNFTKRPSNKKVGLNGWQLPCMAS 369 

III I hi I h h II I :| M :::::: I II 

Db 283 EIQNFQQEDTGSYEGIAENSRGKNVARGRLTYYARPYWLQLLRDVEIAVEDSLYWECRAS 342 
Qy 370 GNPPPSVFWTKEGVSTLM FPNSSHGRQYVAA-- 400

I I II I I I : : II I :| 

Db 343 GKPKPSYRWLKNGDALVLEERIQIENGALTITNLNVTDSGMFQCIAENKHGLIYSSAELK 402 

Qy 401 DG 402 

II 

Db 403 WASAPDFSKNPMKKMVQVQV6SLVILDCKPRASPRALSFWKKGDMMVREQARVSFLNDG 462 

Qy 403 TLQITDVRQEDEGYYVCSAFSWDSSTVRVFLQVSSVDERPPPIIQIGPANQTLPKGSVA 462 

hi :| : I I I 1:1 : : I h I I : hi : I 
Db 463 GLKIMNVTKADAGTYTCTAENQFGKANGTTHLWTE PTRI ILAPSNMDVAVGESV 517 

Qy 463 TLPCRATGNPSPRI - - KWFHDGHAV — QAGNRYSI IQG SS — LRVDDLQLSDSGT YTC 514 

III: :l I I: :| : h : : III I : ::ll II I I 
Db 518 ILPCQVQHDPLLDIMFAWYFNGALTDFKKDGSHFEKVGGSSSGDLMIRNIQLRHSGRYVC 577 

Qy 515 TASGERGETSWAATLTVEKPGSTSLHRAADPSTYPAPPGTPKVLNVSRTSISLRWAKSQE 574

I II I I II I II II :: h I I : : 

Db 578 MVQTGVDSVSSftffiLIVR--GS PGPPENVKVDEITDTTAQLSWTEGTD 623 

Qy 575 KPGAVGPIIGYTVEYFSPDLQTGW- -IVAAHRVGDTQ- - -VTISGLTPGTSYVFLVRAEN 629

: 1:111: : I II : II: h II II : I I
Db 624 --SHSPVISYAVQARTP-FSVGWQSVRTVPEVIDGKTHTATWELNPWVEYEFRIVASN 679

Qy 630 TQGISVPSGLSNVIKTIEADFDAA— -SAKDLSAARTLLTGKSV--ELIDASAINASAV 683 

I II I ::| II : I I I : ::| I II : 
Db 680 KIGGGEPSLPSEKVRTEEAAPEIAPSEVSGGGGSRSELVITWDPVPEELQNGGGF 734 

Qy 684 RLEWMLHVSADEKYVEGLRIHYKDASVPSAQYHSITVMDASAESFWGNLKKYTKYEFFL 743 

I : :: I : :| I I :: :: || : 
Db 735 GYWAFRPLGVTTWIQTWTSPDNPRYVFRNESIVPFSPYEVKV 778 

Qy 744 TPFFETIEGQPSNSKTALTYEDVPSAPPDNIQIGMYNQTAGWVRWTPPPSQHHNGNLYGY 803 

: II I I : I: I: I :l : : I I I : Ihl II 
Db 779 GVYNNKGEG PFS PVTTVFS AEEEPTVAPSH I SAHSLSSS EIEVSWNT I PWKLSNGHLLGY 838 

Qy 804 KI EVSAGNTMKVLANMTLNATTTSVLLNNLTTGAVYSVRLNSFTKAGDGPYSK 856 

:: I : :M I II :| I : I : :: II Ihl 
Db 839 EVRYWNNGGEEESSRKVKVAGNQ TSAVLRGLKSNLAYYTAVRAYNSAGAGPFSA 892 

Qy 857 PISLFMDPTHHVHPP 871 

:: I II 
Db 893 TVNATTRKTPPSQPP 907 



RESULT 13 
A49356 

transient axonal glycoprotein TAG-1 precursor - human
N;Alternate names: axonin-1
C;Species: Homo sapiens (man)
C;Date: 20-Feb-1995 #sequence_revision 23-Mar-1995 #text_change 24-Sep-1999
C;Accession: S35508; S28830; A49356
R;Hasler, T.
submitted to the EMBL Data Library, September 1992
A;Reference number: S35508
A;Accession: S35508
A;Molecule type: mRNA
A;Residues: 1-1040 <HAS>
A;Cross-references: EMBL:X68274; NID:g36674; PIDN:CAA48335.1; PID:g36675
R;Hasler, T.H.; Rader, C.; Stoeckli, E.T.; Zuellig, R.A.; Sonderegger, P.
Eur. J. Biochem. 211, 329-339, 1993
A;Title: cDNA cloning, structural features, and eucaryotic expression of human TAG-1/axc
A;Reference number: S28830; MUID: 93145965
A;Accession: S28830
A;Molecule type: mRNA
A;Residues: 1-296,'T',298-1040 <HA2>
A;Cross-references: EMBL:X68274
R;Tsiotra, P.C.; Karagogeos, D.; Theodorakis, K.; Michaelidis, T.M.; Modi, W.S.; Furley,
Genomics 18, 562-567, 1993
A;Title: Isolation of the cDNA and chromosomal localization of the gene (TAX1) encoding
A;Reference number: A49356; MUID: 94140354
A;Accession: A49356
A;Molecule type: mRNA
A;Residues: 1-1001,'V',1003-1040 <TSI>
A;Cross-references: GB:X67734
C;Genetics:
A;Gene: GDB:TAX; TAX1
A;Cross-references: GDB:138782
A;Map position: 1q32-1q32
C;Superfamily: contactin; fibronectin type III repeat homology; immunoglobulin homolo
C;Keywords: cell adhesion; glycoprotein
F;1-28/Domain: signal sequence #status predicted <SIG>
F;29-1040/Product: axonal glycoprotein TAG-1 #status predicted <MAT>
F;254-308/Domain: immunoglobulin homology <IMM1>
F;341-397/Domain: immunoglobulin homology <IMM2>
F;76,198,204,461,477,498,525,775,830,904,918,940/Binding site: carbohydrate (Asn) (co



Query Match 8.7%; Score 644.5; DB 2; Length 1040;

Best Local Similarity 23.7%; Pred. No. 3.9e-25;

Matches 245; Conservative 149; Mismatches 416; Indels 223; Gaps 39; 

Qy 36 LLLV LVASNGLPAVRGQYQS--PRIIEHPTDLWKK---NEPATLNCKVEGRPEPT 86

llll Ihl: : I : I : I :: : I I h II
Db 11 LLLVAAVALVSSSAWSSALGSQTTFGPVFEDQPLSVLFPEESTEEQVLLACRARASPPAT 70

Qy 87 IEWFRDGEPVSTNEKKSHRVQFKDGALFFYRTMQGKKEQDGGEYWCVAKNRVGQAVSRHA 146

I :| : llll I I II I I hi I II III I
Db 71 YRWKMNGTEMRLEPGSRH- -QLVGGNLVI---MNPTKAQDAGVYQCLASNPVGTWSREA 125

Qy 147 SLQIAVLRDDFRVEPKDTRVAKGETALLECGPPRGIPEPTLIWI--KDGVPLDDLKAM 202

I: h: ■: I : :| :| I II I : h : :| I :
Db 126 ILRFGFLQEFSKEERDPVKAHEGWGVMLPCNPPAHYPGLSYRWLLNEFPNFIPTDGRHFV 185

Qy 203 SFGASSRVRIVDGGNLLISNVEPIDEGNYRCIAQKLV- -GTRE- -SSYAKLIV 251 

I . ' III I: I III hi : : h I :hl : 

Db 186 S QTTGNLYIARTNASDLGNYSCLATSHMDFSTKSVFSKFAQLNLAAEDTRL 236 

Qy 252 - • -QVKPYFMKEPRDQVMLYGQTATFHCSVGGDPPPKVLWKKEEGNIPVSRARILHDERS 308 

:| I I : I II I I 1:1 h: hi :|:: I : 

Db 237 FAPSIRARF— PAETYALVGQQVTLECFAFGNPVPRIKWRKVDGSLSPQWTTA— EPT 290 

Qy 309 LEISNITPTDEGTYVCEAHNNVGQISARASLIVHAPPNFTRRPSNRRVGLNGVVQLPCMA 368 

hi ::: HIM 1 1 1 h h : : : 1 1 I I : I h : : :: I I 

Db 291 LQIPSVSFEDEGTYECEAENSKGRDTVQGRIIVQAQPEWLKVISDTEADIGSNLRWGCAA 350 

Qy 369 SGNPPPSVFWTKEGVSTLMFPNSSHGRQYVAADGTLQITDVRQEDEGYYVCSAFSVVDSS 428 

:| I hi I : I : II I I I |: : : 1 1 I I I I :: 

Db 351 AGRPRPTVRWLRNGE PLASQNRVEVLA - GDLRFSKLSLEDSGMYQC VAENK 400 

Qy 429 TVRVFLQVSS\T)ERPPPIIQIGPANQTLP--RGSVATLPCRATGNPSPRIKWFHDGHAVQ 486 

:: • : I :: I : :| :| :||: I : I : 

Db 401 HGTIYASAELAVQALAPDFRLNPVRRLIPAARGGEILIPCQPRAAPKAWLWSRGTEILV 460 

Qy 487 AGNRYSIIQGSSLRVDDLQLSDSGTYTCTASGERGETSWAATLTV 531 

:| :: ;:l : :: III III I h : hi 

Db 461 NSSRVTVTPDGTLIIRNISRSDEGRYTCFAENFMGRANSTGILSVRDATKITLAPSSADI 520 



Qy 532 ---------------------------------EKPG----------------------- 535

:|||



Db 521 NLGDNLTLQCHASHDPTMDLTFTWTLDDFPIDFDRPGGHYRRTNVKETIGDLTILNAQLR 580 



Qy 536 STSLHRAADPSTY PAPPGTPRVLNVSRTSISLRWAKSQEKPGAVGP 581

: I: : I III I :: hi II::: I

Db 581 HGGKYTCMAQTWDSASREATVLVRGPPGPPGGVWRDIGDTTIQLSWSRGFDNH---SP 637

Qy 582 IIGYTVEYFSP DLQTGWIVAAHRVGDTQVT-ISGLTPGTSYVFLVRAENTQGIS 634
I lh: :| ::l h h : : llll I I I I I I

Db 638 IARYTLQARTPPAGKWRQVRTN---PANIEGNAETAQVLGLTPWMDYEFRVIASNILGTG 694

Qy 635 VPSGLSNVIRTIEADFDAASANDLSAARTLLTGRSVELIDASAINASAVRLEWMLHVSAD 694

III I: hi II : : : II llll :l : : h : I
Db 695 EPSGPSSRIRTREA- APSVAPSGLSGG - - - -GGAPGELI - - - -VNWTPMSREYQ--NGD 742

Qy 695 E-RYVEGLR--IHYKDASVPSAQYHSITVMDASAESFWGN--LKRYTKYEFFLTPFF 747
I: I I:: I I! I I: 1 1 I :: II :| : :
Db 743 GFGYLLSFRRQGSTHWQTARVPG--ADAQYFVYSNESVRPYTPFEVKIRSY- 791



Qy 748 ETIEGQPSNSKTALTY--EDVPSAPPDNIQIGMYNQTAGWVRWTPPPSQHHNGNLYGYKI 805

I I III I I: I I : I I I I II I IN

Db 792 -NRRGDGPESLTALVYSAEEEPRVAPTKVWAKGVSSSEMNVTW-EPVOQDMNGILLGYEI 849

Qy 806 EV-SAGNTMKVLANMTLNATTTSVLLNNLTTGAVYSVRLNSFTKAGDGPYSKPISLFMD 863

II: : II :: I II: :: =11 III : I
Db 850 RYWKAGDKEAAADRVRTAGLDTSARVSGLHPNTKYHVTVRAYNRAGTGPASPSANATTMK 909

Qy 864 PTHHVHPPRAHPSG THDGRHEGQDLTYHNNGNIPPGDINP 903

I III I :: | : | |: : : |

Db 910 P PPRRPPGNISWTFSSSSLSIKWDPVVPFRNESAVTGYRMLYQNDLH LTP 959

Qy 904 TTHKRTTDYLSGP 916

II ::: I

Db 960 TLHLTGRNWIEIP 972



RESULT 14 
S26180

neurofascin - chicken
C;Species: Gallus gallus (chicken)
C;Date: 13-Jan-1995 #sequence_revision 13-Jan-1995 #text_change 21-Jan-2000
C;Accession: S26180
R;Volkmer, H.; Hassel, B.; Wolff, J.M.; Frank, R.; Rathjen, F.G.
J. Cell Biol. 118, 149-161, 1992
A;Title: Structure of the axonal surface recognition molecule neurofascin and its relati
A;Reference number: S26180; MUID: 92317154
A;Accession: S26180
A;Status: preliminary
A;Molecule type: mRNA
A;Residues: 1-1272 <VOL>
A;Cross-references: EMBL:X65224; NID:g63659; PIDN:CAA46330.1; PID:g63660
C;Superfamily: neural cell adhesion molecule L1; fibronectin type III repeat homology,
F;279-336/Domain: immunoglobulin homology <IMM>



Query Match 8.7%; Score 644; DB 2; Length 1272; 

Best Local Similarity 21.6%; Pred. No. 5.4e-25; 

Matches 272; Conservative 166; Mismatches 455; Indels 368; Gaps 39; 



Qy 54 QSPRIIEHPT-DLWKKNEPATLNCKVEGKPEPTIEWFKDGEPVSTNEKKSHRVQFRDGA 112

III: I :| : : I: :| | II I ::|: : : :: : I

Db 40 QPPTITKQSVKDYIVDPRDNIFIECEAKGNPVPTFSWTRNGKFFNVAKDPKVSMRRRSGT 99

Qy 113 LFFYRTMQGRKEQDGGEYWCVAKNRVGQAVSRHASLQIAVLRDDFRVEPKD---TRVAK 168

I I: : III I 1:1 I hi II:: : II: I :

Db 100 LVIDFHGGGRPDDYEGEYQCFARNDYGTALSSKIHLQVS---RSPLWPREKVDVIEVDE 155



Qy 169 GETALLECGPPKGIPEPTLIWIRDGV-PLDDLRAMSFGASSRVRIVDGGNLLISNVEPID 227 

I hi II 1:1 I : I: : |: Ml |:| III I 

Db ,156 GAPLSLQCNPPPGLPPPVIFWMSSSMEPIHQDKRVSQG QNGDLYFSNVMLQD 207 

Qy 228 -EGNYKCIAQ- *° NLVGTRESSYARLIVQVRPYFM- ■ 258 

: :| I I: :| : |: : : I II 

Db 208 AQTDYSCNARFHFTHTIQQRNPYTLRVRTRRPHNETSLRNHTDMYSARGVTETTPSFMYP 267 

Qy 259 -REPKDQVMLYGQTATFHCSVGGDPPPRVLWKKEEGNIPVSRARILHDEKSLEISN1TPT 317 

I::| I I I I ::| !: I :| : :: : |:| |||:: 
Db 268 YGTSSSQMVLRGVDLLLECIASGVPAPDIMWYRRGGELPAGRTKLENFNKALRISNVSEE 327 

Qy 318 DEGTYVCEAHNNVGQISARASLIVHAPPNFTRRPSNRRVGLNGWQLPCMASGNPPPSVF 377 

I I I I II :l I I: I I I : I I : :| I hill II: 
Db 328 DSGEYFCLASNKMGSIRHTISVRVRAAPYWLDEPQNLILAPGEDGRLVCRANGNPRPSIQ 387 

Qy 378 W TKEGVSTLMFPNSSHGRQYVAA 400 

I I: I I : l:|: |: I 

Db 388 WLVNGEPIEGSPPNPSREVAGDTIVFRDTQIGSSAVYQCNASNEHGYLLANAFVSVLDVP 447 



Qy 401 -----------------------------------------------------DGTLQIT 407

:|:|:::



Db 448 PRIIAPRNQLIRVIQYNRTRLDCPFFGSPIPTLRWFKNGQGNMLDGGNYKAHENGSLEMS 507 



Qy 408 DVRQEDEGYYVCSAFSWDSSTVRVFLQVSSVDERPPPIIQIGPANQTLPRGSVATLPCR 467

1:11:1 I' I I ::: :| hi : I I II :| : Ml: I II
Db 508 MARREDQGIYTCVATNILGRVEAQVRLEV RDPTRIVRG PEDQWKRG SMPRLHCR 562



Qy 468 ATGNPSPR-IRWFHDGHAVQAGNRYSIIQGSSLRVDDLQLSDSGTYTCTASGERGETSW 525 

: III : I : : I I III || | : | 
Db 563 VKHDPTLKLTVTWLKDDAPLYIGNRMR-REDDGLTIYGVAERDQGDYTCVASTELDKDSA 621 

Qy 526 AATLTVERPGSTSLHRAAD-PSTYPAPPGTPKVLNVSRTSISLRWARSQERPGAVGPIIG 584 

I III : :l I I I I :: ::: I: I I : II 
Db 622 RAYLTVL- - -AIPANRLRDLPRERPDRPRDLELSDLAERSVKLTWIPGDDNN- - -SPITD 675 

Qy 585 YTVEYFSPDLQTG-WIVAAHRVGDTQVTISGLTPGTSYVFLVRAENTQGISVPSGLSNVI 643

I I:: U I : I: : hi :| I I I I I hll I 
Db 676 YIVQFEEDRFQPGTWHNHSRYPGNVNSALLSLSPYVNYQFRVIAVNDVGSSLPSMPSERY 735 

Qy 644 RTIEADFDAASANDLSAARTLLTGKSVELIDASAINASAVRLEWMLHVSADERYVEGLR- 702 

'•I * . I II : I: : : : : I ::| : I II 
Db 736 QT • SGARPEINPTGVQ— GAGTQKNNMEITW-TPLNATQAYGPNLRY 778 

Qy 703 -IHYRDASVPSAQYHSITVMDASAESFWGNLKKYTRYEFFLTPFFETIEGQPSNSKTAL 761 

: :: I ::: II Mill,: I |: :| : 

Db 779 IVRWRRRD-PRGSWYNETV— RAPRHWWNTPIYVPYEIRVQA-ENDFGRAPEPETYI 832 

Qy 762 TY--EDVPSAPPDNIQIGMYNQTAGWVRWT 789 

I II I I I :::! : I II : II 
Db 833 GYSGEDYPKAAPTDVRIRVLNSTAIALTWTRVHLDTIQGQLREYRAYFWRDSSLLRNLWV 892 



Qy 790 - 



■-PPPS-- 

I I 



Db 893 SRRRQYVSFPGDRNRGIVSRLFPYSNYRLEMWTNGRGDGPRSEVREFPTPEGVPSSPRY 952 

Qy 794 QHHNGNLYGYKIEVSAGNTMK — VLANMT LNATTTSVLLN 831 

:| II I II : Ml :: I : I I :| 
Db 953 LRIRQPNLESINLEWDHPEHPNGVLTGYNLRYQAFNGSKTGRTLVENFSPNQTRFTVQRT 1012 

Qy 832 NLTTGAVYSVRLNSFTKAGDGPY SKPISLFMDPTHHVHPPRAHPSGT 878 

: : I I : I: III M :: I I I : I 

Db 1013 DPISR--YRFFLRARTQVGDGEVIVEESPALLNEATPTPASTWLPPPTTELTPAATIATT 1070 

Qy 879 HDGRHEGQDLTYHNNGNIPPGDM PTTHRRTT 910 

I II :l III II 
Db 1071 TTTATPTTETPPTEIPTTAIPTTTTTTTATAASTVASTTTTAERAAAATTKQ 1122 

Qy 911 DYLSGPWLMVLVC ■ IVLLVLVISAAISMVYFRRRHQMTKELGHLSWSDNE 960 

I ■': I : hi I Mil:: : : : l| | I l|: 

Db 1123 ELAYTRNHVDIATQGWFIGLMCAIALLVLIL— LIVCFIKRSR GGKYPVRDNK 1173 

Qy 961 ITALNINSRESLWIDHHRGWRTADTDKDS— GLSESKLLSHVNSSQSNYNNSDGGTDYA 1017 

II I . I :|: ::hl: |:: I : :|: I II 
Db 1174 DEHLNPEDRNV: - EDGSFDYRSLESDEDNRPLPNSQT SLDGT I KQQESD — DSLVDYG 1227 

Qy 1018 E 1018

I 

Db 1228 E 1228 



RESULT 15 
S05479 

neural cell adhesion molecule L1 precursor - mouse
C;Species: Mus musculus (house mouse)
C;Date: 10-Sep-1999 #sequence_revision 10-Sep-1999 #text_change 10-Sep-1999
C;Accession: S05479; B60850; S22167
R;Moos, M.; Tacke, R.; Scherer, H.; Teplow, D.; Frueh, K.; Schachner, M.
Nature 334, 701-703, 1988
A;Title: Neural adhesion molecule L1 as a member of the immunoglobulin superfamily wi
A;Reference number: S05479; MUID: 88318924
A;Accession: S05479
A;Molecule type: mRNA
A;Residues: 1-1260 <MOO>
A;Cross-references: EMBL:X12875; NID:g53336; PIDN:CAA31368.1; PID:g53337
A;Note: the authors translated the codon CCT for residue 166 as Leu, ACT for residue 39f
A;Note: part of this sequence, including the amino end of the mature protein, was confii
R;Rathjen, F.G.; Wolff, J.M.; Frank, R.; Bonhoeffer, F.; Rutishauser, U.
J. Cell Biol. 104, 343-353, 1987
A;Title: Membrane glycoproteins involved in neurite fasciculation.
A;Reference number: A60850; MUID: 87109457
A;Accession: B60850
A;Molecule type: protein
A;Residues: 20-28,'XX',31-36 <RAT>
R;Kohl, A.; Giese, K.P.; Mohajeri, M.H.; Montag, D.; Moos, M.; Schachner, M.
submitted to the EMBL Data Library, December 1991
A;Description: Analysis of promoter activity and 5' genomic structure of the neural cell
A;Reference number: S22167
A;Accession: S22167
A;Molecule type: DNA
A;Residues: 1-165,'L',167-189,'E',191-281,'S',283-395,'S',397-514,'APEKNPVDV',524,'GEGNE
A;Cross-references: EMBL:X63511
C;Genetics:
A;Introns: 26/1; 31/1; 66/2; 133/1; 174/1; 231/1; 268/2; 330/1; 374/1; 422/1; 459/2
A;Note: the list of introns may be incomplete
C;Superfamily: neural cell adhesion molecule L1; fibronectin type III repeat homology;
C;Keywords: alternative splicing; cell adhesion; duplication; glycoprotein; transmembrai
F;1-19/Domain: signal sequence #status predicted <SIG>
F;20-1260/Product: neural cell adhesion molecule #status experimental <MAT>
F;256-313/Domain: immunoglobulin homology <IMM1>
F;440-498/Domain: immunoglobulin homology <IMM2>
F;531-592/Domain: immunoglobulin homology <IMM3>



Query Match 8.6%; Score 641.5; DB 1; Length 1260; 

Best Local Similarity 23.0%; Pred. No. 7.1e-25;

Matches 275; Conservative 167; Mismatches 458; Indels 295; Gaps 48; 

Qy 35 WLLLVLVASNGLP AVRGQYQS PRI I E HPTDLWKKNEPATLNCKVEGKPEPTI 87 

I II I : I : :h ::| I III : :| h hh 

Db 9 WPLL-LCSPCLLIQIPDEYKGHHVLEPPVITEQSPRRLWFPTDDISLRCEARGRPQVEF 67 

Qy 88 EWFKDGEPVSTNEKKS* ■ -HRVQFKDGALFFYRTMQGRK- - -EQDGGEYWCVAKNRVGQA 141 

I III I: I = I: |::| :: I I I I l::l I 

Db 68 RWTKDGIHFRPKEELGVWHEAPY - SGSF — T IEGNNSFAQRFQGIYRCYASNKLGT A 122 

Qy 142 VSRHASLQ1AVLRDDFRVEPKDT-— RVAKGETALLECGPPKGIPEPTLIWIKDGV— 194 

:| :| :: : l|:| I :||: :| I II I : |: ! 
Db 123 MSH— -EIQLVAEGAPKWPKETVKPVEVEEGESWLPCNPPPSAAPPRIYflMNSKIFDI 178 

Qy 195 PLD DLK 200 

1:1 I I 

Db 179 KQDERVSMGQNGDLYFANVLTSDNHSDYICNAHFPGTRTIIQKEPIDLRVKPTNSMIDRK 238 

Qy 201 - - - AMSFGASSRVRIVDGGNLLIS 221
:|||: : I :|::
Db 239 PRLLFPTNSSSRLVALQGQSLILECIAEGFPTPTIKWLHPSDPMPTDRVIYQNHNKILQL 298

Qy 222 -NVEPIDEGNYKCIAQNLVGTRESSYAKLIVQVKPYFMKEPKDQVMLYGQTATFHCSVGG 280 

II hi I 1:1:1 :|: :l : I: ll::::|: : hll I I I 

Db 299 LNVGEEDDGEYTCLAENSLGSARHAY-YVTVEAAPYWLQKPQSHLYGPGETARLDCQVQG 357 

Qy 281 DPPPRVLWKREEGNIPVSRARILHDER SLEISNITPTDEGTYVCEAHNNVGQIS 334 

I h: I: :| : hi II :lh III III I I : 
Db 358 RPQPEITWRIN GMSMETVNRDQRYRIEQGSLILSNVQPTDTMVTQCEARNQHGLLL 413 

Qy 335 ARASL - IVHAPPNFTKRPSNKKVGLNG - WQLPCMASGNPPPSVFWTKEGVSTLMFPNSS 392

I I : :| I - : : : : I I II II III I I :|:: 
Db 414 ANAYIYWQLPARILTKDNQTYMAVEGSTAYLLCRAFGAPVPSVQWLDEEGTTVL Q 469 

Qy 393 HGRQYVAADGTLQITDVRQEDEGYYVCSAFSWDSSTVRVFLQVSSVDERPPPIIQIGPA 452 

I : 1:111 I I:: I I I I I ::: I: III : Ml 
Db 470 DERFFPYANGTLSIRDLQANDTGRYFCQAANDQNNVTILANLQVKEATQ ITQGPR 524 

Qy 453 NQTLPKGSVATLPCRATGNPS-PRIRWFHDGHAVQA— GNRYSIIQGSSLRVDDLQLS 507 

: II: I hi: :H III: ::| |: I : I I 
Db 525 SAIEKKGARVTFTCQASFDPSLQASITWRGDGRDLQERGDSDKY-FIEDGKLVIQSLDYS 583 



Qy 508 DSGTYICTASGERGET-SWAATLTVEKPGSISLHRAADPSTYPAPPGTPKVLNVSRTSIS 566 

I I I:TII I I I I I I II :l : :: : 

Db 584 DQGNYSCVASTELDEVESRAQLLWGSP'GPVPHLELSDRHL LKQSQVH 631 

Qy 567 LRWAKSQEKPGAVGPIIGYTVEYFSPDL-QTGWIVAAHRVGDTQVTISGLTPGTSYVFLV 625 

I h ::: ' II I :|: :: I h I hi I I I 

Db 632 LSWSPAEDHN — SPIEKYDIEFEDKEMAPEKWFSLGKVPGNQTSTTLKLSPYVHYTFRV 688 

Qy 626 RAENTQGISVPSGLSNVIRTIEADFDAASANDLSAA RTLLTGKSVELIDASAI 678 

I I I II :| : I II I I : ::| I : :| :| 

Db 689 TAINKYGPGEPSPVSESWTPEA- - -APEKNPVDVRGEGNETNNMVITWRPLRWMDWNAP 745 

Qy 679 NASAVRLEWMLHVSADEKYVEGLRIHYKDASVPSAQYHSIIVMDASAESFWGNLKKYIK 738 

:: : | : : | || | : 

Db 746 QIQ-YRVQWR PQGKQETWRKQTVSDPFLWSNTSTFVP 782 

Qy 739 YEFFLTPFFETIEGQPSNSKTALTY — EDVPSAPPDNIQIGMYNQIAGWVRWTPPPSQ 794 

II : :': I : :| III h I ::| : III I 

Db 783 YEIKV — QAVNNQGKGPEPQVTIGYSGEDYPQVSPELEDITIFNSSTVLVRWRPVDLA 838 

Qy 795 HHNGNLYGYKI EVSAGNTMKVLANMTLNATTTSVLLNNLTTGAVYSVRLNS 845 

hi II : : | : | ::: : | III :|: I : I I : : 

Db 839 QVKGHLKGYNVTYWWKGSQRKHSKRHIHK--SHIWPANTTSAILSGLRPYSSYHVEVQA 895 

Qy 846 FTKAGDGP YSKPISLFMDPTH HVHPPRAHPSGTHDG — 881 

I I II :| I : I I ! II :! :| 1 

Db 896 FNGRGLGPASEWTFSTPEGV— PGHPEALHLECQSDTSLLLHWQPPLSH-NGVLTGYLL 952

Qy 882 -RH--EGQ-" DLTYHNNGNIPPGDINPTTHKRTTDYLSGPWLMVLVCIVL 926

I lh ' :| II h I h : I II :: 

Db 953 SYHPVEGESKEQLFFNLSDPELRTHNLTNLNP-DLQYRFQLQATTQQGGPGEAIVREGGT 1011

Qy 927 LVLV ISA AISMVYFK RRHQMTRELGHLSWSDNEITALNIN 967

: I III :l II 11:11 I h: :: 

Db 1012 MALFGKPDFGNISATAGENYSWSWVPRKGQCNFRFHILFKALPEGKVSPDHQPQPQYVS 1071 

Qy 968 SKESLWIDHHRGWR-TADTDKDSGLSESKLLSHVNSSQSNYNNSDGGTDYAEVDT 1021 

:| : -\ II : I : hi I ::| II I I 
Db 1072 YNQSSYTQ-— WNLQPDTRYEIHLIKEKVLLHHLDVKTN GTGPVRVST 1116 



Search completed: January 22, 2001, 12:24:13 
Job time: 1930 sec.
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GenCore version 4.5 
Copyright (c) 1993 - 2000 Compugen Ltd.

OM protein - protein search, using sw model

Run on:          January 22, 2001, 12:26:30 ; Search time 162.41 Seconds
                 (without alignments)
                 277.386 Million cell updates/sec

Title:           US-09-540-245A-15
Perfect score:   7427
Sequence:        1 MHPMHPENHAIARSTSTTNN SCLYAEAGEPAPRQMTAKNT 1395

Scoring table:   BLOSUM62
                 Gapop 10.0 , Gapext 0.5

Searched:        88757 seqs, 32294092 residues

Total number of hits satisfying chosen parameters:    88757

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries

Database :       SwissProt.

Pred. No. is the number of results predicted by chance to have a
score greater than or equal to the score of the result being printed,
and is derived by analysis of the total score distribution.
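
The throughput figure in the header can be cross-checked from the other header values. The sketch below is in Python and assumes that one "cell update" is one query-residue versus database-residue comparison in the Smith-Waterman matrix; that reading is an inference from the numbers in this report, and the function name `cell_updates_per_second` is hypothetical, not part of GenCore.

```python
def cell_updates_per_second(query_length, db_residues, search_seconds):
    """Estimate dynamic-programming cell updates per second.

    Assumption: every query residue is compared once against every
    database residue, so the matrix has query_length * db_residues cells.
    """
    return query_length * db_residues / search_seconds


# Values taken from the report header: a 1395-residue query,
# 32294092 database residues, 162.41 seconds of search time.
rate = cell_updates_per_second(1395, 32_294_092, 162.41)
print(f"{rate / 1e6:.3f} Million cell updates/sec")
```

Under that assumption the result agrees with the 277.386 Million cell updates/sec printed in the header.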



SUMMARIES

                   Query
 No.     Score     Match    Length   DB  ID            Description
   1     759.5      10.2      1377    1  NEO1_RAT      P97603 rattus norv
   2     731.5       9.8      1493    1  NEO1_MOUSE    P97798 mus musculu
   3     710.5       9.6      1461    1  NEO1_HUMAN    Q92859 homo sapien
   4     677         9.1      1443    1  NEO1_CHICK    Q90610 gallus gall
   5     661         8.9      2012    1  DSCA_HUMAN    O60469 homo sapien
   6     658         8.9      1040    1  AXO1_RAT      P22063 rattus norv
   7     644.5       8.7      1040    1  AXO1_HUMAN    Q02246 homo sapien
   8     641.5       8.6      1260    1  CAML_MOUSE    P11627 mus musculu
   9     639.5       8.6      1447    1  DCC_MOUSE     P70211 mus musculu
  10     637.5       8.6      1447    1  DCC_HUMAN     P43146 homo sapien
  11     633.5       8.5      1257    1  CAML_HUMAN    P32004 homo sapien
  12     627         8.4      1259    1  CAML_RAT      Q05695 rattus norv
  13     626.5       8.4      1036    1  AXO1_CHICK    P28685 gallus gall
  14     624         8.4      1284    1  NRCA_CHICK    P35331 gallus gall
  15     614.5       8.3      1018    1  CONT_HUMAN    Q12860 homo sapien
  16     611         8.2      1239    1  NRG_DROME     P20241 drosophila
  17     602.5       8.1      1020    1  CONT_MOUSE    P12960 mus musculu
  18     595.5       8.0      1010    1  CONT_CHICK    P14781 gallus gall
  19     547         7.4      4393    1  PGBM_HUMAN    P98160 homo sapien
  20     546         7.4      1897    1  PTPF_HUMAN    P10586 homo sapien
  21     543         7.3      1912    1  PTPD_HUMAN    P23468 homo sapien
  22     540.5       7.3      1266    1  NGCA_CHICK    Q03696 gallus gall
  23     530.5       7.1      2029    1  LAR_DROME     P16621 drosophila
  24     514.5       6.9      1091    1  NCA1_CHICK    P13590 gallus gall
  25     499         6.7      1115    1  NCA1_MOUSE    P13595 mus musculu
  26     497         6.7      3707    1  PGBM_MOUSE    Q05793 mus musculu
  27     480.5       6.5      1070    1  PTK7_HUMAN    Q13308 homo sapien
  28     460         6.2      1088    1  NCA1_XENLA    P16170 xenopus lae
  29     456         6.1      1051    1  PTK7_CHICK    Q91048 gallus gall
  30     449.5       6.1       858    1  NCA1_RAT      P13596 rattus norv
  31     449         6.0       853    1  NCA1_BOVIN    P31836 bos taurus
  32     448         6.0      1092    1  NCA2_XENLA    P36335 xenopus lae
  33     439         5.9       848    1  NCA1_HUMAN    P13591 homo sapien
  34     437         5.9       837       NCM2_MOUSE    O35136 mus musculu
  35     431.5       5.8       761       NCA2_HUMAN    P13592 homo sapien
  36     431         5.8       725       NCA2_MOUSE    P13594 mus musculu
  37     415         5.6       837       NCM2_HUMAN           homo sapien
  38     414.5       5.6      1913       KMLS_HUMAN    Q15746 homo sapien
  39     396.5       5.3      1906       KMLS_CHICK    P11799 gallus gall
  40     355         4.8       811       FS22_DROME    P34083 drosophila
  41     353.5       4.8       873       FS21_DROME    P34082 drosophila
  42     349         4.7      2481       UN52_CAEEL    Q06561 caenorhabdi
  43     323         4.3       898       FAS2_SCHAM    P22648 schistocerc
  44     314         4.2      1367       VGR2_MOUSE    P35918 mus musculu
  45     314         4.2      1694       SN_MOUSE      Q62230 mus musculu
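
The Query Match column in the summary listing is numerically consistent with the hit score expressed as a percentage of the perfect score (7427 for this query). The Python sketch below illustrates that relationship; it is inferred from the figures in this report rather than taken from GenCore documentation, and the function name `query_match_percent` is hypothetical.

```python
def query_match_percent(score, perfect_score):
    """Return a hit score as a percentage of the query's perfect score.

    Assumption: "Query Match" is score / perfect score, which reproduces
    the percentages printed in this report when rounded to one decimal.
    """
    return 100.0 * score / perfect_score


PERFECT_SCORE = 7427  # "Perfect score" from the report header

# A few (score, reported Query Match) pairs from the summary listing.
for score, reported in [(759.5, 10.2), (731.5, 9.8), (314, 4.2)]:
    computed = round(query_match_percent(score, PERFECT_SCORE), 1)
    print(score, computed, reported)
```

The same check can be run on any other row of the listing by supplying its score.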



RESULT 1
NEO1_RAT
ID   NEO1_RAT       STANDARD;      PRT;  1377 AA.
AC   P97603;
DT   01-OCT-2000 (Rel. 40, Created)
DT   01-OCT-2000 (Rel. 40, Last sequence update)
DT   01-OCT-2000 (Rel. 40, Last annotation update)
DE   NEOGENIN PRECURSOR (FRAGMENT).
GN   NEO1 OR NGN.
OS   Rattus norvegicus (Rat).
OC   Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC   Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Rattus.
RN   [1]
RP   SEQUENCE FROM N.A.
RC   TISSUE=BRAIN;
RX   MEDLINE=97015074; PubMed=8861902;
RA   Keino-Masu K., Masu M., Hinck L., Leonardo E.D., Chan S.S.-Y.,
RA   Culotti J.G., Tessier-Lavigne M.;
RT   "Deleted in Colorectal Cancer (DCC) encodes a netrin receptor.";
RL   Cell 87:175-185(1996).
CC   -!- FUNCTION: MAY BE INVOLVED AS A REGULATORY PROTEIN IN THE
CC       TRANSITION OF UNDIFFERENTIATED PROLIFERATING CELLS TO THEIR
CC       DIFFERENTIATED STATE. MAY ALSO FUNCTION AS A CELL ADHESION
CC       MOLECULE IN A BROAD SPECTRUM OF EMBRYONIC AND ADULT TISSUES.
CC   -!- SUBCELLULAR LOCATION: TYPE I INTEGRAL MEMBRANE PROTEIN.
CC   -!- SIMILARITY: CONTAINS 4 IMMUNOGLOBULIN-LIKE C2-TYPE DOMAINS.
CC   -!- SIMILARITY: CONTAINS 6 FIBRONECTIN TYPE III-LIKE DOMAINS.
CC   -!- SIMILARITY: BELONGS TO THE IMMUNOGLOBULIN SUPERFAMILY. STRONG, TO
CC       TUMOR SUPPRESSOR PROTEIN DCC.
CC
CC   This SWISS-PROT entry is copyright. It is produced through a collaboration
CC   between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC   the European Bioinformatics Institute. There are no restrictions on its
CC   use by non-profit institutions as long as its content is in no way
CC   modified and this statement is not removed. Usage by and for commercial
CC   entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC   or send an email to license@isb-sib.ch).
CC
DR   EMBL; U68726; AAB41100.1; -.
DR   HSSP; P56276; HLK.
DR   INTERPRO; IPR001777; -.
DR   INTERPRO; IPR003006; -.
DR   PFAM; PF00041; fn3; 6.
DR   PFAM; PF00047; ig; 4.
KW   Transmembrane; Immunoglobulin domain; Glycoprotein; Signal.
FT   NON_TER       1      1
FT   SIGNAL       <1      2       POTENTIAL.
FT   CHAIN         3   1377       NEOGENIN.
FT   DOMAIN        3   1074       EXTRACELLULAR (POTENTIAL).
FT   TRANSMEM   1075   1095       POTENTIAL.
FT   DOMAIN     1096   1377       CYTOPLASMIC (POTENTIAL).
FT   DOMAIN       36    105       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      135    197       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      232    296       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      324    386       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      405    502       FIBRONECTIN TYPE-III.
FT   DOMAIN      505    598       FIBRONECTIN TYPE-III.
FT   DOMAIN      599    698       FIBRONECTIN TYPE-III.
FT   DOMAIN      704    798       FIBRONECTIN TYPE-III.
FT   DOMAIN      819    919       FIBRONECTIN TYPE-III.
FT   DOMAIN      920   1021       FIBRONECTIN TYPE-III.
FT   DOMAIN     1087   1090       POLY-VAL.
FT   CARBOHYD     42     42       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    179    179       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    295    295       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    439    439       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    458    458       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    608    608       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    684    684       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    878    878       N-LINKED (GLCNAC...) (POTENTIAL).
SQ   SEQUENCE   1377 AA;  150637 MW;  E514ED8ABD1A63A9 CRC64;



Query Match 10.2%; Score 759.5; DB 1; Length 1377; 

Best Local Similarity 23.1%; Pred. No. 1e-30;

Matches 328; Conservative 179; Mismatches 544; Indels 371; Gaps 55; 

Qy 157 FRVEPKDTRVAKGETALLECGPPKGIPEPTLIWIKDGVPLDDLKAMSFGASSRVRIVDGG 216
I III II H : :| I I I : I III : I ::: I 

Db 24 FLVEPVDTLSVRGSSVILNCS-AYSEPSPNIEWKKDGT------FLNLVSDDRRQLLPDG 76

Qy 217 NLLISNV EPIDEGNYKCIA--QNLVGTRESSYAKLIVOVKPYFMKEPKDQVMLY 268 

:| Nil :| III hhl II II I III I II :|: : 
Db 77 SLFISNWHSKHNKP-DEGFYQCVATVDNL-GTIVSRTAKLAVAGLPRFTSQPEPSSIYV 134 

Qy 269 GQTATFHCSVGGDPPPKVLWKKEEGNIPVSRARILHDEK SLEISNITPTDEGT 321 

: :| I I I I I:: :| :| |:: :| III I III 
Db 135 GNSGILNCEVNADLVPFVRWEQ NRQPLLLDDRIVKLPSGTLVISNATEGDEGL 187 

Qy 322 YVC EAHNNVGQISARASLIVHAPPNFTKRPSN--KKVGLNGWQLPCM 367 

II II I I I :l I III: I :| : I lit: 

Db 188 YRCIVESGGPPKFSDEAELKVLQESEEMLDLV FLMRPSSMIKVIGQSAV--LPCV 240 

Qy 368 ASGNPPPSVFWTKEGVSTLMFPNSSHGRQYVAADGTLQITDVRQEDEGYYVCSAFSVVDS 427 

III I I = II : : I II : I I : I : I : I ! ::| I I I I |: 

Db 241 ASGLPAPVIRWMK— NEDVLDTESSGRLALLAGGSLEISDVTEDDAGTYFC-— VADN 293 

Qy 428 STVRVFLQVSSVDERPPPIIQIGPANQTLPKGSVATLPCRATGNPSPRIKWFHDGHAVQA 487 

: I : II :: III : I II hi :|| :| I 
Db 294 GNKTIEAQAELTVQVPPEFLK-QPANIYARESMDIVFECEVTGKPAPTVKTONGDWIP 352 

Qy 488 GNRYSIIQGSSLRVDDLQLSDSGTYTCTASGERGETSWAATLTVEKPGSTSLHRAADPST 547 

: : h: I II I I I I : I I I : : II 
Db 353 SDYFKIVKEHNLQVLGLVKSDEGFYQCIAENDVGNAQAGAQLIILE HAPATTGP 406 

Qy 548 YPAPPGTPKVLNVSRTSISLRWAKSQEKPGAVGPIIGYTVEYFSPDLQTGWIVAAHRVGD 607
I: I II I I I I I : |:| I : : : |: 
Db 407 LPSAPRDWASLVSTRFIKLTWRTPASDPH--GDNLTYSVFYTKEGVARERVENTSQPGE 464

Qy 608 TQVT ISGLTPGTSYVFLVRAENTQGISVPSGLSNVIKT IEADFDAASANDLSAARTLLTG 667 

INI I I I 1:1 I 1:1 I II I: :l I 
Db 465 MQVTIQNLMPATVYIFKVMAQNKHG— -SGESSAPLRVE TQ 502 

Qy 668 KSVEL IDASAINASAVRLEWMLHVSADEKYVEGLRIHYKDASVPSAQYHSITVM 721 

1:1 | | I : ::: : I :| : : :: :::| : I : 
Db 503 PEVQLPGPAPNIRAYATSPTSITVTWETPLSGNGE - IQNYKLYYMEKGTDKEQ DV 556 

Qy 722 DASAESFWGNLKKYTKYEFFLTPFFETIEGQPSNSKTALTYEDVPSAPPDNIQIGMYNQ 781 

I I: I: : llllhl I : : : I : I Mill I |: : : I 
Db 557 DVSSHSYTINGLKKYTEYSFRVWNKHGPGVSTQDVAVRTLSDVPSAAPQNLSLEVRNS 616 

Qy 782 TAGWVRWTPPPSQHHNGNLYGYKIEVSAGNTMKVLANMTLNATTTSVLLNNLTTGAVYSV 841 

: : I II I II : III! : : = I I I: I I I: 
Db 617 KSIVIHWQPPSSATQNGQITGYKIRYRKASRKSDVTETWTGTQLSQLIEGLDRGTEYNF 676 

Qy 842 RLNSFTKAGDGPYSKPIS-LFMDPTHHVHPPRAHPSGTHDGRHEGQDLTYHNNGNIPPG 899 

, |: : I I II : : I II 
Db 677 RVAALTVNGTGPATDWLSAETFESDLDESRVPEV-PSSLH 715



Qy 900 DINPTTHKKTTDYLSGPWLMVLVCIVLLVLVISAAISMVYFKRKHQMTK-ELGHLSWS 957 

: I I II I :l : :: : : ;|; 
Db 716 -VRP LVTSIWSWTPPENQNIWRGYAIGY 744 



Qy 958 DNEITALNINSKESLWIDHHRGWRTADTDKDSGLSESKLLSHVNSSQSNYNNSDGG---- 1013

: :' ::: :|: : : I : I II : :|| I

Db 745 G IGS PHAQT I KVDYKQRYYT IENLDPS SHY VITLKAFNNVG EG I PL Y 791



Qy 1014 TDYAEVDTRNLTTFYNCRKSPDNPTPYATTMI-IGTSSSETCTKTTSIS-A 1062 

ll::|| : :| I I I |: :| :| I |: I 

Db 792 ESAVTRPHTDTSEVDLFVI NAPYTPVPDPTPMMPPVGVQASILSHDTIRITWA 844 

Qy 1063 DKDSGTHSPYSDA FAGQVPAVPWKS NYLQYPVEPINWSEF 1103 

I I' :|: : =11 h :|| ::| II 
Db 845 DNSLPKHQKITDSRYYTVRWKTNIPANTKYKNANATTLSYLVTGLKPNTLYEFSVMVTKG 904 

Qy 1104 - - ' -LPPPPEHPPPSSTYGYAQGSPES SRKSSKSAGSGI-S 1139 

I II I :| I : I : I I I I 

Db 905 RRSSTWSMTAHGATFELVPTSPPKDVTWSKEGKPRTIIVNWQPPSEANGKITGYIIYYS 964 

Qy 1140 TNQSILNASIHSSSSGGFSAWGVSP QYAVACPPENVYSNPLSA 1182 

I: :ll Li I : I III: 

Db 965 TD — VNAEIHD WVIEPWGNRLTHQ IQELTLDT PY YFKIQARN - - SKGMG P 1011 

Qy 1183 VAGGTQNRYQTTPTNQHPPQLP-AYFATTGPGGAVP PNHLP — 1222 

:: I I II :: : I || :| |: I 

Db 1012 MSEAVQFR— TPKADSSDKMPNDQALGSAGKGGRLPDLGSDYKPPMSGSNSPHGSPTSP 1068 

Qy 1223 : FATQRHAASEYQAGLNAARCAQSRACNS 1250 

I Ml : : : : II I 

Db 1069 LDSNMLLVIIVSIGVITIWWIIAVFCTRRTTSHQKK KRAACKSVNGSHK 1119 

Qy 1251 — CDALATPS PMQP-— PPPVPVPEGWYQPVHPNSHPMHPTSSNHQIYQ 1294 

I : ,1' ::l I I II I: II : I 
Db 1120 YKGNCKDVKPPDLWIHHERLELKPIDKSPDPNPVMTD-TPIPRNSQDITPV 1169 

Qy 1295 CSSECSDHSRSSQSHKRQLQLEEHGSS AKQRGGHHRRRAPV-VQPCMESENENM 1347 

1:1 ;l 1:1: II I :|| : I II :| 
Db 1170 DNSMDSNIHQRRNSYRGHESEDSMSTLAGRRGMRPKMMMPFDSQPPQQSVRNTP 1223 

Qy 1348 LAEYEQRQYTSDCCNSSR--EGDTCSCSEGSCLYAEAGEPAP 1387 

: !'■ II : II I I I ::h I 
Db 1224 STDTMPASSSQTCCTDHQDPEGATSSSYLASSQEEDSGQSLP 1265 
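
The statistics printed ahead of the alignment above (Matches 328; Conservative 179; Mismatches 544; Indels 371) are consistent with Best Local Similarity being the share of identical positions among all aligned columns. The Python sketch below shows that arithmetic; the formula is inferred from the figures in this report, not from GenCore documentation, and the function name `best_local_similarity` is hypothetical.

```python
def best_local_similarity(matches, conservative, mismatches, indels):
    """Percentage of identical positions over all alignment columns.

    Assumption: the column count is the sum of the four printed
    counters (matches, conservative, mismatches, indels).
    """
    columns = matches + conservative + mismatches + indels
    return 100.0 * matches / columns


# Counters from the first three SwissProt results in this report,
# each paired with the Best Local Similarity printed beside them.
reported = {
    (328, 179, 544, 371): 23.1,
    (325, 181, 538, 395): 22.6,
    (313, 184, 564, 341): 22.3,
}
for counts, printed in reported.items():
    print(round(best_local_similarity(*counts), 1), printed)
```

Gaps are counted separately in the report and do not enter this ratio under the stated assumption.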



RESULT 2
NEO1_MOUSE
ID   NEO1_MOUSE     STANDARD;      PRT;  1493 AA.
AC   P97798;
DT   01-OCT-2000 (Rel. 40, Created)
DT   01-OCT-2000 (Rel. 40, Last sequence update)
DT   01-OCT-2000 (Rel. 40, Last annotation update)
DE   NEOGENIN PRECURSOR.
GN   NEO1 OR NGN.
OS   Mus musculus (Mouse).
OC   Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC   Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
RN   [1]
RP   SEQUENCE FROM N.A., AND ALTERNATIVE SPLICING.
RC   TISSUE=BRAIN;
RX   MEDLINE=97407661; PubMed=9264410;
RA   Keeling S.L., Gad J.M., Cooper H.M.;
RT   "Mouse neogenin, a DCC-like molecule, has four splice variants and is
RT   expressed widely in the adult mouse and during embryogenesis.";
RL   Oncogene 15:691-700(1997).

CC   -!- FUNCTION: MAY BE INVOLVED AS A REGULATORY PROTEIN IN THE
CC       TRANSITION OF UNDIFFERENTIATED PROLIFERATING CELLS TO THEIR
CC       DIFFERENTIATED STATE. MAY ALSO FUNCTION AS A CELL ADHESION
CC       MOLECULE IN A BROAD SPECTRUM OF EMBRYONIC AND ADULT TISSUES.
CC   -!- SUBCELLULAR LOCATION: TYPE I INTEGRAL MEMBRANE PROTEIN.
CC   -!- ALTERNATIVE PRODUCTS: AT LEAST 5 ISOFORMS; 1 (SHOWN HERE), 2, 3, 4
CC       AND 5; ARE PRODUCED BY ALTERNATIVE SPLICING. THE EXPRESSION OF
CC       ISOFORMS 3, 4 AND 5 ARE DEVELOPMENTALLY REGULATED.
CC   -!- TISSUE SPECIFICITY: WIDELY EXPRESSED.
CC   -!- DEVELOPMENTAL STAGE: EXPRESSED UBIQUITOUSLY THROUGHOUT THE MID TO
CC       LATE STAGES OF GESTATION AND IN ADULT TISSUES. STRONG EXPRESSION
CC       IS OBSERVED IN THE VENTRAL REGION OF THE VENTRICULAR ZONE OF THE
CC       E15.5 MOUSE NEURAL TUBE, AS WELL AS IN THE VENTRICULAR ZONES OF
CC       THE MESENCEPHALON AND RHOMBENCEPHALON. ISOFORMS 3 AND 4 ARE
CC       EXPRESSED AT HIGHER LEVEL COMPARED TO OTHER ISOFORMS BETWEEN E11.5
CC       AND E16.5.
CC   -!- SIMILARITY: CONTAINS 4 IMMUNOGLOBULIN-LIKE C2-TYPE DOMAINS.
CC   -!- SIMILARITY: CONTAINS 6 FIBRONECTIN TYPE III-LIKE DOMAINS.
CC   -!- SIMILARITY: BELONGS TO THE IMMUNOGLOBULIN SUPERFAMILY. STRONG, TO
CC       TUMOR SUPPRESSOR PROTEIN DCC.
CC
CC   This SWISS-PROT entry is copyright. It is produced through a collaboration
CC   between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC   the European Bioinformatics Institute. There are no restrictions on its
CC   use by non-profit institutions as long as its content is in no way
CC   modified and this statement is not removed. Usage by and for commercial
CC   entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC   or send an email to license@isb-sib.ch).
CC
DR   EMBL; Y09535; CAA70727.1; -.
DR   HSSP; P02751; 1TTG.
DR   MGD; MGI:1097159; NEO1.
DR   INTERPRO; IPR001777; -.
DR   INTERPRO; IPR003006; -.
DR   PFAM; PF00041; fn3; 6.
DR   PFAM; PF00047; ig; 4.
DR   PRINTS; PR00014; FNTYPEIII.
KW   Transmembrane; Immunoglobulin domain; Glycoprotein; Signal;
KW   Alternative splicing.




FT   SIGNAL        1     36       POTENTIAL.
FT   CHAIN        37   1493       NEOGENIN.
FT   DOMAIN       37   1136       EXTRACELLULAR (POTENTIAL).
FT   TRANSMEM   1137   1157       POTENTIAL.
FT   DOMAIN     1158   1493       CYTOPLASMIC (POTENTIAL).
FT   DOMAIN       78    147       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      177    239       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      274    338       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      366    428       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      467    564       FIBRONECTIN TYPE-III.
FT   DOMAIN      567    660       FIBRONECTIN TYPE-III.
FT   DOMAIN      661    760       FIBRONECTIN TYPE-III.
FT   DOMAIN      766    860       FIBRONECTIN TYPE-III.
FT   DOMAIN      881    981       FIBRONECTIN TYPE-III.
FT   DOMAIN      982   1083       FIBRONECTIN TYPE-III.
FT   DOMAIN     1149   1153       POLY-VAL.
FT   CARBOHYD     84     84       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    221    221       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    337    337       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    501    501       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    520    520       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    670    670       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    746    746       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    940    940       N-LINKED (GLCNAC...) (POTENTIAL).
FT   VARSPLIC    442    461       MISSING (IN ISOFORM 2).
FT   VARSPLIC    863    878       MISSING (IN ISOFORM 3).
FT   VARSPLIC   1086   1096       MISSING (IN ISOFORM 4).
FT   VARSPLIC   1279   1331       MISSING (IN ISOFORM 5).
SQ   SEQUENCE   1493 AA;  163159 MW;  441DE919D5E17C0E CRC64;



Query Match 9.8%; Score 731.5; DB 1; Length 1493; 

Best Local Similarity 22.6%; Pred. No. 2.9e-29;

Matches 325; Conservative 181; Mismatches 538; Indels 395; Gaps 57; 

Qy 124 EQDGGEYWCVARNR VGQAVSRHASLQIAVLRDD 156 

I:: I I : :| :|: II:: I 

Db 4 EREAGRLLCTSSSRRCCPPPPLLLLLPLLLLLGRPASGAAATKSGPRRQSQGASVRTFTP 63 

Qy 157 - - FRVEPRDTRVAKGETALLECGPPKGIPEPTLIWIKDGVPLDDLKAMSFGASSRVRIVD 214 

I III II :| : :| I I I : I III :: : I ::: 

Db 64 FYFLVEPVDTLSVRGSSVILNCS'AYSEPSPNIEWRKDGT FLNLESDDRRQLLP 116 



Qy 215 GGNLLISNV--,— -EPIDEGNYKCIA-QNLVGTRESSYAKLIVQVKPYFMKEPKDQVM 266 

hi llll ' M III 1:1: || II I III I II :|: : 
Db 117 DGSLFISNWHSRHNRP-DEGFYQCVATVDNL-GTIVSRTAKLTVAGLPRFTSQPEPSSV 174 

Qy 267 LYGQTATFHCSVGGDPPPRVLWRKEEGNIPVSRARILHDEK SLEISNITPTDE 319 

I :| :l I I III:: :l =1 I" :l III I I 
Db 175 YVGNSAILNCEVNADLVPFVRWEQ NRQPLLLDDRIVKLPSGTLVISNATEGDG 227 

Qy 320 GTYVCEAHN-NVGQISARASLIVHAPPN FTRRPSN- -KKVGLNGWQLPCMASG 370 

III : . : M I I I III: I hi III: II 
Db 228 GLYRCIVESGGPPKFSDEAELKVLQDPEEIVDLVFLMRPSSMMKVTGQSAV-LPCWSG 285 

Qy 371 NPPPSVFWTKEGVSTLMFPNSSHGRQYVAADGTLQITDVRQEDEGYYVCSAFSWDSSTV 430 

Mill'.': : 111:11 l:|:|| ::| I I I : h 
Db 286 LPAPWRWMR— NEEVLDTESSGRLVLLAGGCLEISDVTEDDAGTYFC— -IADNGNK 338 

Qy 431 RVFLQVSSVDERPPPIIQIGPANQTLPRGSVATLPCRATGNPSPRIKWFHDGHAVQAGNR 490 

I I : II :: III : I II hi :ll :l I : 
Db 339 TVEAQAELTVQVPPGFLK-QPANIYAHESMDIVFECEVTGRPTPTVRWVRNGDWIPSDN 397 

Qy 491 YSIIQGSSLRVDDLQLSDSGTYTCTASGERGETSWAATLTV EKPGST 537 

: I:: :|:| I II I I I I : I I I : :| 
Db 398 FKIVKEHNLQVLGLVRSDEGFYQCIAENDVGNAQAGAQLIILEHDVAIPTLPPTSLTSAT 457 

Qy 538 SLHRAADPSTYPAPPGTPRVLNVSRTS- - -ISLRWARSQEKPGAVGPIIGYTVEYFSPDL 594 

: I I 1:1 I I: : I I I II I I : |:| I : 
Db 458 TDHLA-PATTGPLPSAPRDWASLVSTRFIKLTWRTPASDPH-GDNLTYSVFYTKEGV 513 

Qy 595 QTGWIVAAHRVGDTQVTISGLTPGTSYVFLVRAENTQGISVPSGLSNVIKTIEADFDAAS 654 

: llll MINI hi I II |: :l 
Db 514 DRERVENT SQPGEMQVT IQNLMPATVY I FKVMAQNKHG — SGESSAPLRVE 562 

Qy 655 ANDLSAARTLLTGRSVEL IDASAINASAVRLEWMLHVSADEKYVEGLRIHYKDA 708 

I 1:1 I I I : ::: : I :| : : :: :::| : 
Db 563 -TQPEVQLPGPAPNIRAYATSPTSITVTWETPLSGNGE-IQNYKLYYMEK 610 

Qy 709 SVPSAQYHSITVMDASAESFWGNLKKYTKYEFFLTPFFETIEGQPSNSKTALTYEDVPS 768 

I ' :l I: h : HIM M : : I : I Ml 
Db 611 GTDREQ-— '-DIDVSSHSYTINGLRKYTEYSFRWAYNRHGPGVSTQDVAVRTLSDVPS 665 

Qy 769 APPDNIQIGMYNQTAGWVRWTPPPSQHHNGNLYGYRIEVSAGNTMRVLANMTLNATTTSV 828 

I I I: : : I : : III I II : Ml : : : I I 
Db 666 AAPQNLSLEVRNSRSIVIHWQPPSSTTQNGQITGYKIRYRKASRRSDVTETLVTGTQLSQ 725 

Qy 829 LLNNLTTGAVYSVRLNSFTKAGDGPYSRPIS--LFMDPTHHVHPPRAHPSGTHDGRHEGQ 886 

I: I. .1 l<: I: M III : :| I llll 
Db 726 LIEGLDRGTEYNFRVAALTVNGTGPATDWLSAETFESDLDETRVPEV-PSSLH 777 

Qy 887 DLTYHNNGNIP.PGDINPTTHKKTTDYLSGPWLMVLVCIVLLVLVISAAISMVYFRRKHQM 946 

: I || | :| : :: : 

Db 778 -—VHP LVTSIWSWTPPENQNIV 798 

Qy 947 TK--ELGHLSWSDNEITALNINSKESLWIDHHRGWRTADTDKDSGLSESKLLSHVNSSQ 1004 

: :|: J : ::: :|: : : | : | || : 

Db 799 VRGYAIGY-: GIGSPHAQTIKVDYRQRYYTIENLDPS SHYVITL 840 

Qy 1005 SNYNNSDGG TDYAEVDTRNLTTFYNCRRSPDNPTPYATTMI - - IGTSS 1050 

:|| I 1 II Ml : :l I I I I: :l : 

Db 841 RAFNNVGEGIPLYESAVTRPHTDTSEVDLFVI NAPYTPVPDPTPMMPPVGVQA 893 

Qy 1051 SETCTKTTSIS-ADRDSGTHSPYSDA FAGQVPAVPWKS NYLQYPVEP 1097 

I I I: II I :|: : M |: M ::| 
Db 894 SILSHDTIRITWADNSLPKHQRITDSRYYTVRWRTNIPANTRYKNANATTLSYLVTGLKP 953 

Qy 1098 INWSEF- — : LPPPPEHPPPSSTYGYAQGSPES SR 1128 

II * I II I :| I : I 

Db 954 NTLYEFSVMVTKGRRSSTWSMTAHGATFELVPTSPPRDVTWSKEGKPRTIIVNWQPPSE 1013 

Qy 1129 KSSRSAGSGI-STNQSILNASIHSSSSGGFSAWGVSP QYAVA 1169 

MM' II: :|| II Ml I 

Db 1014 ANGRITGYIIYYSTD-- -VNAEIHD WVIEPWGNRLTHQIQELTLDTPYYFK 1062 
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Qy 1170 CPPENVYSNPLSAVAGGTQNSYQITPTNQHPPQLP--AYFATTGPGGAVP 1217 

I I : :: I I II ::| : I I :| 
Db 1063 IQARN- - SKGMGPMSEAVQFE — TPKADSSDKMPNDQALGSAGKGSRLPDLGSDYKPPM 1117 

Qy 1218 PNHLP FATQRHAASEYQAGLNAARCA 1243 

1:1 I 1:1 : : : 
Db 1118 SGSNSPHGSPTSPLDSNMLLVnVSVGVITIVVVVVIAVFCTRRTTSHQKK 1168 

Qy 1244 QSRACNS CDALATPS PMQP— -PPPVPVPEGWYQPVHPNSH 1281 

: II I 1:1 ::! I I II I: II 

Db 1169 KRAACKSVNGSHKYKGNCKDVKPPDLWIHHERLELKPIDKSPDPNPVMTD--TPIPRNSQ 1226 



Qy 1282 

: I 

Db 1227 DITPV- 



1338 

hi I hi: III I I :: I 
- -DNSMDSNIHQRRNSYRGHESEDSMSTLAGRRGMRPKMMMP 1271 



RESULT 3 
NEO1_HUMAN
ID   NEO1_HUMAN     STANDARD;      PRT;  1461 AA.
AC   Q92859; O00340;
DT   01-OCT-2000 (Rel. 40, Created)
DT   01-OCT-2000 (Rel. 40, Last sequence update)
DT   01-OCT-2000 (Rel. 40, Last annotation update)
DE   NEOGENIN PRECURSOR.
GN   NEO1 OR NGN.
OS   Homo sapiens (Human).
OC   Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC   Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
RN   [1]
RP   SEQUENCE FROM N.A. (ISOFORMS 1 AND 2).
RC   TISSUE=FETAL BRAIN;
RX   MEDLINE=97236653; PubMed=9121761;
RA   Meyerhardt J.A., Look A.T., Bigner S.H., Fearon E.R.;
RT   "Identification and characterization of neogenin, a DCC-related
RT   gene.";
RL   Oncogene 14:1129-1136(1997).
RN   [2]
RP   SEQUENCE FROM N.A. (ISOFORMS 1 AND 2).
RC   TISSUE=FETAL BRAIN;
RX   MEDLINE=97312699; PubMed=9169140;
RA   Vielmetter J., Chen X.-N., Miskevich F., Lane R.P., Yamakawa K.,
RA   Korenberg J.R., Dreyer W.J.;
RT   "Molecular characterization of human neogenin, a DCC-related protein,
RT   and the mapping of its gene (NEO1) to chromosomal position 15q22.3-
RT   q23.";
RL   Genomics 41:414-421(1997).
CC   -!- FUNCTION: MAY BE INVOLVED AS A REGULATORY PROTEIN IN THE
CC       TRANSITION OF UNDIFFERENTIATED PROLIFERATING CELLS TO THEIR
CC       DIFFERENTIATED STATE. MAY ALSO FUNCTION AS A CELL ADHESION
CC       MOLECULE IN A BROAD SPECTRUM OF EMBRYONIC AND ADULT TISSUES.
CC   -!- SUBCELLULAR LOCATION: TYPE I INTEGRAL MEMBRANE PROTEIN.
CC   -!- ALTERNATIVE PRODUCTS: AT LEAST 2 ISOFORMS; 1 (SHOWN HERE) AND 2;
CC       ARE PRODUCED BY ALTERNATIVE SPLICING.
CC   -!- TISSUE SPECIFICITY: WIDELY EXPRESSED AND ALSO IN CANCER CELL
CC       LINES.
CC   -!- SIMILARITY: CONTAINS 4 IMMUNOGLOBULIN-LIKE C2-TYPE DOMAINS.
CC   -!- SIMILARITY: CONTAINS 6 FIBRONECTIN TYPE III-LIKE DOMAINS.
CC   -!- SIMILARITY: BELONGS TO THE IMMUNOGLOBULIN SUPERFAMILY. STRONG, TO
CC       TUMOR SUPPRESSOR PROTEIN DCC.
CC
CC   This SWISS-PROT entry is copyright. It is produced through a collaboration
CC   between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC   the European Bioinformatics Institute. There are no restrictions on its
CC   use by non-profit institutions as long as its content is in no way
CC   modified and this statement is not removed. Usage by and for commercial
CC   entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC   or send an email to license@isb-sib.ch).
CC
DR   EMBL; U61262; AAB17263.1; -.
DR   EMBL; U72391; AAC51287.1; -.
DR   MIM; 601907; -.
DR   HSSP; P02751; 1TTG.
DR   INTERPRO; IPR001777; -.
DR   INTERPRO; IPR003006; -.
DR   PFAM; PF00041; fn3; 6.
DR   PFAM; PF00047; ig; 4.
DR   PRINTS; PR00014; FNTYPEIII.
KW   Transmembrane; Immunoglobulin domain; Glycoprotein; Signal;
KW   Alternative splicing.
FT   DOMAIN             1105       EXTRACELLULAR (POTENTIAL).
FT   TRANSMEM   1106   1126       POTENTIAL.
FT   DOMAIN     1127   1461       CYTOPLASMIC (POTENTIAL).
FT   DOMAIN       67    136       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      166    228       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      263    327       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      355    417       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      436    533       FIBRONECTIN TYPE-III.
FT   DOMAIN      536    629       FIBRONECTIN TYPE-III.
FT   DOMAIN      630    729       FIBRONECTIN TYPE-III.
FT   DOMAIN      735    829       FIBRONECTIN TYPE-III.
FT   DOMAIN      850    950       FIBRONECTIN TYPE-III.
FT   DOMAIN      951   1052       FIBRONECTIN TYPE-III.
FT   DOMAIN     1118   1121       POLY-VAL.
FT   DISULFID     74    129       BY SIMILARITY.
FT   DISULFID    173    221       BY SIMILARITY.
FT   DISULFID    270    320       BY SIMILARITY.
FT   DISULFID    362    410       BY SIMILARITY.
FT   CARBOHYD     73     73       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    210    210       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    326    326       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    470    470       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    489    489       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    639    639       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    715    715       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    909    909       N-LINKED (GLCNAC...) (POTENTIAL).
FT   VARSPLIC   1248   1300       MISSING (IN ISOFORM 2).
FT   CONFLICT    168    168       G -> N (IN REF. 2).
SQ   SEQUENCE   1461 AA;  159958 MW;  7AAE897E69635A21 CRC64;



Query Match 9.6%; Score 710.5; DB 1; Length 1461; 

Best Local Similarity 22.3%; Pred. No. 3.1e-28;

Matches 313; Conservative 184; Mismatches 564; Indels 341; Gaps 48; 

Qy 157 FRVEPKDTRVAKGETALLECGPPKGIPEPTLIWIKDGVPLDDLKAMSFGASSRVRIVDGG 216 

I III II .:! : =1 I I I : I III :: : I :;; I 

Db 55 FLVEPVDTLSVRGSSVILNCS-AYSEPSPKIEWKKDGT FLNLVSDDRRQLLPDG 107 

Qy 217 NLLISNV- — "EPIDEGNYKCIAQ-NLVGTRESSYAKLIVQVKPYFMKEPKDQVMLYG 269 

:| Nil ' :| III hhl :ll I lllll I I :h : I 
Db 108 SLFISNWHSKHNKP-DEGYYQCVATVESLGTIISRTAKLIVAGLPRFTSQPEPSSVYAG 166 

Qy 270 QTATFHCSVGGDPPPKVLWKKEEGNIPVSRARILHDEK SLEISNITPTDEGTY 322 

I :|| 1 II h: :l :l I:: I III I I I I 

Db 167 NGAILNCEVNADLVPFVRWEQ NRQPLLLDDRVIKLPSGMLVISNATEGDGGLY 219 

Qy 323 VCEAHN-NVGOISARASLIVHAPPN FTKRPSNKKVGLNGVVQLPCMASGNPPPS 375 

I : :. I III I 1:11 : Nihil! I h 
Db 220 RCWESGGPPKYSDEVELKVLPDPEVISDLVFLKQPSPLVRVIGQDWLPCVASGLPTPT 279 

Qy 376 VFWTK--EGVSTLMFPNSSHGRQYVAADGTLQITDVRQEDEGYYVCSAFSWDSSTVRVF 433 

: I I I : I I I : I hhhll ::| I I I : h : 
Db 280 IKWMKNEEALDT ESSERLVLLAGGSLEISDVTEDDAGTYFC IADNGNETIE 330 

Qy 434 LQVSSVDERPPPIIQIGPANQTLPKGSVATLPCRATGNPSPRIKWFHDGHAVQAGNRYSI 493 

I : I :: II : I II hi :|| :| I : : I 
Db 331 AQAELTVQAQPEFLK-QPTNIYAHESMDIVFECEVTGKPTPTVKWVKNGDMVIPSDYFKI 389 

Qy 494 IQGSSLRVDDLQLSDSGTYTCTASGERGETSWAATLTVEKPGSTSLHRAADPSTYPAPPG 553 

:: :hl \ 1 1 I I I I : I II:: II hi 
Db 390 VKEHNLQVLGLVXSDEGFYQCIAENDVGNAQAGAQLIILE HAPATTGPLPSAPR 443 

Qy 554 TPKVLNVSRTSISLRWAKSQEKPGAVGPIIGYTVEYFSPDLQTGWIVAAHRVGDTQVTIS 613 



Mon Jan 22 13:04:19 2001 



us-09-540-245a-15.rsp 



Page 5 



II I I I I : !:! I : : |: II 

Db 444 DWASLVSTRFIKLTWRTPASDPH--GDNLTYSVFYTKEGIARERVENTSHPGEMQVTIQ 501 

Qy 614 GLT PGTS YVFLVRAENTQGISVPSGLSNVIKT IEADFDAASANDLSAART LLTGKS VEL • 672 

I I I 1:1 I 1:1 I II |: :| I hi 

Db 502 NLMPATVY IFRVMAQNKHG — SGESSAPLRVE TQPEVQLP 539 

Qy 673 IDASAINASAVRLEWMLHVSADEKYVEGLRIHYKDASVPSAQYHSITVMDASAES 727 

: I I : ::: : I II : : :: :::| : I :| |: I 
Db 540 GPAPNLRAYAASPTSITVTWETPVSGNGE-IQNYKLYYMEKGTDKEQ DVDVSSHS 593 

Qy 728 FWGNLKKYTKYEFFLTPFFETIEGQPSNSKTALTYEDVPSAPPDNIQIGMYNQTAGWVR 787 

: : llllhl I : : : I : I Mill I I: : : I : : 
Db 594 YT I NGLKKYTEYSFRWAYNKHGPGVSTPDVAVRTLSDVPSAAPQNLSLEVRNSKS IMIH 653 

Qy 788 WTPPPSQHHNGNLYGYKIEVSAGNTMKVLANMTLNATTTSVLLNNLTTGAVYSVRLNSFT 847 
III II : INI : : :: I |<|: I I |: |: : | 

Db 654 WQPPAPATQNGQITGYKIRYRKASRKSDVTETLVSGTQLSQLIEGLDRGTEYNFRVAALT 713

Qy 848 KAGDGPYSKPIS--LFMDPTHHVHPPRAHPSGTHDGRHEGQDLTYHNNGNIPPGDINPTT 905

I II : :| I I II I : I 

Db 714 INGTGPATDWLSAETFESDLDETRVPEV-PSSLH VRP" 749 

Qy 906 HKKTTDYLSGPWLMVLVCIVLLVLVISAAISMVYFKRKHQMTK--ELGHLSWSDNEITA 963 

II I :| : :: : : :|: 
Db 750 LVTSIWSWTPPENQNIWRGYAIGY G 776 

Qy 964 LNINSKESLWIDHHRGWRTADTDKDSGLSESKLLSHVNSSQSNYNNSDGG 1013 

: ::: :|: : : | : | || : :|| | 

Db 777 IGS PHAQT IKVDYKQRYYTI ENLDPS SHYVITLKAFNNVGEGIPLYESAVTR 828 

Qy 1014 - -TDYAEVDTRNLTTFYNCRKSPDNPTPYATTMI • ■ IGTSSSETCTKTTSIS ■ ADKDSGT 1068 

II :lll : :l I I I I: :| :| I |: II 

Db 829 PHTDTSEVDLFVI NAPYTPVPDPTPMMPPVGVQASILSHDTIRITWADNSLPK 881 

Qy 1069 HSPYSDA FAGQVPAVPWKS NYLQYPVEPINWSEF 1103 

I :i: : :|l I: :ll ::l II 
Db 882 HQKITDSRYYTVRWKTNIPANTKYKNANATTLSYLVTGLKPNTLYEFSVMVTKGRRSSTW 941 

Qy 1104 LPPPPEHPPPSSTYGYAQGSPES SRKSSKSAGSGI--STNQSIL 1145 

I II I :| I" I : I I I II: : 

Db 942 SMTAHGTTFELVPTSPPKDVTWSKEGKPKTIIVNWQPPSEANGKITGYIIYYSID— V 998 



Qy 1146 NASIH-- 
II II 



t 



-SSSSGGFSAWGVSPQYAV-ACPPEN 1174 

:| I : I: 



Db 999 NAEIHDWVIEPWGNRLTHQIQELTLDTPYYFKIQARNSKGMGPMSEAVQFRTPKADSSD 1058

Qy 1175 VYSNPLSAVAGGTQNRYQITPTNQHPPQLPAYFATTGPGGAVPPNHL 1221

I :: :H :| :: II : I : I I 
Db 1059 KMPNDQASGSGGKGSRLPDLGSDYKPPMSGSNSPHGSPTSPLDSNMLLVIIVSVGVITIV 1118



Qy 1222 PFATQRHAASEYQAGLNAARCAOSRACNSCDA — LATPSPMQPP 1263 

I 1:1 : : : : II I : "11 

Db 1119 VWIIAVFCTRRTTSHQKK KRAACKSVNGSHKYKGNSKDVKPPDLWIHHER 1169 

Qy 1264 PPVPVPEGWYQPVHPNSHPMHPTSSNHQIYQCSSECSDHSRSSQSHKRQLQL 1315 

III 1:11:1 hi I 1:1: 

Db 1170 LELKPIDKSPDPNPIMTDTPIPRNSQDITPV DNSMDSNI HQRRNS Y 1215 

Qy 1316 EEHGSS AKQRGGHHRRRAPV — VQPCMESENENMLAEYEQRQYTSDCCNSSR 1365 

II Ml : I II : : : I ::| : :| 

Db 1216 RGHESEDSMSTMRGMRPKMMMPFDSQPPQPVISAHPIHSLDNPHHHFHSSSLASPAR 1275 

Qy 1366 EGDTCSCSEGSCLYAEAGEPAP 1387 

III III 
Db 1276 SHLY-HPGSPWP 1286 



RESULT 4 
NEO1_CHICK

ID NEO1_CHICK STANDARD; PRT; 1443 AA.
AC Q90610; 

DT 01-OCT-2000 (Rel. 40, Created) 



DT 01-OCT-2000 (Rel. 40, Last sequence update)
DT 01-OCT-2000 (Rel. 40, Last annotation update)
NEOGENIN (FRAGMENT). 
Gallus gallus (Chicken) . 

Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 
Archosauria; Aves; Neognathae; Galliformes; Phasianidae; Phasianinae;
Gallus. 
[1] 

SEQUENCE FROM N.A. 

STRAIN=WHITE LEGHORN; TISSUE=EMBRYONIC BRAIN;

MEDLINE=95105243; PubMed=7806578;

Vielmetter J., Roman J.M., Dreyer W.J.; 

"Neogenin, an avian cell surface protein expressed during terminal 

neuronal differentiation, is closely related to the human tumor 

suppressor molecule deleted in colorectal cancer.";

J. Cell Biol. 127:2009-2020(1994).

-!- FUNCTION: MAY BE INVOLVED AS A REGULATORY PROTEIN IN THE
TRANSITION OF UNDIFFERENTIATED PROLIFERATING CELLS TO THEIR
DIFFERENTIATED STATE. MAY ALSO FUNCTION AS A CELL ADHESION
MOLECULE IN A BROAD SPECTRUM OF EMBRYONIC AND ADULT TISSUES.

-!- SUBCELLULAR LOCATION: TYPE I INTEGRAL MEMBRANE PROTEIN.

-!- DEVELOPMENTAL STAGE: IN RETINA, EXPRESSED ON GANGLION CELL FIBERS
AS SOON AS THEY BEGIN TO EXTEND THEIR AXONS.

-!- SIMILARITY: CONTAINS 4 IMMUNOGLOBULIN-LIKE C2-TYPE DOMAINS.

-!- SIMILARITY: CONTAINS 6 FIBRONECTIN TYPE III-LIKE DOMAINS.

-!- SIMILARITY: BELONGS TO THE IMMUNOGLOBULIN SUPERFAMILY. STRONG , TO
TUMOR SUPPRESSOR PROTEIN DCC.

This SWISS-PROT entry is copyright. It is produced through a collaboration
between the Swiss Institute of Bioinformatics and the EMBL outstation -
the European Bioinformatics Institute. There are no restrictions on its 
use by non-profit institutions as long as its content is in no way 
modified and this statement is not removed. Usage by and for commercial 
entities requires a license agreement (See http://www.isb-sib.ch/announce/ 
or send an email to license@isb-sib.ch). 

EMBL; U07644; AAC59662.1; -. 
HSSP; P80362; 1WTL. 
INTERPRO; IPR001777; -. 
INTERPRO; IPR003006; -. 
PFAM; PF00041; fn3; 6.
PFAM; PF00047; ig; 4. 

Transmembrane; Immunoglobulin domain; Glycoprotein. 



NON_TER        1      1
DOMAIN        <1   1090       EXTRACELLULAR (POTENTIAL).
TRANSMEM    1091   1111       POTENTIAL.
DOMAIN      1112   1443       CYTOPLASMIC (POTENTIAL).
DOMAIN        3?    102       IG-LIKE C2-TYPE DOMAIN.
DOMAIN       132    194       IG-LIKE C2-TYPE DOMAIN.
DOMAIN       229    293       IG-LIKE C2-TYPE DOMAIN.
DOMAIN       321    383       IG-LIKE C2-TYPE DOMAIN.
DOMAIN       422    519       FIBRONECTIN TYPE-III.
DOMAIN       522    615       FIBRONECTIN TYPE-III.
DOMAIN       616    714       FIBRONECTIN TYPE-III.
DOMAIN       720    814       FIBRONECTIN TYPE-III.
DOMAIN       835    935       FIBRONECTIN TYPE-III.
DOMAIN       936   1037       FIBRONECTIN TYPE-III.
DISULFID      40     95       BY SIMILARITY.
DISULFID     139    187       BY SIMILARITY.
DISULFID     236    286       BY SIMILARITY.
DISULFID     328    376       BY SIMILARITY.
CARBOHYD      39     39       N-LINKED (GLCNAC...) (POTENTIAL).
CARBOHYD     176    176       N-LINKED (GLCNAC...) (POTENTIAL).
CARBOHYD     292    292       N-LINKED (GLCNAC...) (POTENTIAL).
CARBOHYD     456    456       N-LINKED (GLCNAC...) (POTENTIAL).
CARBOHYD     475    475       N-LINKED (GLCNAC...) (POTENTIAL).
CARBOHYD     625    625       N-LINKED (GLCNAC...) (POTENTIAL).
CARBOHYD     700    700       N-LINKED (GLCNAC...) (POTENTIAL).
CARBOHYD     894    894       N-LINKED (GLCNAC...) (POTENTIAL).
SEQUENCE 1443 AA; 158050 MW; 558C6795579C0E26 CRC64;

Query Match 9.1%; Score 677; DB 1; Length 1443;
Best Local Similarity 23.1%; Pred. No. 1.5e-26;
Matches 304; Conservative 171; Mismatches 541; Indels 302; Gaps 49;

Qy *157 FRVEPRDTRVARGETALLECG PPRGIPEPTLIWIRDGVPLDDLRAMSFGASSRVR 211 

I III I if: ■■ I III : I III :: : I : 

Db 21 FLVEPMDILSVRGASVIMNCSSYCETPPK IEWRKDGT LLNLVSDDRRQ 68 

Qy 212 IVDGGNLLISNV EPIDEGNYKCIAQ - MLVGTRESSYAKLIVQVKPYFMKEPKDQ 264 

Mil::! :l III M'-l M I III I II :|: 
Db 69 LLPDGSLLINSWHSKHNKP-DEGYYQCVATVESLGSIVSRTMLTVAGLPRFTSQPELS 127 

Qy 265 VMLYGQTATFHCSVGGDPPPKVLWKKEEGNIPVSRARILHDEKSLEISNITPTDEGTYVC 324 

: I :] :| I I I |::: : : :| I I I !l I I I 

Db 128 SVYKGNSAILNCEVNVDLAPFVRWEQDRQPLSLDDRVFKLPSGALLIGNATDTDGGFYRC 187 

Qy 325 EAHN - NVGQISARASLIVHAPPN FTKRPSN- -RRVGLNGWQLPCMASGNPPPS 375 

: : I I I : I I ::||: I I I I Ihl I I I 
Db 188 VIESGGTPKYSEEAELKILPDPEEPQSLVFVRQPSSLTKVTGQNAV- - FPCVAGGFPTPY 245 

Qy 376 VFWTREGVSTLMFPNSSHGRQYVAACGTLQITDVRQEDEGYYVCSAFSVVDSSTVRVFLQ 435 
Mill: : I 1:1 |:|| :|| I I I : |: : I 

Db 246 VRWTKNGEELI — TEDSERFALRAGGSLliISDVTEEDVGTYTC — IADNENETIEAQ 298

Qy 436 VSSVDERPPPIIQIGPMQTLPKGSVATLPCRATGNPSPRIRWFHDGHAVQAGNRYSIIQ 495

i: II :: III : I II |:| :| I :| I : : |:: 

Db 299 AELA^QVPPEFLK-RPANIYAHESMDIVFECEVTGKPTPTVKWVKNGDVVIPSDYFKIVK 357 

Qy 496 GSSLRVDDLQLSDSGTYTCTASGERGETSWAATL TVERPGSTSLHRAAD — 544 

I II I II I : I II : MM: 

Db 358 EHNLQVLGLVKSDEGFYQCIAENDVGNAQAGAQLIILDLDVAIPTLPPTSLTSATNDHLA 417 

Qy 545 PSTYPAPPGTPK- - -VLNVSRTSISLRWAKSQEKPGAVGPIIGYTVEYFSPDLQTGWIVA 601 

MM: || | || | | ; |:: I : : 
Db 418 PATTGPLPTAPRDWATLVSTRFIRLTWRTPVSDP--QGDNLTYSIFYTKEGINRERVEN 475 

Qy 602 AHRVGDTQVTISGLTPGTSYWLVRAENTQGISVPSGLSNVIRTIEADFDAASANDLSAA 661 

I hill II II MM hll I I I M 
Db 476 TSRPGETQVMIQNLMPETVYVFRWAQNKHGHGESSAPLKVATQPEVQLPGPAPN 530 

Qy 662 RTLLTGKSVELIDASAINASAVRLEWMLHVSADEKYVEGLRIHYKDASVPSAQYHSITVM 721 

I II : ::M I =M : :: :::| : I I : 
Db 531 IRAYAGSPTSVTVTWETPLSGNGE-IQNYRLYYMEKGQDSEQ DV 573 

Qy 722 DASAESFWGNLRRYTKYEFFLTPFFETIEGQPSNSRTALTYEDVPSAPPDNIQIGMYNQ 781 

Ml:: HUM M : : I : I Mil I |: : I 
Db 574 DVAGLSYTITGLKKYTEYSFRWAYNKHGPGVSTQDVWRTLSDVPSAAPQNLTLEARNS 633 



f 



782 TAGWVRWPPPSQHHNGNLYGYKI---EVSAGNTMKVLWWTLNATTTSVLLNNLTTGAV 838 

: : I III: |:| : Mil :ll I :: I I: I I 
634 KS IMLHWQPPPAGTHSGQITG YKIRYRRVS — RRSDVTESVGGTQLFQLIEGLERGTE 689 

839 YSVRLNSFTKAGDGPYSKPIS--LFMDPTHHVHPPRAHPSGTHDGRHEGQDLTYHNNGNI 896 

I: I: : I I IM :| I I II I 
690 YNFRIAAMTVNGTGPATDWVSAETFESDLDESRVPEV-PSSLH 731 

897 PPGDINPTTHKKTTDYLSGPWLMVLVCIVLLVLVISAAISMVYFRRKHQMTR-ELGHLS 954 

: I II I :] : :: : : :|: 

732 — -VRP LVTSIWSWTPPENQNIWRGYAIGY" 760 

955 WSDNEITALNINSKESLWIDHHRGWRTADTDKDSGLSESKLLSHVNSSQSNYNNSDGG- 1013 

: ::: :|: : : M I II : :|| I 

761 G IGS PHAQT IKVDYKQRYYT IENLDPS SHYVITLRAFNNVGEGI 804 

1014 TDYAEVDTRNLTTFYNCRKS PDNPT PYATTMI - - IGTSSSETCTKTTSI 1060 

:| :HI : :| I | : |: :| : II 

805 PLYESAVTRPHSDTSEVDLFVI NAPYTPVPDPSPMMPPVGVQASILSHDTIRI 857 

1061 S-ADKDSGTHSPYSDA FAGQVPAVPWKS NYLQYPVEPINWSEF — 1103 

: II : M : :|l I: :|l ::| II 
858 TWADNSLPRNQRnDARYYTVRWRTNIPANTKYRTANATTLSYLVTGLKPNTLYEFSVMV 917 



--LPPPPEHPPPSSTYGYAQGSPES" 

I II I ill: 



■-SRKSSKSAGSGI 1138 

I : I I I 



Db 918 TKGRRSSTWSMTAHGTTFELVPTSPPKDVTWSKEGKPRTIIVNWQPPSEANGKITGYII 977 

Qy 1139 "STNQSILNASIHSSSSGGFSAWGVSPQYAVACPPENVYSNPLSAVAGGTQNRYQITPT 1196 

II: Ml Ml III: :M 
Db 978 YYSTD — VNAEIHD WVIEP WGNRLT HQIQEL 1007 

Qy 1197 NQHPPQLPAYFATTGPGGAVPPNHLPFATQRHAASEYQAGLNAARCAQSRACNSCDALAT 1256 

I I I MM :|: : :| | : 

Db 1008 TLDTPYYFKIQARNSKGMGPMSEAVQFRTPKAESSD KMPNDQASGSAGKGSR 1059 

Qy 1257 PSPMQP- - PPPVPVPEGWYQPVHPNSHPMHPTSSNHQIYQCSSE CS 1300 

I : I M I: I I II : I h 

Db 1060 PVDVGPDYRPPLSGSNS PHGSPTSPLDSNMLLVIIVSVGVITIVIWIVAVFCT 1113 

Qy 1301 DHSRSSQSHKRQLQLEEHGSSARQRGG HHRR RAPWQPCM 1340 

Mill Ml I :| II I ::| I I 

Db 1114 RRTTSHQRRRRAACRSVNGSH-RYKGNSRDVRPPDLWIHHERLELRPIDRSPDPNPIM 1170 



RESULT 5 

DSCA_HUMAN

ID DSCA_HUMAN STANDARD; PRT; 2012 AA.

AC O60469; O60468;

DT 01-OCT-2000 (Rel. 40, Created)

DT 01-OCT-2000 (Rel. 40, Last sequence update)

DT 01-OCT-2000 (Rel. 40, Last annotation update)

DE DOWN SYNDROME CELL ADHESION MOLECULE PRECURSOR (CHD2).

GN DSCAM.

OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.

RN [1] 

RP SEQUENCE FROM N.A., AND ALTERNATIVE SPLICING. 

RC TISSUE=BRAIN;

RX MEDLINE=98087574; PubMed=9426258;

RA Yamakawa K., Huot Y.-K., Haendelt M.A., Hubert R., Chen X.-N.,

RA Lyons G.E., Korenberg J.R.;

RT "DSCAM: a novel member of the immunoglobulin superfamily maps in a

RT Down syndrome region and is involved in the development of the

RT nervous system.";

RL Hum. Mol. Genet. 7:227-237(1998). 

RN [2] 

RP SEQUENCE FROM N.A.

RX MEDLINE=20289799; PubMed=10830953;

RA Hattori M., Fujiyama A., Taylor T.D., Watanabe H., Yada T.,

RA Park H.-S., Toyoda A., Ishii K., Totoki Y., Choi D.-K., Soeda E.,

RA Ohki M., Takagi T., Sakaki Y., Taudien S., Blechschmidt K., Polley A.,

RA Menzel U., Delabar J., Kumpf K., Lehmann R., Patterson D.,

RA Reichwald K., Rump A., Schillhabel M., Schudy A., Zimmermann W.,

RA Rosenthal A., Kudoh J., Shibuya K., Kawasaki K., Asakawa S.,

RA Shintani A., Sasaki T., Nagamine K., Mitsuyama S., Antonarakis S.E.,

RA Minoshima S., Shimizu N., Nordsiek G., Hornischer K., Brandt P.,

RA Scharfe M., Schoen O., Desario A., Reichelt J., Kauer G., Bloecker H.,

RA Ramser J., Beck A., Klages S., Hennig S., Riesselmann L., Dagand E.,

RA Wehrmeyer S., Borzym K., Gardiner K., Nizetic D., Francis F.,

RA Lehrach H., Reinhardt R., Yaspo M.-L.;

RT "The DNA sequence of human chromosome 21.";

RL Nature 405:311-319(2000).

RN [3] 

RP FUNCTION. 

RA Agarwala K.L., Nakamura S., Tsutsumi Y., Yamakawa K.;

RT "Down syndrome cell adhesion molecule DSCAM mediates homophilic 

RT intercellular adhesion."; 

RL Brain Res. Mol. Brain Res. 79:118-126(2000). 

CC -!- FUNCTION: CELL ADHESION MOLECULE THAT CAN MEDIATE CATION-

CC INDEPENDENT HOMOPHILIC BINDING ACTIVITY. COULD BE INVOLVED IN

CC NERVOUS SYSTEM DEVELOPMENT.

CC -!- SUBCELLULAR LOCATION: TYPE I MEMBRANE PROTEIN (PROBABLE). THE
CC SHORT ISOFORM MAY BE SECRETED.

CC -!- ALTERNATIVE PRODUCTS: 2 ISOFORMS; A LONG FORM/CHD2-52 (SHOWN HERE)
CC AND A SHORT FORM/CHD2-42; ARE PRODUCED BY ALTERNATIVE SPLICING.

CC -!- TISSUE SPECIFICITY: PRIMARILY EXPRESSED IN BRAIN.

CC -!- SIMILARITY: CONTAINS 10 IMMUNOGLOBULIN-LIKE C2-TYPE DOMAINS.

CC -!- SIMILARITY: CONTAINS 6 FIBRONECTIN TYPE III-LIKE DOMAINS.


FT 


ISOFORM) . 


cc 










FT VARSPLIC 1572 2012 MISSING (IN SHORT ISOFORM).


cc 


This SWISS-PROT entry is copyright. It is produced through a collaboration 


FT CONFLICT 1893 2012 HRPGDLIHLPPYLRMDFLLNRGGPGTSRDLSLGQACLEPQK


CC between the Swiss Institute of Bioinformatics and the EMBL outstation -


FT 


SRTLKRPTVLEPIPMEAASSASSTREGQSWQPGAVATLPQR 


CC the European Bioinformatics Institute. There are no restrictions on its


FT 


EGAELGQAAKMSSSQESLLDSRGHLKGNNPYAKSYTLV -> 


CC use by non-profit institutions as long as its content is in no way


FT 


IGQVTS Y ICLHTLEWTFC (IN REF. 1). 


CC modified and this statement is not removed. Usage by and for commercial


SQ SEQUENCE 2012 AA; 222259 MW; 0E33CFB781A08334 CRC64;


CC entities requires a license agreement (See http://www.isb-sib.ch/announce/






CC or send an email to license@isb-sib.ch).






cc 










Query Match 8.9%; Score 661; DB 1; Length 2012; 


DR 


EMBL; AF023450; AAC17967.1; 




Best Local Similarity 24.2%; Pred. No. 1.4e-25;


DR 


EMBL; AF023449; AAC17966.1; 




Matches 305; Conservative 175; Mismatches 473; Indels 308; Gaps 


DR 


EMBL; AL163283; CAB90464.1; 








DR 


EMBL; AL163282; CAB90436.1; 




Qy 


55 SPRIIEHPTDLyVKKNEPATLNCKVEGKPEPTIEWFKDGEPVSTNEKKSHRVQ---FKDG 111 


DR 


EMBL; AL163281; CAB90444.1; 






: 1 : 1 1 : 1 1 1 1 : M 1 : II 1 1 1 1 1 : 1 : : III: : 1 


DR 


MIM; 602523; -. 






Db 


406 TPKIISAFSEKWSPAEPVSLMCNVKGTPLPTITWTLDDDPIL--KGGSHRISQMITSEG 453 


DR 


INTERPRO 


IPR001777; -. 








1 


INTERPRO; IPR003006; -. 




Qy 


112 ALFFYRTMQGKKEQDGGEYWCVAKNRVGQAVSRHASLQIAVLRDDFRVEP-KDTRVAKGE 170 


1 


PFAM; PF00041; fn3; 6. 






: 1 : ■ ' :| 1 1 1 1 1 1 1 1 : : 1 : 1 |: 1 




PFAM; PF00047; ig; 9.




Db 


464 NWSYLNISSSQVRDGGVYRCTANNSAG-WLYQARINV---RGPASIRPMKNITAIAGR 519 


DR 


PRINTS; PR00014; FNTYPEIII. 








KW 


Immunoglobulin domain; Glycoprotein; Signal; Cell adhesion; Repeat; 


Qy 


171 TALLECGPPK3IPEPTLIWIKDG--VPLDDLKAMSFGASSRVRIVDGGNLLISNVE-PID 227 


KW 


Transmembrane; Alternative splicing.




: 1 | | :: | |: :| : :| : | | :|:|: :| 


FT 


SIGNAL 


1 


17 


POTENTIAL. 


Db 


520 DTYIHC-RVIGYPYYSIKWYKNSNLLPFN HRQVAFENNGTLKLSDVQKEVD 569 


FT 


CHAIN 


18 


2012


DOWN SYNDROME CELL ADHESION MOLECULE.






FT 


DOMAIN 


18 


1595 


EXTRACELLULAR (POTENTIAL) . 


Qy 


228 EGNYKC--IAQNLVGTRESSYAKLIVQVKPYF--MKEPKDQVMLYGQTATFHC-SVGGDP 282


FT 


TRANSMEM 


1596 


1616 


POTENTIAL. 




II 1 1 : 1 : 1 :| : : 1:1 I: : |: : || Mil 


FT 


DOMAIN 


1617 


2012 


CYTOPLASMIC (POTENTIAL). 


Db 


570 EGEYTCNVLVQPQLSTSQSVH- -VTVKVPPFIQPFEFPRFSI- - -GQRVFIPCVWSGDL 624 


FT 


DOMAIN 


39 


109 


IG-LIKE C2-TYPE DOMAIN. 






FT 


DOMAIN 


138 


204 


IG-LIKE C2-TYPE DOMAIN. 


Qy 


283 PPKVLWKKEEGNIPVSRARILHD- ■ -EKSLEISNITPTDEGTYVCEAHNNVGQISARASL 339 


FT 


DOMAIN 


239 


300 


IG-LIKE C2-TYPE DOMAIN. 


1 : hi: II 1 : : II III:: Mill : :: 1 


FT 


DOMAIN 


328 


392 


IG-LIKE C2-TYPE DOMAIN. 


Db 


625 PITITWQKDGRPIPGSLGVTIDNIDFTSSLRISNLSLMHNGNYTCIARNEAAAVEHQSQL 684 


FT 


DOMAIN 


421 


491 


IG-LIKE C2-TYPE DOMAIN. 






FT 


DOMAIN 


518 


582 


IG-LIKE C2-TYPE DOMAIN.


Qy 


340 IVHAPPNFTKRPSNKKVGLNG-WQLPCMASGNPPPSVFWT-KEGVSTLMF-PNSSHGRQ 396 


FT 


DOMAIN 


610 


676 


IG-LIKE C2-TYPE DOMAIN.




II II 1 :J hi 1 1 1 1 1 1 h: 1 :l 1 1 : :|| 


FT 


DOMAIN 


704 


773 


IG-LIKE C2-TYPE DOMAIN. 


Db 


685 IVRVPPKFWQPRDQD-GIYGKAVILNCSAEGYPVPTIVWKFSKGAGVPQFQPIALNGRI 743 


FT 


DOMAIN 


802 


872 


IG-LIKE C2-TYPE DOMAIN.






FT 


DOMAIN




972


FIBRONECTIN TYPE-III.


Qy 


397 YVAADGTLQITDVRQEDEGYYVCSAFSWDSSTVR-VFLQVSSVDERPPPIIQIGPANQT 455 


FT 


DOMAIN 


984 


1076 


FIBRONECTIN TYPE-III.




i . . i . i i i . 1 1 1 1 1 . i . i . ... i i .i.i t 
1 ::hl 1 • 1 :ll llhl : 1 : : ::| 1 : 1 :| 1 


FT 


DOMAIN







FIBRONECTIN TYPE-III.


Db 


744 QVLSNGSLLIKHWEEDSGYYLCKVSNDVGADVSKSMYLTV KIPAMITSYPNTTL 798 


FT 


DOMAIN


1189


1273


FIBRONECTIN TYPE-III.






FT 


DOMAIN 


1300 


1366


IG-LIKE C2-TYPE DOMAIN. 


Qy 


456 LPKGSVATLP<»RATGNPSPRIKWFHDGHAVQAG-NRYSIIQG SSLRVDDLQLS 507 


FT 


DOMAIN 


1380 


1463


FIBRONECTIN TYPE-III. 




: I : | | | : : | : : 1 1 : | : | ; : 


FT 


DOMAIN 


1477 


1562 


FIBRONECTIN TYPE-III. 


| Db 


/yj AlUoUMfiMb'.IMWiKrIIVKWbMDKllNra oDo 


FT 


DISULFID


46 


102 


BY SIMILARITY. 


1 






DISULFID 


145 


197 


BY SIMILARITY. 


i Qy 


5UB UouinLiAabbMMbWMlLlVblUrbMoLH^ 30/ 


1 


DISULFID


246 


293 


BY SIMILARITY. 




1 1 1 • ■ 1 1 II 1 1 1 • • I III • • • 1 • 1 • 1 
1 1 1 • • 1 1 1 1 III") III • ■ ■ 1 ■ 1 • 1 




DISULFID


335 


385 


BY SIMILARITY. 


Db 


0J3 USur roLnftlJioIUiUKbliyijl VU.&r rUrrtil MMJVnnnlllL jUj 




DISULFID


428


484


BY SIMILARITY.






™ 


DISULFID


525 


575 


BY SIMILARITY. 


uy 


SfiP BWAKQOPIfPfi&VRPTTfiVTVPVI'QPnT/ITrWTVaAHBVnr) TOVTKPTTPrT^YV 07 


FT 


DISULFID


617 


669 


BY SIMILARITY. 




II • ' , 1 1 1 1 • 1 ■ 1 ■ 1 1 1 1 1 ■ 1 • ■ 1 
II ■ ' 1 1 1 1 > 1 > 1 • 1 1 1 1 1 ■ 1 ■ > 1 


FT 


DISULFID


711 


766 


BY SIMILARITY. 


Db 


904 RWTMGFD — GUSPITGYDIE" -CKNKSDSW-DSAQRTKDVSPQLNSATIIDIHPSSTYS 957 


FT 


DISULFID


809 


865 


BY SIMILARITY. 








DISULFID


1307 


1359 


BY SIMILARITY. 


Qy 


623 FLVRAENTQGISVPSGLSNVIKTIEADFDAASANDLSAARTLLTGKSVELIDASAINASA 682 


FT 


CARBOHYD 


28 


28 


N-LINKED (GLCNAC. . .) (POTENTIAL). 




: hi II 11:11 II :|| 1 1 : h: : 




CARBOHYD 


78 


78 


N-LINKED (GLCNAC. . .) (POTENTIAL). 


Db 


958 IRMYAKNRIGKSEP— SNEL-TITAD-EAAP DGPPQE'VHLEPISSQS 1000 




CARBOHYD 


470 


470 


N-LINKED (GLCNAC. . .) (POTENTIAL), 








CARBOHYD 


487 


487 


N-LINKED (GLCNAC. . .) (POTENTIAL) . 


Qy 


683 VRLEWMLHVSADEKY VEGLRIHYKDASVPSAQYHS ITVMDASAES - - FWGNLKK 735 


FT


CARBOHYD 


512 


512 


N-LINKED (GLCNAC. . .) (POTENTIAL). 




:|: 1 . I':|: : | :| |:: | :| :| 1 :| : : II 1 


FT 


CARBOHYD 


556 


556 


N-LINKED (GLCNAC. . .) (POTENTIAL). 


Db 


1001 IRVTW — K&PKKHLQNGIIRGYQIGYREYSTGGNFQFNIISVDTSGDSEVYTLDNLNK 1056 




CARBOHYD 


658 


658 


N-LINKED (GLCNAC. . .) (POTENTIAL). 






FT


CARBOHYD 


666 


666 


N-LINKED (GLCNAC. . .) (POTENTIAL). 


Qy 


736 YTKYEFFLTPFFETIEGQPSNSKTALTYEDVPSAPPDNIQIGMYNQTAGWVRWTPPPSQH 795 


FT


CARBOHYD 


710 


710 


N-LINKED (GLCNAC. . .) (POTENTIAL). 




:hl : III MM! Ihhl : : : |: : 


FT 


CARBOHYD 


748 


748 


N-LINKED (GLCNAC. . .) (POTENTIAL). 


Db 


1057 FTQYGLWQAiNRAGTGPSSQEIITTTLEDVPSYPPENVQAIATSPESISISWSTLSKEA 1116 


FT 


CARBOHYD 


795 


795 


N-LINKED (GLCNAC. . .) (POTENTIAL).






FT 


CARBOHYD 


924 


924 


N-LINKED (GLCNAC. . .) (POTENTIAL).


Qy 


796 HNGNLYGYKISVSAGNTMKVLANMTLNATTT-SVLLNNLTTGAVYSVRLNSFTKAGDGP 853 


FT 


CARBOHYD 


1142 


1142 


N-LINKED (GLCNAC. . .) (POTENTIAL). 




■ II 1 II 1 III h h 1 II::: :lhllll 


FT 


CARBOHYD 


1160 


1160 


N-LINKED (GLCNAC. . .) (POTENTIAL). 


Db 


1117 LNGILQGFRV-IYWANLMDGELGEIKNITTTQPSLELDGLEKYTNYSIQVLAFTRAGDGV 1175 


FT 


CARBOHYD 


1250 


1250 


N-LINKED (GLCNAC. . .) (POTENTIAL). 






FT 


CARBOHYD 


1271 


1271 


N-LINKED (GLCNAC. . .) (POTENTIAL).


Qy 


854 YSKPISLFMDPIHHVHPPRAHPSGTHDGRHEGQDLTYHNNGNIPPGDINPTTHKKTTDYL 913 


FT 


CARBOHYD 


1341 


1341 


N-LINKED (GLCNAC. . .) (POTENTIAL). 




h i i : 1 1 hi 


FT 


CARBOHYD 


1488 


1488 


N-LINKED (GLCNAC. . .) (POTENTIAL). 


Db 


1176 RSEQI - • FTRTKEDVPGP - - - PAG 1194 


FT 


VARSPLIC 


1562 


1571 


NFATLNYDGS -> KEAARCKEFS (IN SHORT







Qy 914 SGPWLMVLVCIVLLVLVISAAISMVYFR RKHQMTKELGHLSWSDNE-- 960

I :|: III: II: : : :|:|: I 

Db 1195 VKAAMSASMVFVSWLPPLKLNGIIRKYTVFCSHPYPTVISEFEAS 1240 

Qy 961 ITALNINSKESLWIDHHRGWRTADTDKDSGLSESKLLSHVNSSQSNYNNSDGG 1013

I I: I : hi: I I I |:|; 
Db 1241 PDSFSYRIPNLSRNRQYSVWV VAVTSAGRG NSSEII 1276 

Qy 1014 T--DYAEVDTRNLT TFYNCRKSPDNPTPYATTM- - IIGTSSSETCTKT 1057 

I I: I II II: :|:| I II I 

Db 1277 TVEPLAKAPARILTFSGTVTTPWMKDIVLPC-KAVGDPSPAVKWMKDSNGTPSLVTIDGR 1335 

Qy 1058 TSISAD KDSGTHSPYSDAFAGQVPAVPTOSNYLQYPVEPINWSE 1102

II :: :lll fl : :| II 
Db 1336 RSIFSNGSFIIRTVKAEDSGYYS CI ANN NWGSDEIIL 1372 

Qy 1103 -FLPPPPEHP PPSSTYGYAQGSPESSRKSSKSAGSGISTNQSILNASIHSS 1152

: II: I I I : I I 

Db 1373 NLQVQVPPDQPRLTVSKTTSSSITLSWLPGD NGGSSIRGYILQYSEDNS 1421 

Qy 1153 SSGGFSAWGVSPQYAVACPPENVYSNPLSAVAGGTQNRYQITPTNQHPPQLPAYFATTGP 1212
ft III I I I : II :: :| I ||
Db 1422 EQ WGSFP — ISPSERSYR- -LENLKCGTWYKFTLTAQN GVGP 1459

Qy 1213 G 1213

I 

Db 1460 G 1460 



RESULT 6
AXO1_RAT

ID AXO1_RAT STANDARD; PRT; 1040 AA.
AC P22063;
DT 01-AUG-1991 (Rel. 19, Created)
DT 01-AUG-1991 (Rel. 19, Last sequence update)
DT 15-JUL-1999 (Rel. 38, Last annotation update)
DE AXONIN-1 PRECURSOR (AXONAL GLYCOPROTEIN TAG-1)
GN TAX1.
OS Rattus norvegicus (Rat).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Rattus.
RN [1]



RP SEQUENCE FROM N.A., AND SEQUENCE OF 31-41.
RC TISSUE=SPINAL CORD;
RX MEDLINE=90199890; PubMed=2817872;
RA Furley A.J., Morton S.B., Manalo D., Karagogeos D., Dodd J.,
RA Jessell T.M.;
RT "The axonal glycoprotein TAG-1 is an immunoglobulin superfamily
RT member with neurite outgrowth-promoting activity.";
RL Cell 61:157-170(1990).
CC -!- FUNCTION: MAY PLAY A ROLE IN THE INITIAL GROWTH AND GUIDANCE OF
CC AXONS. MAY BE INVOLVED IN CELL ADHESION.
CC -!- SUBCELLULAR LOCATION: ATTACHED TO THE NEURONAL MEMBRANE BY A
CC GPI-ANCHOR AND IS ALSO RELEASED FROM NEURONS.
CC -!- TISSUE SPECIFICITY: IN NEURAL TISSUES IN EMBRYOS, AND IN ADULT
CC BRAIN, SPINAL CORD AND CEREBELLUM.
CC -!- DEVELOPMENTAL STAGE: TRANSIENTLY EXPRESSED ON A SUBSET OF AXONS
CC IN THE DEVELOPING RAT NERVOUS SYSTEM.
CC -!- SIMILARITY: CONTAINS 6 IMMUNOGLOBULIN-LIKE C2-TYPE DOMAINS.
CC -!- SIMILARITY: CONTAINS 4 FIBRONECTIN TYPE III-LIKE DOMAINS.
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
DR EMBL; M31725; AAA42201.1; -.
DR PIR; A34695; A34695.
DR INTERPRO; IPR001777; -.
DR INTERPRO; IPR003006; -.



DR PFAM; PF00041; fn3; 4.
DR PFAM; PF00047; ig; 6.
KW Immunoglobulin domain; Glycoprotein; Signal; GPI-anchor;
KW Cell adhesion; Repeat.




FT   SIGNAL         1     30
FT   CHAIN         31  ?1015       AXONIN-1.
FT   PROPEP     ?1016   1040       REMOVED IN MATURE FORM.
FT   DOMAIN        56    120       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN       150    218       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN       256    315       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN       343    404       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN       435    497       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN       525    596       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN       608    614       GLY/PRO-RICH.
FT   DOMAIN       613    708       FIBRONECTIN TYPE-III.
FT   DOMAIN       716    811       FIBRONECTIN TYPE-III.
FT   DOMAIN       818    910       FIBRONECTIN TYPE-III.
FT   DOMAIN       911   1005       FIBRONECTIN TYPE-III.
FT   SITE         796    798       CELL ATTACHMENT SITE (POTENTIAL).
FT   CARBOHYD      78     78       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD     200    200       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD     206    206       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD     463    463       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD     479    479       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD     500    500       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD     527    527       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD     777    777       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD     832    832       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD     920    920       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD     942    942       N-LINKED (GLCNAC...) (POTENTIAL).
SQ   SEQUENCE 1040 AA; 113042 MW; 6E707EF6614CB4FB CRC64;

Query Match 8.9%; Score 658; DB 1; Length 1040;
Best Local Similarity 24.1%; Pred. No. 8.6e-26;



Matches 


oy 

Db 


22 
5 


Qy 


75 


Db 


61 


oy 


135 


Db 


116 


Qy 


191 


Db 


176 


Qy 


247 


Db 


227 


Qy 


297 


Db 


284 


Qy 


357 


Db 


341 


Qy 


417 


Db 


395 


Qy 


475 


Db 


451 



:| -I : I : I : II 



■ -SNGLPAVRGQYQSPRIIEHPTDLWKK - - -NEPAT 74 
: I II I I III:: : I 



Ml: | : | | | | I I II I I hi 



I II II: I 



:| : I I 1 1 I 



III I: I III hi : : |: I : 
-QTTGNLYIARTNASDLGNYSCLATSHMDFSTKSVFSKF 226 



"QVKPYFMKEPKDQVMLYGQTATFHCSVGGDPPPKVLWKKEEGNIP 296 
:| I I : I II I I hi I:: hi :|:: 



h 1:1 I I I I : I 



I :| I I I I I: 



II I 



I I I 



:| :: h I 
-ARGGEISILCQPRAAPKA 450 



II I IN I I: 
Qy 533 KPGSTSL HRAADPS— TY 548

II: I : II: I: 

Db 511 TKITIAPSSADINVGDNLTI^CHASHDPTMDLTFTWTLDDFPIDFDKPGGHYRRASAKET 570 

Qy 549 * PAPPGTPKVLNVSRTSISLRW 569 

I III I :: I:: I I 
Db 571 IGDLT I LNAHVRHGGKYTCMAQTWDGTSKEATVLVRGPPGPPGGVWRDIGDTTVQLSW 630 

Qy 570 AKSQEKPGAVGPIIGYTVEYPSPDLQTGWIVAAHRV — GDTQVT - ISGLTPGTSYVFLV 625 

:: : II II:: :| I : I |: : Mil III 

Db 631 SRGFDNH* ■ -SPIAKYTLQARTPPSGKWKQVRTNPVNIEGNAETAQVLGLMPWMDYEFRV 687 

Qy 626 RAENTQGISVPSGLSNVIKTIEADFDAASANDLSAARTLLTGKSVELIDASAINASAVRL 685 

III III I: hi II : : : II I III : 
Db 688 SASNILGTGEPSGPSSKIRTREA-VPSVAPSGLSGG— -GGAPGELI I 731 

Qy 686 EWMLHVSADEKrVEGLRIHYRDASVPSAQYHSITVMDASAESFWGN-LKKYTKYEFFL 743 

• I II : : :| I : : : I I 1 : 1 1 1 1 : : II :| : 

732 NW-TPVSREYQNGDGFGYLLSFRRQGSSSWQTARVPGADAQYFVYGNDSIQPYTPFEVKI 790 

Qy 744 TPFFETIEGQPSNSKTALTY"EDVPSAPPDNIQIGfflNQTAGWVRWTPPPSQHHNGNLY 801 

: I I III I I: I I : : > I I • I I II I 
Db 791 RSY-NRRGDGPESLTALVYSAEEEPRVAPAKVWAKGSSSSEMNVSM-EPVLQDMNGILL 847 

Qy 802 GYKIEV-SAGNTMKVLANMTLNATTTSVLLNNLTTGAVYSVRLNSFTKAGDGPYSKPISL 860 

11:1 Ih : II : I I I : :: :|| II I 
Db 848 GYEIRYWKAGDNEAAADRVRTAGLDTSARVTGLNPNTKYHVTVRAYNRAGTGPASPSADA 907 

Qy 861 FMDPTHHVHPPRAHPSGT HDGRHEGQDLTY HNNGNI PPG 899 

Mill :: I : I I 
Db 908 MT VKPPPRRPPGNISWTFSSSSLSLKWDPWPLRNESTVTGYKMLYQN 955 

Qy 900 DI NPTT HKKTTDYLSG PWLMVLV 922 

|::|| I II::! 
Db 956 DLHPT • - - PTLHLTSKNWIEI PV 975 



RESULT 7 
AXO1_HUMAN

ID AXO1_HUMAN STANDARD; PRT; 1040 AA.

AC Q02246;

DT 01-JUL-1993 (Rel. 26, Created)

DT 01-JUL-1993 (Rel. 26, Last sequence update)

DT 15-JUL-1999 (Rel. 38, Last annotation update)

DE AXONIN-1 PRECURSOR (AXONAL GLYCOPROTEIN TAG-1) (TRANSIENT AXONAL
DE GLYCOPROTEIN 1).
GN TAX1 OR TAG1.
OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
RN [1] 

RP SEQUENCE FROM N.A. 

RC TISSUE=BRAIN;

RX MEDLINE=93145965; PubMed=8425542;

RA Hasler T.H., Rader C., Stoeckli E.T., Zuellig R.A., Sonderegger P.;

RT "cDNA cloning, structural features, and eucaryotic expression of

RT human TAG-1/axonin-1.";

RL Eur. J. Biochem. 211:329-339(1993). 

RN [2] 

RP SEQUENCE FROM N.A. 

RC TISSUE=BRAIN;

RX MEDLINE=94140354; PubMed=8307567;

RA Tsiotra C.P., Karagogeos D., Theodorakis K., Michaelidis M.T.,

RA Modi W.S., Furley J.A., Jessel M.T., Papamatheakis J.;

RT "Isolation of the cDNA and chromosomal localization of the gene

RT (TAXI) encoding the human axonal glycoprotein TAG-1."; 

RL Genomics 18:562-567(1993). 

CC -!- FUNCTION: MAY PLAY A ROLE IN THE INITIAL GROWTH AND GUIDANCE OF

CC AXONS. MAY BE INVOLVED IN CELL ADHESION.

CC -!- SUBCELLULAR LOCATION: ATTACHED TO THE NEURONAL MEMBRANE BY A

CC GPI-ANCHOR AND IS ALSO RELEASED FROM NEURONS.

CC -!- SIMILARITY: CONTAINS 6 IMMUNOGLOBULIN-LIKE C2-TYPE DOMAINS.

CC -!- SIMILARITY: CONTAINS 4 FIBRONECTIN TYPE III-LIKE DOMAINS.

CC

CC This SWISS-PROT entry is copyright. It is produced through a collaboration

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; X68274; CAA48335.1; -. 

DR EMBL; X67734; CAA47963.1; -.

DR PIR; S28830; S23830.

DR MIM; 190197; -.

DR INTERPRO; IPR001777; -.

DR INTERPRO; IPR003006; -.

DR PFAM; PF00041; fn3; 4.

DR PFAM; PF00047; ig; 6.

KW Immunoglobulin domain; Glycoprotein; Signal; GPI-anchor;

KW Cell adhesion; Repeat.



FT   SIGNAL         1     28
FT   CHAIN         29   1012       AXONIN-1.
FT   PROPEP      1013   1040       REMOVED IN MATURE FORM (POTENTIAL).
FT   DOMAIN        54    118       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN       148    216       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN       254    313       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN       341    402       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN       433    495       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN       523    594       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN       606    612       GLY/PRO-RICH.
FT   DOMAIN       611    706       FIBRONECTIN TYPE-III.
FT   DOMAIN       714    809       FIBRONECTIN TYPE-III.
FT   DOMAIN       816    908       FIBRONECTIN TYPE-III.
FT   DOMAIN       917   1003       FIBRONECTIN TYPE-III.
FT   SITE         794    796       CELL ATTACHMENT SITE (BY SIMILARITY).
FT   CARBOHYD      76     76       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD     198    198       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD     204    204       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD     461    461       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD     477    477       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD     498    498       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD     525    525       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD     830    830       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD     918    918       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD     940    940       N-LINKED (GLCNAC...) (POTENTIAL).
FT   LIPID       1012   1012       GPI-ANCHOR (POTENTIAL).
SQ   SEQUENCE 1040 AA; 113393 MW; 254E78DD3C28EFB6 CRC64;

Query Match 8.7%; Score 644.5; DB 1; Length 1040;
Best Local Similarity 23.7%; Pred. No. 4.1e-25;



Matches 


Qy 


36 


Db 


11 


Qy 


87 


Db 


71 


Qy 


147 


Db 


126 


Qy 


203 


Db 


186 


Qy 


252 


Db 


237 


Qy 


309 



I :| 



I: I:: 



I I I I I I II I I 1:1 I II III I 



:| :l I II I : h 



• III I: I III |:| : : I: I :|:| : 
■ -QTTGNLYIARTNASDLGNYSCLATSHMDFSTKSVFSKFAQLNLAAEDTRL 236 



I : Mill 1:1 I:: |:| :|:: 



309 LEISNITPTDEGTYVCEAHNNVGQISARASLIVHAPPNFTKRPSNKKVGLNGWQLPCMA 368 
hi ::: Hill 1 1 1 I : I : : : : 1 1 I I : I I : : : :: I I
291 UJIPSVSFEDEGTYECEAENSKGRDTVQGRIIVQAQPEWLKVISDTEADIGSNLRWGCAA 350 

369 SGNPPPSVFWTKEGVSTLMFPNSSHGRQYVAAD6TLQITDVRQEDEGYTVCSAPSVVDSS 428 

:| I hi I : I M I I I I I: : : III I I :: 

351 AGKPRPTVRWLRNGE PLASQNRVEVLA-GDLRFSKLSLEDSGMYQC— -VAENK 400 

429 TVRVFLQVSSVDERPPPI IQIGPANQTLP - -KGSVATLPCRATGNPSPRIKWFHDGHAVQ 486 

" : I :: | : :| :| :||: I : I : 
401 HGTIYASAELAVQALAPDFRLNPVRRLIPAARGGEILIPCQPRAAPKAWLWSKGTEILV 460 

487 AGNRYSIIQGSSLRVDDLQLSDSGTYTCTASGERGETSWAATLTV 531 

:| :l : := III III I h : hi 
,461 NSSRVTVTPDGTLIIRNISRSDEGKYTCFAENFMGKANSTGILSVRDATKHLAPSSADI 520 



Oy 532 ■ 



f 



Db 



-EKPG" 

:||| 



■ 535 



Db 521 NLGDNLTLQCHASHDPTMDLTFTWTLDDFPIDFDKPGGHYRRTNVKETIGDLTILNAQLR 580 
Oy 536 ■ 



536 STSLHRAADPSTY PAPPGTPKVLNVSRTSISLRWAKSQERPGAVGP 581 

I : h :| I III I hi II::: I 

581 HGGRYTCMAQTWDSASKEATVLVRGPPGPPGGVWRDIGDTTIQLSWSRGFDNH— SP 637 

582 IIGYTVEYFSP DLQTGWIVAAHRVGDTQVT - 1 SGLT PGT S YVFLVRAENTQG I S 634 

I II" :\ ::l h h : : III I I I I I I 

638 IAKYTLQARTPPAGKWKQVRTN- - ■ PANIEGNAETAQVLGLTPWMDYEFRVIASNILGTG 694 

635 VPSGLSNVIKTIEADFDAASANDLSAARTLLTGRSVELIDASAINASAYRLEWMLHVSAD 694 

III I: hi II : : : II I III :| : : h : I 
695 EPSGPSSKIRTREA- APSVAPSGLSGG- - - -GGAPGELI - - - -VNWTPMSREYQ- ■ -NGD 742 



Oy 695 E - KYVEGLR IHYKDASVPSAQYHS ITVMDASAESFWGN- ■ LKKYTKYEFFLTPFF 747 

h I h: I II I I: II 'I :/: II :| : : 

Db 743 GFGYLLSFRRQGSTHWQTARVPG ADAQYFVYSNESVRPYTPFEVKI RSY ■ 791 

Qy 748 ETIEGQPSNSKTALTY--EDVPSAPPDNIQIGMYNQTAGWVRWTPPPSQHHNGNLYGYKI 805 

I I Ml I h I I : I I I I II I Ihl 

Db 792 -NRRGDGPESLTALVYSAEEEPRVAPTKVWAKGVSSSEMNVTW-EPVQQDMNGILLGYEI 849 

Oy 806 EV-SAGNTMKVLANMTLNATTTSVLLNNLTTGAVYSVRLNSFTKAGDGPYSKPI -SLFMD 863 

lh : II I I I : :: :|| III : I 
Db 850 RYWKAGDKEAAADRVRTAGIiDTSARVSGLHPNTKYHVTVRAYNRAGTGPASPSANATTMK 909 

Oy 864 PTHHVHPPRAHPSG THDGRHEGQDLTYHNNGNI PPGDINP 903 

I III I | : ||: : : I 

Db 910 P PPRRPPGNISWTFSSSSLSIKWDPWPFRNESAVTGYKMLYQNDLH LTP 959 

Oy 904 TTHKKTTDYLSGP 916 

II ::: I 

960 TLHLTGKNWIEIP 972 



RESULT 8
CAML_MOUSE
ID CAML_MOUSE STANDARD; PRT; 1260 AA.
AC P11627;
DT 01-OCT-1989 (Rel. 12, Created)
DT 01-OCT-1989 (Rel. 12, Last sequence update)
DT 01-OCT-2000 (Rel. 40, Last annotation update)
DE NEURAL CELL ADHESION MOLECULE L1 PRECURSOR (N-CAM L1).
GN L1CAM OR CAML1.
OS Mus musculus (Mouse).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
RN [1]
RP SEQUENCE FROM N.A., AND PARTIAL SEQUENCE.
RC TISSUE=BRAIN;
RX MEDLINE=88318924; PubMed=3412448;
RA Moos M., Tacke R., Scherer H., Teplow D., Frueh K., Schachner M.;
RT "Neural adhesion molecule L1 as a member of the immunoglobulin
RT superfamily with binding domains similar to fibronectin.";
RL Nature 334:701-703(1988).

CC -!- FUNCTION: CELL ADHESION MOLECULE WITH AN IMPORTANT ROLE IN THE
CC DEVELOPMENT OF THE NERVOUS SYSTEM, INVOLVED IN NEURON-NEURON
CC ADHESION, NEURITE FASCICULATION, OUTGROWTH OF NEURITES, ETC. BINDS
CC TO AXONIN ON NEURONS.
CC -!- SUBCELLULAR LOCATION: TYPE I MEMBRANE PROTEIN.
CC -!- ALTERNATIVE PRODUCTS: TWO ISOFORMS IN THE CYTOPLASMIC REGION ARE
CC PRODUCED BY DIFFERENTIAL SPLICING (BY SIMILARITY).
CC -!- SIMILARITY: CONTAINS 6 IMMUNOGLOBULIN-LIKE C2-TYPE DOMAINS.
CC -!- SIMILARITY: CONTAINS 5 FIBRONECTIN TYPE III-LIKE DOMAINS.
CC
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC

DR EMBL; X12875; CAA31368.1; -.
DR PIR; S05479; S05479.
DR HSSP; P20241; 1CFB.
DR MGD; MGI:96721; L1CAM.
DR INTERPRO; IPR001777; -.
DR INTERPRO; IPR003006; -.
DR PFAM; PF00041; fn3; 4.
DR PFAM; PF00047; ig; 6.
DR PRINTS; PR00014; FNTYPEIII.
KW Cell adhesion; Glycoprotein; Transmembrane; Repeat; Brain;
KW Immunoglobulin domain; Signal; Alternative splicing.



FT   SIGNAL        1     19
FT   CHAIN        20   1260       NEURAL CELL ADHESION MOLECULE L1.
FT   DOMAIN       20   1123       EXTRACELLULAR (POTENTIAL).
FT   TRANSMEM   1124   1146       POTENTIAL.
FT   DOMAIN     1147   1260       CYTOPLASMIC (POTENTIAL).
FT   DOMAIN       50    120       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      150    215       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      256    318       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      346    410       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      440    503       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      531    599       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      827    896       FIBRONECTIN TYPE-III.
FT   DOMAIN      932    994       FIBRONECTIN TYPE-III.
FT   DOMAIN     1032   1094       FIBRONECTIN TYPE-III.
FT   SITE        553    555       CELL ATTACHMENT SITE (POTENTIAL).
FT   SITE        562    564       CELL ATTACHMENT SITE (POTENTIAL).
FT   CARBOHYD    100    100       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    202    202       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    246    246       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    293    293       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    432    432       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    478    478       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    489    489       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    504    504       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    587    587       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    670    670       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    725    725       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    776    776       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    824    824       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    848    848       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    875    875       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    968    968       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    978    978       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD   1022   1022       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD   1030   1030       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD   1073   1073       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD   1107   1107       N-LINKED (GLCNAC...) (POTENTIAL).
FT   VARSPLIC   1180   1183       MISSING (IN SHORT ISOFORM)
FT                                (BY SIMILARITY).
SQ   SEQUENCE   1260 AA;  140968 MW;  22BE57001CB2A538 CRC64;



Query Match 8.6%; Score 641.5; DB 1; Length 1260;

Best Local Similarity 23.0%; Pred. No. 7.4e-25; 

Matches 275; Conservative 167; Mismatches 458; Indels 295; Gaps 48; 



Qy 35 WLLLVLVASNGLPAVRGQYOSPRI IE HPTDLWRKNEPATINCRVEGRPEPTI 87

Ml I : I : :|: ::| I III : :| |: |:|: 

Db 9 KPLL-LCSPCLLIQIPDEYKGHHVLEPPVITEQSPRRLWFPTDDISLKCEARGRPQVEF 67 

Qy 88 EWFKDGEPVSTNEKKS — HRVQFKDGALFFYRTMQGKK — EQDGGEYWCVAKNRVGQA 141 

I III I: 1:1: |::l :: I I I I |::| I 

Db 68 RWTKDGIHFRPREELGVWHEAPY-SGSF — TIEGNNSFAQRFQGIYRCYASNRLGTA 122 

Qy 142 VSRHASLQIAVLRDDFRVEPKDT— -RVARGETALLECGPPRGIPEPTLIWIKDGV— 194 

:| M ■'. : ll:| I :||: :| I II I : |: : 
Db 123 MSH — EIQLVAEGAPKWPKET VKPVEVEEGESWLPCNPPPSAAPPRI YWMNSRIFDI 178 

Qy 195 PLD DLK 200 

1:1 I I 

Db 179 KQDERVSMGQNGDLYFANVLTSDNHSDYICNAHFPGTRTIIQREPIDLRVRPTNSMIDRK 238 

Qy 201 - -AMSFGASSRVRIVDGGNLLIS 221

MM: : I :|::

Db 239 PRLLFPTNSSSRLVALQGQSLILECIAEGFPTPTIRWLHPSDPMPTDRVIYQNHNRTLQL 298

Qy 222 -NVEPIDEGNYRCIAQNLVGIRESSYAKLIVQVKPYFMKEPRDQVMLYGQTAIFHCSVGG 280 

II hi I |:|:| :|: :| : h ||::::|: : |:|| I I I 

Db 299 LNVGEEDDGEYTCLAENSLGSARHAY-YVTVEAAPYWLQRPQSHLYGPGETARLDCQVQG 357 

Qy 281 DPPPRVLWRREEGNIPVSRARILHDER SLE I SNIT PTDEGT YVC EAHKNVGQI S 334 

I I- h :l : hi II :lh III III I I : 
Db 358 RPQPEITWRIN — GMSMETVNRDQRYRIEQGSLILSNVQPTDTMVTQCEARNQHGLLL 413 

Qy 335 ARASL- IVHAPPNFTKRPSNRKVGLNG - WQLPCMASGNPPPSVFWTREGVSTLMFPNSS 392 

I I : :| I : : : : I I I II I III I I :|:: 

Db 414 ANAYIYWQLPARILTRDNQTYMAVEGSTAYLLCRAFGAPVPSVQWLDEEGTTVL — Q 469 

Qy 393 HGRQYVAADGTLQITDVRQEDEGYYVCSAFSWDSSTVRVFLQVSSVDERPPPIIQIGPA 452 

I : hill I I:: I I I I I ::: I: III : III 
Db 470 DERFFPYANGTLSIRDLQANDTGRYFCQAANDQNNVTILANLQVREATQ ITQGPR 524 

Qy 453 NQTLPKGSVATLPCRATGNPS-PRIRWFHDGHAVQA— GNRYSIIQGSSLRVDDLQLS 507 

: lh I hi: :|| I I II :| ::| h \ I : I I 
Db 525 SAIEKKGARVTFTCQASFDPSLQASITWRGDGRDLQERGDSDRY-FIEDGRLVIQSLDYS 583 

Qy 508 DSGTYTCTASGERGET - SWAATLTVERPGSTSLHRAADPSTYPAPPGTPKVLNVSRTSIS 566 

II hi II I I I I I I II | :| ■ : :: : 
Db 584 DQGNYSCVASTELDEVESRAQLLVVGSPGPyPHLELSDRHL LRQSQVH 631 

Qy 567 LRWARSQERPGAVGPIIGYTVEYFSPDL-QTGWIVAAHRVGDTQVTISGLTPGTSYVFLV 625

I I: II I :h I: I hi I I I

Db 632 LSWSPAEDHN---SPIERYDIEFEDKEMAPEKWFSLGKVPGNQTSTTLKLSPYVHYTFRV 688

Qy 626 RAENTQGISVPSGLSNVIKTIEADFDAASANDLSAA RTLLTGRSVELIDASAI 678 

I I I II :| : I II I I : ::| I : :| :| 

Db 689 TAINRYGPGEPSPVSESWTPEA— APERNPVDVRGEGNETNNMVITWKPLRWMDWNAP 745 

Qy 679 NASAVRLEWMLHVSADERYVEGLRIHYRDASVPSAQYHSITVMDASAESFWGNLRKYTR 738 

h:| I : : I II I : 

Db 746 QIQ-YRVQWR PQGRQETWRKQTVSDPFLWSNTSTFVP 782 

Qy 739 YEFFLTPFFETIEGQPSNSRTALTY EDVPSAPPDNIQIGMYNQTAGWVRWTPPPSQ 794 

II : : : I : :| II I h I ::| : III I 
Db 783 YEIKV QAVNNQGRGPEPQVTIGYSGEDYPQVSPELEDITIFNSSTVLVRWRPVDLA 838 

Qy 795 HHNGNLYGYKI EVSAGNTMKVLANMTLNATTTSVLLNNLTTGAVYSVRLNS 845 

hi II : : | : | ::: : | HI :|: | : | | : ; 

Db 839 QVKGHLRGYNVTYWWRGSQRRHSKRHIHR--SHIWPANTTSAILSGLRPYSSYHVEVQA 896 



Qy 846 FTRAGDGP YSRPISLFMDPTH-- HVHPPRAHPSGTHDG-- 881

I I II :| I : I I I II :| :| I

Db 897 FNGRGLGPASEWTFSTPEGV — PGHPEALHLECQSDTSLLLHWQPPLSH-NGVLTGYLL 952

Qy 882 -RH--EGQ DLTYHNNGNIPPGDINPTTHKKTTDYLSGPWLMVLVCIVL 926

1 lh :l II I: 1 h : 1 II ::

Db 953 SYHPVEGESKEQLFFNLSDPELRTHNLTNLNP-DLQYRFQLQATTQQGGPGEAIVREGGT 1011

Qy 927 LVLV .-ISA AISMVYFR RKHQMTKELGHLSWSDNEITALNIN 967

I III :l I I Ihlllh: ::

Db 1012 MALFGKPDFGNISATAGENYSWSWVPRRGQCNFRFHILFRALPEGKVSPDHQPQPQYVS 1071

Qy 968 SRESLWIDHHRGWR-TADTDKDSGLSESKLLSHVNSSQSNYNNSDGGTDYAEVDT 1021

:| : 'I II : I : hi I ::| II II
Db 1072 YNQSSYTQ — WNLQPDTKYEIHLIKERVLLHHLDVRTN GTGPVRVST 1116



RESULT 9
DCC_MOUSE
ID DCC_MOUSE STANDARD; PRT; 1447 AA.
AC P70211;
DT 01-NOV-1997 (Rel. 35, Created)
DT 01-NOV-1997 (Rel. 35, Last sequence update)
DT 30-MAY-2000 (Rel. 39, Last annotation update)
DE TUMOR SUPPRESSOR PROTEIN DCC PRECURSOR.
GN DCC.
OS Mus musculus (Mouse).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=BALB/C; TISSUE=BRAIN;
RX MEDLINE=96112625; PubMed=8570174;
RA Cooper H.M., Ames P., Britto J., Gad J., Wilks A.F.;
RT "Cloning of the mouse homologue of the deleted in colorectal cancer
RT gene (mDCC) and its expression in the developing mouse embryo.";
RL Oncogene 11:2243-2254(1995).
RN [2]
RP REVISIONS.
RC STRAIN=BALB/C; TISSUE=BRAIN;
RA Cooper H.M.;
RL Submitted (JUN-1996) to the EMBL/GenBank/DDBJ databases.
CC -!- FUNCTION: IMPLICATED AS A TUMOR SUPPRESSOR GENE.
CC -!- SUBCELLULAR LOCATION: TYPE I MEMBRANE PROTEIN.
CC -!- ALTERNATIVE PRODUCTS: TWO FORMS OF THE PROTEIN ARE PRODUCED FROM
CC THE SAME GENE BY THE USE OF ALTERNATIVE INITIATION SITES. A THIRD
CC FORM WHICH IS EXPRESSED ONLY IN THE EMBRYO IS PRODUCED BY
CC ALTERNATIVE SPLICING.
CC -!- TISSUE SPECIFICITY: IN THE EMBRYO, EXPRESSED AT HIGH LEVELS IN THE
CC DEVELOPING BRAIN AND NEURAL TUBE. IN ADULT, HIGHLY EXPRESSED IN
CC BRAIN WITH VERY LOW LEVELS FOUND IN TESTIS, HEART AND THYMUS.
CC -!- DEVELOPMENTAL STAGE: LOW LEVELS IN EARLY GESTATION. HIGHEST LEVELS
CC EXPRESSED DURING MID GESTATION. LEVELS DECREASE IN LATE GESTATION
CC AND REMAIN AT THIS LEVEL IN THE ADULT.
CC -!- SIMILARITY: CONTAINS 4 IMMUNOGLOBULIN-LIKE C2-TYPE DOMAINS.
CC -!- SIMILARITY: CONTAINS 6 FIBRONECTIN TYPE III-LIKE DOMAINS.
CC
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC
DR EMBL; X85788; CAA59786.1; -.
DR HSSP; P56276; 1TLK.
DR MGD; MGI:94869; DCC.
DR INTERPRO; IPR001777; -.
DR INTERPRO; IPR003006; -.
DR PFAM; PF00041; fn3; 6.
DR PFAM; PF00047; ig; 4.
DR PRINTS; PR00014; FNTYPEIII.
KW Glycoprotein; Immunoglobulin domain; Transmembrane; Signal;
KW Anti-oncogene; Alternative initiation; Alternative splicing.
FT   SIGNAL        1     25       POTENTIAL.
FT   CHAIN        26   1447       TUMOR SUPPRESSOR PROTEIN DCC, LONG
FT                                ISOFORM.
FT   CHAIN        85   1447       TUMOR SUPPRESSOR PROTEIN DCC, SHORT
FT                                ISOFORM.
FT   INIT_MET     85     85       FOR SHORT ISOFORM.
FT   DOMAIN       26   1097       EXTRACELLULAR (POTENTIAL).
FT   TRANSMEM   1098   1122       POTENTIAL.
FT   DOMAIN     1123   1447       CYTOPLASMIC (POTENTIAL).
FT   DOMAIN       54    124       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      154    219       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      254    317       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      345    407       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      426    522       FIBRONECTIN TYPE-III.
FT   DOMAIN      525    618       FIBRONECTIN TYPE-III.
FT   DOMAIN      619    716       FIBRONECTIN TYPE-III.
FT   DOMAIN      722    816       FIBRONECTIN TYPE-III.
FT   DOMAIN      840    940       FIBRONECTIN TYPE-III.
FT   DOMAIN      941   1042       FIBRONECTIN TYPE-III.
FT   DISULFID     61    117       BY SIMILARITY.
FT   DISULFID    161    212       BY SIMILARITY.
FT   DISULFID    261    310       BY SIMILARITY.
FT   DISULFID    352    400       BY SIMILARITY.
FT   CARBOHYD     60     60       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD     94     94       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    299    299       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    318    318       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    478    478       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    628    628       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    702    702       N-LINKED (GLCNAC...) (POTENTIAL).
FT   VARSPLIC    819    838       MISSING (IN EMBRYONIC ISOFORM).
SQ   SEQUENCE   1447 AA;  158298 MW;  0D1F1097C22D5B9F CRC64;



Query Match 8.6%; Score 639.5; DB 1; Length 1447; 

Best Local Similarity 23.1%; Pred. No. 1.1e-24;

Matches 272; Conservative 151; Mismatches 453; Indels 303; Gaps 43; 

Qy 53 YQSPRIIEHPTDLWKKNEPATLNCKVEG-KPEPTIEWFKDG--EPVSTNEKKSHRVQFK 109 

: I : hi I : III I : I hi III : :::l I 
Db 37 FTSLHFVSEPSDAVTMRGGNVLLNCSAESDRGVPVIKWKKDGLILALGMDDRKQ--QLP 93 

Qy 110 DGALFFYRTMQGKKEQ - DGGEYWCVAK - NRVGQAVSRHASLQI A- VLRDDFRVEPKDTRV 166 

:hl : : : M I I ! : 1 1 I : : I 1 1 I : : 
Db 94 NGSLLIQNILHSRHHKPDEGLYQCEASLADSGSIISRTAKVTVAGPLR-FLSQTESITA 151 

Qy 167 AKGETALLECGPPKGIPEPTLIWIKDGVPLDDLKAMSFGASSRVRIVDG6NLLISNVEPI 226 

hi Ihl I I lh I I: h I III :: I I II ::l 

Db 152 FMGDTVLLKC - EVIGEPMPT IHWQKNQQDLNPLP GDSRWVLPSGALQISRLQPG 205 

Qy 227 DEGNYKCIAQNLVGTRESSYAKLIVQVKP YFMKEPKDQVMLYGQTATFHCSVGG 280 

I I hi hi I : h: : I lh: I : : : h I III 
Db 206 DSGVYRCSARNPASIRTGNEAEVRILSDPGLHRQLYFLQRPSNVIAIEGKDAVLECCVSG 265 

Qy 281 DPPPKVLWKKEEGNIPV-SRARILHDEKSLEISNITPTDEGTYVCEAHNNVGQISARASL 339 
* III I : I I : h I :| llhl I III I III I I 

Db 266 YPPPSFTWLRGEEVIQLRSKKYSLLGGSELISNVTDDDSGTYTCWTYKNENISASAEL 325

Qy 340 IVHAPPNFTKRPSNKKVGLNGWQLPCMASGNPPPSVFWTKEGVSTLMFPNSSHGRQYVA 399 

I II I III : I : 
Db 326 TVLVPPWFLNHPSN LYAYESM 346 

Qy 400 ADGTLQITDVRQEDEGYYVCSAFSWDSSTVRVFLQVSSVDERPPPIIQIGPANQTLPKG 459 
• h I 

Db 347 DIEFEj- 351 

Qy 460 SVATLPCRAT6NPSPRIKWFHD6HAVQAGNRYSIIQGSSLRVDDLQLSDSGTYTCTASGE 519 

I :l I I : I :| I : : h Ihlh : II I I I I I 
Db 352 CAVSGKPVPTVNWMKNGDWIPSDYFQIVGGSNLRILGWKSDEGFYQCVAENE 405 

Qy 520 RGETSWAATLTVEKPGSTSLHRAADPSTYPAPPGTPKVLNVSRTSISLRWAKSQEKPGAV 579 

I :l I I II I Ihl : II : I I II 
Db 406 AGNAQSSAQLIVPKPAIPS SSILPSAPRDVLPVLVSSRFVRLSWRPPAE— AK 456 

Qy 580 GPIIGYTVEYFSPD LQTGWIVAAHRVGDTQVTISGLTPGTSYVFLVRAENTQG 632 

I I :H :H : II : I hh II I I I I I I 
Db 457 GNIQTFTV-FFSREGDNRERALNT TQPGSLQLTVGNLKPEAMYTFRWAYNEWG 509 

Qy 633 ISVPSGLSNVIKTIEADFDAASANDLSAARTLLTGKSVELIDASAINASAVRLEWMLHVS 692 



I I !l :: I || : I :::::: I

Db 510 -PGESSQPIK-- -VATQPELQVPGPVENLHAVSTSPTSILITWEPPAY 553

Qy 693 ADEKYVEGLRIHYKDASVPSAQYHSITVMDASAESFWGNLKKYTKYEFFLTPFFETIEG 752

I: hi h : I I :: h : llhhl : I

Db 554 ANGP-VQGYRLFCTEVSTGKEQN IEVDGLSYKLEGLKKFTEYTLRFLAYNRYGPG 607

Qy 753 QPSNSKTALTYEDVPSAPPDNIQIGMYNQTAGWVRWTPPPSQHHNGNLYGYRI---EVSA 809

:: I : 1 1 || : : I : I I Mil I : llll : :

Db 608 VSTDDITWTLSDVPSAPPQNISLEWNSRSIKVSWLPPPSGTQNGFITGYKIRHRKTTR 667

Qy 810 GNTMKVLANMTLNATTTSVLLNNLTTGAVYSVRLNSFTKAGDGPYSKPISLFMDPTHHVH 869

I: II II h II :::: I I II

Db 668 RGEME TLEPNNLWYLFTGLEKGSQYSFQVSAMTVNGTGP 706

Qy 870 PPRAHPSGTHDGRHEGQDLTYHNNGNIP--PGD--INPTTHKKTTDYLSGPWLMVLVCIV 925

II :. II : :| I : I h lh

Db 707 PSNWYTAETPENDL- - -DESQVPDQPSSLHVRPQTN CII 742

Qy 926 LLVLVISAAISMVYFKRKHQMTKELGHLSWSDNEITALNINS-KESLWIDHHRGWRTA 983

: : I I : ::l I : I |:: :| : : :

Db 743 M t SWTPPL-NPNIWRGYIIGYGVGSPYAETVRVDSKQRYYSI 783

Qy 984 DTDKDSGLSESKLLSHVNSSQSNYNNSDGGTD-YAEVDTRNLTTFYNCRKSPDNPTPYAT 1042

: : I II I :lh I I lh:| I :l I

Db 784 ERLESS -- - - - -SHYVISLKAFNNAGEGVPLYESATTRSIT DPTDPVDYYP 828



Qy 1043 TMIIGTSSSETCTKTTSISADKDSGTHSPYSDAFAGQVPAVPWKSNYLQYPVEPI1WSE 1102 

: llll :| I h: I : ::|:: 

Db 829 LL DDFPTSGP- -DVSTPMLPPVG -VQAVALTHEAVRVSWAD 866 

Qy 1103 FLPPPPEHPPPSSTYGYAQGSPESSRKSSKSAGSGISTNQSILNASIHSSSSGGFSAWGV 1162 

I : " I : I: II ::| ::| h 

Db 867 NSVPKNQKTSDVRLYTVRWRTSFSASAKYKS EDTTSLSYTATGL 910 

Qy 1163 SP QYAVACPPENVYSNPLSAVAGGTQNRYQITPTN 1197 

I :::! :| h II I h lh 
Db 911 KPNTMYEFSVMV-TKNRRSSTWSMTAHAT--TYEAAPTS 946 



RESULT 10
DCC_HUMAN
ID DCC_HUMAN STANDARD; PRT; 1447 AA.
AC P43146;
DT 01-NOV-1995 (Rel. 32, Created)
DT 01-NOV-1995 (Rel. 32, Last sequence update)
DT 15-JUL-1999 (Rel. 38, Last annotation update)
DE TUMOR SUPPRESSOR PROTEIN DCC PRECURSOR (COLORECTAL CANCER SUPPRESSOR).
GN DCC.
OS Homo sapiens (Human).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
RN [1]
RP SEQUENCE FROM N.A.
RX MEDLINE=9501153;!; PubMed=7926722;
RA Hedrick L., Cho K.R., Fearon E.R., Wu T.-C., Kinzler K.W.,
RA Vogelstein B.;
RT "The DCC gene product in cellular differentiation and colorectal
RT tumorigenesis.";
RL Genes Dev. 8:1174-1183(1994).
RN [2]
RP SEQUENCE OF 1-750 FROM N.A.
RX MEDLINE=90100559; PubMed=2294591;
RA Fearon E.R., Cho K.R., Nigro J.M., Kern S.E., Simons J.W.,
RA Ruppert J.M., Hamilton S.R., Preisinger A.C., Thomas G., Kinzler K.W.,
RA Vogelstein B.;
RT "Identification of a chromosome 18q gene that is altered in
RT colorectal cancers.";
RL Science 247:49-56(1990).
RN [3]
RP SEQUENCE OF 107-472 FROM N.A. (SCRAMBLED EXONS).
RX MEDLINE=91121517; PubMed=1991322;
RA Nigro J.M., Cho K.R., Fearon E.R., Kern S.E., Ruppert J.M.,
RA Oliner J.D., Kinzler K.W., Vogelstein B.;
RT "Scrambled exons.";
RL Cell 64:607-613(1991).
RN [4]
RP GENE STRUCTURE, AND VARIANTS CARCINOMA HIS-1375.
RX MEDLINE=9424S241; PubMed=8188295;
RA Cho K.R., Oliner J.D., Simons J.W., Hedrick L., Fearon E.R.,
RA Preisinger A.C., Hedge P., Silverman G.A., Vogelstein B.;
RT "The DCC gene: structural analysis and mutations in colorectal
RT carcinomas.";
RL Genomics 19:525-531(1994).
RN [5]
RP VARIANT CARCINOMA THR-168, AND VARIANT GLY-201.
RX MEDLINE=94243823; PubMed=8187090;
RA Miyake S., Nagai K., Yoshino K., Oto M., Endo M., Yuasa Y.;
RT "Point mutations and allelic deletion of tumor suppressor gene DCC in
RT human esophageal squamous cell carcinomas and their relation to
RT metastasis.";
RL Cancer Res. 54:3007-3010(1994).
CC -!- FUNCTION: IMPLICATED AS A TUMOR SUPPRESSOR GENE.
CC -!- SUBCELLULAR LOCATION: TYPE I MEMBRANE PROTEIN.
CC -!- TISSUE SPECIFICITY: FOUND IN AXONS OF THE CENTRAL AND PERIPHERAL
CC NERVOUS SYSTEM AND IN DIFFERENTIATED CELL TYPES OF THE INTESTINE.
CC -!- DISEASE: COLORECTAL TUMORS THAT LOST THEIR CAPACITY TO
CC DIFFERENTIATE INTO MUCUS PRODUCING CELLS UNIFORMLY LACK DCC
CC EXPRESSION. INACTIVATION OF DCC DUE TO ALLELIC DELETION AND/OR
CC POINT MUTATIONS MAY CAUSE BOTH LYMPHATIC AND HEMATOGENOUS
CC METASTASIS OF OESOPHAGEAL SQUAMOUS CELL CARCINOMAS.
CC -!- SIMILARITY: CONTAINS 4 IMMUNOGLOBULIN-LIKE C2-TYPE DOMAINS.
CC -!- SIMILARITY: CONTAINS 6 FIBRONECTIN TYPE III-LIKE DOMAINS.
CC
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC
DR EMBL; X76132; CAA53735.1; -.
DR EMBL; M32292; AAA35751.1; -.
DR EMBL; M32286; AAA52174.1; -.
DR EMBL; M32288; AAA52175.1; ALT_SEQ.
DR EMBL; M32290; AAA52176.1; -.
DR EMBL; M63696; AAA52177.1; -.
DR EMBL; M63700; AAA52178.1; -.
DR EMBL; M63702; AAA52179.1; -.
DR EMBL; M63718; AAA52180.1; -.
DR EMBL; M63698; AAA52181.1; -.
DR PIR; A54100; A54100.
DR PIR; A40098; A40098.
DR PIR; A38442; A38442.
DR HSSP; P56276; 1TLK.
DR MIM; 120470; -.
DR INTERPRO; IPR001777; -.
DR INTERPRO; IPR003006; -.
DR PFAM; PF00041; fn3; 6.
DR PFAM; PF00047; ig; 4.
DR PRINTS; PR00014; FNTYPEIII.
KW Glycoprotein; Immunoglobulin domain; Transmembrane; Signal;
KW Anti-oncogene; Disease mutation; Polymorphism.



FT   SIGNAL        1     25       POTENTIAL.
FT   CHAIN        26   1447       TUMOR SUPPRESSOR PROTEIN DCC.
FT   DOMAIN       26   1097       EXTRACELLULAR (POTENTIAL).
FT   TRANSMEM   1098   1122       POTENTIAL.
FT   DOMAIN     1123   1447       CYTOPLASMIC (POTENTIAL).
FT   DOMAIN       54    124       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      154    219       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      254    317       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      345    407       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      426    522       FIBRONECTIN TYPE-III.
FT   DOMAIN      525    618       FIBRONECTIN TYPE-III.
FT   DOMAIN      619    716       FIBRONECTIN TYPE-III.
FT   DOMAIN      722    816       FIBRONECTIN TYPE-III.
FT   DOMAIN      840    940       FIBRONECTIN TYPE-III.
FT   DOMAIN      941   1042       FIBRONECTIN TYPE-III.
FT   DISULFID     61    117       BY SIMILARITY.
FT   DISULFID    161    212       BY SIMILARITY.
FT   DISULFID    261    310       BY SIMILARITY.
FT   DISULFID    352    400       BY SIMILARITY.
FT   CARBOHYD     94     94       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    299    299       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    318    318       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    478    478       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    628    628       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    702    702       N-LINKED (GLCNAC...) (POTENTIAL).
FT   VARIANT     168    168       M -> T (IN OESOPHAGEAL CARCINOMA).
FT                                /FTId=VAR_003909.
FT   VARIANT     201    201       R -> G.
FT                                /FTId=VAR_003910.
FT   VARIANT    1375   1375       P -> H (IN A COLORECTAL CARCINOMA).
FT                                /FTId=VAR_003911.
FT   CONFLICT    138    138       MISSING (IN REF. 3).
FT   CONFLICT    233    329       MISSING (IN REF. 3).
FT   CONFLICT    421    421       MISSING (IN REF. 3).
SQ   SEQUENCE   1447 AA;  158456 MW;  4A8612766ED0471F CRC64;



Query Match 8.6%; Score 637.5; DB 1; Length 1447;

Best Local Similarity 22.9%; Pred. No. 1.4e-24;

Matches 271; Conservative 150; Mismatches 444; Indels 317; Gaps 43; 

Qy 57 RI IEHPTDLVVKKNEPATLNCKVEG - KPEPT IEWFKDG - - EPVSTNEKKSHRVQFKDGAL 113 

I : hi I' : hi I : I hi- III : :|:| I :|:| 
Db 41 RFLSEPSDAVTMRGGNVLLDCSAESDRGVPVIKWKKDGIHLALGMDERKQ- - -QLSNGSL 97 

Qy 114 FFYRTMQGKKEQ-DGGEYWCVAK-NRVGQAVSRHASLQIA-VLRDDFRVEPKDTRVAKGE 170 

: : I I I I I I :IM : :| II I : : |: 
Db 98 LIQNILHSRHHKPDEGLYQCEASLGDSGSIISRTAKVAVAGPLR--FLSQTESVTAFMGD 155 

Qy 171 TALLECGPPK3IPEPTLIWIKDGVPLDDLKAMSFGASSRVRIVDGGNLLISNVEPIDEGN 230 

I IN 'I I II: I I: I : III Mil ::| I I 

Db 156 TVLLKC-EVIGEPMPTIHWQKNQQDLTPIP GDSRVWLPSGALQISRLQPGDIGI 209 

Qy 231 YKCIAQNLVGTRESSYAKLIVQVKP YFMKEPKDQVMLYGQTATFHCSVGGDPPP 284 

M |:| :| : |:: : I lh: I : I : |: I I I I III 
Db 210 YRCSARNPAS3RTGNEAEVRILSDPGLHRQLYFLQRPSNWAIEGKDAVLECCVSGYPPP 269 

Qy 285 KVLWKKEEGNIPV-SRARILHDEKSLEISNITPTDEGTYVCEAHNNVGQISARASLIVHA 343 

I : I I : I: I :| IN I I I I III II I 

Db 270 SFTWLRGEEVIQLRSKKYSLLGGSNLLISNVTDDDSGMYTCWTYKNENISASAELTVLV 329 

Qy 344 PPNFTKRPSNKKVGLNGWQLPCMASGNPPPSVFWTKEGVSTLMFPNSSHGRQYVAADGT 403 

II I III : I : 

Db 330 PPWFLNHPSN LYAYESM 346 

Qy 404 LQITDVRQEDEGYYVCSAFSWDSSTVRVFLQVSSVDERPPPIIQIGPANQTLPKGSVAT 463 
I: I 

Db 347 -— DIEFE-j- 351 

Qy 464 LPCRATGNPSPRIKWFHDGHAVQAGNRYSIIQGSSLRVDDLQLSDSGTYTCTASGERGET 523 

I :l I -I : I :| I : : |: Ihll: : II I I I I II 
Db 352 --CTVSGKPVPTVNWMKNGDWIPSDYFQIVGGSNLRILGWKSDEGFYQCVAENEAGNA 409 

Qy 524 SWAATLTVEKPGSTSLHRAADPSTYPAPPGTPKVLNVSRTSISLRWAKSQEKPGAVGPII 583 

:| I I II I I I: I : II : I I I I II 
Db 410 QTSAQLIVPKPAIPS SSVLPSAPRDWPVLVSSRFYRLSWRPPAE— AKGNIQ 460 

Qy 584 GYTVEYFSPD- LQTGWIVAAHRVGDTQVTISGLTPGTSYVFLVRAENTQGISVP 636

' :|| :|| v II : I |:|: II I I I I I I I 
Db 461 TFTV - FFSREGDNRERALNT TQPGSLQLTVGNLKPEAMYTFRWAYNEWG--P 510 

Qy 637 SGLSNVIKTIEADFDAASANDLSAARTLLTGKSVELIDASAINASAVRLEWMLHVSADEK 696 

■III' :: I || : I :::::: I |: 

Db 511 GESSQPIK-- VATQPELQVPGPVENLQAVSTSPTSILITWEPPAYANGP 557 
Qy 697 YVEGLRIHYKDASVPSAQYHSITVMDASAESFWGNLKKYTKYEFFLTPFFETIEGQPSN 756

1:1 I: : I I |: : : I :: 

Db 558 -VQGYRLFCTEVSTGKEQN IEVDGLSYKLEGLKKFTEYSLRFLAYNEYGPGVSTD 611 

Qy 757 SKTALTYEDVPSAPPDNIQIGMYNQTAGWVRWTPPPSQHHNGNLYGYKI---EVSAGNTM 813 

I :l MUM! I: : : I : II Mil II : MM : : I 
Db 612 DITWTLSDVPSAPPQNVSLEWNSRSIKVS,WLPPPSGTQNGFITGYRIRHRKTTRRGEM 671 

Qy 814 KVLANMTLNATTTSVLLNNLTTGAVYSVRLNSFTKAGDGPYSKPISLFMDPTHHVHPPRA 873 

: II I I I: II :::: I I II 
Db 672 E TLEPNNLWYLFTGLEKGSQYSFQVSAMTVNGTGP 706 

Qy 874 HPSGTHDGRHEGQDLTYHNNGNIP- -PGD- - INPTT HKKTTDYLSGPWLMVLVC IVLLVL 929 

II : || : :| I : |: ||:: 
Db 707 - PS NWYT AETPENDL — DESQVPDQPSSLHVRPQT N CUM— 743 

Qy 930 VI S AAI SMVYFKRKHQMT KELGHLSWSDNEIT ALNINS - - KESLWIDHHRGWRTADTDK 987 

I I : ::| I : I |:: :| : : : : : 
Db 744 SWTPPL-NPNIWRGYIIGYGVGSPYAETVRVDSKQRYYSIERLE 787 

Qy 988 DSGLSESKLLSHVNSSQSNYNNSDGGTD-YAEVDTRNLTTFYNCRKSPDNPTPYATTHII 1046

• I II* I :M: I I H::l I :l I
Db 788 SS SHYVISLKAFNNAGEGVPLYESATTRSIT DPTDPVDY 826

Qy 1047 GTSSSETCTKTTSISADKDSGTHSPYSDAFAGQVPAV PV-VKSNYLQYPVEPIN 1099 

Mill: II I:: I : :: 
Db 827 YPLLDDFPTSVPDLSTPMLPPVGVQAVALTHDAVRVS 863 

Qy 1100 WSEFLPPPPEHPPPSSTYGYAQGSPESSRRSSKSAGSGISTNQSILNASIHSSSSGGFSA 1159 

I:: I : I : |: II ::| ::| 

Db 864 WADNSVPKNQKTSEVRLYTVRWRTSFSASAKYKS EDTTSLSYTA 907 

Qy 1160 WGVSP — QYAVACPPENVYSNPLSAVAGGTQNRYQIT PTN 1197 

I: I :::| :| |: I I I |: ||: 
Db 908 TGLKPNTMYEFSVMV-TKNRRSSTWSMTAHAT-TYEAAPTS 946 



RESULT 11
CAML_HUMAN
ID CAML_HUMAN STANDARD; PRT; 1257 AA.
AC P32004;
DT 01-JUL-1993 (Rel. 26, Created)
DT 01-OCT-1996 (Rel. 34, Last sequence update)
DT 01-OCT-2000 (Rel. 40, Last annotation update)
DE NEURAL CELL ADHESION MOLECULE L1 PRECURSOR (N-CAM L1).
GN L1CAM OR CAML1 OR MIC5.
OS Homo sapiens (Human).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
RN [1]
RP SEQUENCE FROM N.A.
RX MEDLIM031698; PubMed=1932117;
RA Kobayashi M., Miura M., Asou H., Uyemura K.;
RT "Molecular cloning of cell adhesion molecule L1 from human nervous
RT tissue: a comparison of the primary sequences of L1 molecules of
RT different origin.";
RL Biochim. Biophys. Acta 1090:238-240(1991).
RN [2]
RP SEQUENCE FROM N.A.
RA Rosenthal A., Coutelle O., Drescher B.;
RL Submitted (APR-1994) to the EMBL/GenBank/DDBJ databases.
RN [3]
RP SEQUENCE FROM N.A.
RX MEDLINE=92329299; PubMed=1627459;
RA Reid R.A., Hemperly J.J.;
RT "Variants of human L1 cell adhesion molecule arise through alternate
RT splicing of RNA.";
RL J. Mol. Neurosci. 3:127-135(1992).
RN [4]
RP SEQUENCE OF 353-1176 FROM N.A.
RX MEDLINE=92020233; PubMed=1923824;
RA Rosenthal A., Mackinnon R.N., Jones D.S.C.;
RT "PCR walking from microdissection clone M54 identifies three exons



RT from the human gene for the neural cell adhesion molecule Ll 

RT (CAM-Ll)."; >' 

RL Nucleic Acids Res. 19:5395-5401(1991). 

RN [5] 

RP SEQUENCE OF 332-371 FROM N.A. 

RX MEDLINE-90353957; PubMed-2387585; 

RA Djabali M., Mattel M.-G., Nguyen C, Roux D., Demengeot J., 

RA Denizot F., Moos M., Schachner M., Goridis C, Jordan B.R.; 

rt "The .gene encoding Ll, a neural adhesion molecule of the 

RT immunoglobulin family, is located on the X chromosome in mouse and 

RT man."; 

RL Genomics 7:587-593(1990). 

RN [6] ; . 

RP SEQUENCE OF 1030-1257 FROM N.A. 

RX MEDLINE-91132183; PubMed-1993895; 

RA i Harper J.R., Prince J.T., Healy P. A., Stuart J.K., Nauman S.J., 

RA Stallcup W.B.;. ; 

RT . "Isolation and sequence of partial cDNA clones of human Ll: homology 

RT of human and rodent Ll in the cytoplasmic region."; 

RL J. Neurochem. 56:797-804(1991), 

RN [7] : : 

RP SEQUENCE OF 20-36. 

RX MEDLINE-88298876; PubMed-3136168; 

RA ■ Wolff J.M., Frank R., Mujoo K., spiro R.C, Reisfeld R.A., 

RA ! Rathjen F.G.; 

RT "A human brain glycoprotein related to the mouse cell adhesion 

RT molecule LL"; '■ 

RL J. Biol. Chem, 263:11943-11947(1988). 

RN [8] 

RP VARIANT HSAS TYR-264. 

RX MEDLINE=94004956; PubMed-8401576; 

RA Jouet M., Rosenthal A., Macfarlane J., Kenwrick S., Donnai D.; 

RT "A missense mutation confirms the Ll defect in X-linked hydrocephalus 

RT (HSAS)."; 

RL Nat. Genet. 4:331-331(1993). 

rn [9] • *: 

rp variant hsas/masa leu- 1194. 

RX MEDLINE=95187172; PubMed-7881431;' 

RA Fransen E., Schrander-Stumpel C, Vits L,, Coucke P., van Camp G., 

RA Willems P.J.; ; ■ 

RT "x-linked hydrocephalus and masa syndrome present in one family are 

RT due to a single missense mutation in exon 28 of the L1CAM gene."; 

RL Hum. Mol. Genet, 3:2255-2256(1994). 

RN [10] • 

RP VARIANTS HSAS GLN'184 AND ARG-452, AND VARIANT MASA GLN-210. 

RX MEDLINE-95004608; PubMed-7920659; 

RA Jouet M., Rosenthal A., Armstrong G. ( Macfarlane J., Stevenson R., 

RA Paterson J., Metzenberg A., Ionasescu V,, Temple K,, Kenwrick S,; 

RT "X-linked spastic paraplegia (SPGl), MASA syndrome and X-linked 

RT hydrocephalus result from mutations in the Ll gene."; 

RL Nat. Genet. 7:402-407(1994), 

RN [11] 

RP VARIANTS MASA GLN-210 AND ASN-598. 

RX MEDLINE-95004609; PubMed-7920660; 

RA Vits L., van Camp G., Coucke P., Fransen E., de Boulle K., 

RA Reyniers E,, Kbrn B,, Poustka A., Wilson G., Schrander-Stumpel C, 

RA Winter R,M,, Schwartz C, Willems P.J.; 

RT "MASA syndrome, is due to mutations in the neural cell adhesion gene 

RT L1CAM . " ; 

RL Nat. Genet. 7:408-413(1994). 

RN [12] 

RP VARIANTS HSAS/MASA S-9; S-121; K-309; F-768; L-941 AND C-1070. 

RX MEDLINE-95282776; PubMed=7762552; 

RA Jouet M., Monc'k A., Paterson J., McKeown C, Fryer A, , Carpenter N. , 

RA Holmberg E., Wadelius C, Kenwrick S.; 

RT "New domains of neural cell-adhesion molecule Ll implicated in 

RT X-linked hydrocephalus and MASA syndrome."; 

RL Am. J. Hum. Genet. 56:1304-1314(1995). 

RN [13] • • 

RP VARIANTS HSAS/MASA Q-184; Q-210; Y-264; R-452; N-598 AND L-1194. 

RX MEDLINE-96153146"; PubMed-8556302; 

RA Fransen E,, Lemmon V., van Camp G., Vits L., Coucke P., Willems P.J.; 

RT "CRASH syndrome: clinical spectrum of corpus callosum hypoplasia, 
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RT retardation, adducted thumbs, spastic paraparesis and hydrocephalus 

RT due to mutations in one single gene, LI."; 

RL Eur. J. Hum. Genet, 3:273-284(1995). 

RN [14] 

RP ERRATUM. 

RA Fransen E., Lemmon V., van Camp G., Vits L,, Coucke P,, Willems P.J.; 

RL Eur. J. Hum. Genet. 4:126-126(1996). 

RN [15] 

RP VARIANTS HSAS/HASA/SPG1 SER-179 AND ARG-370. 

RX MEDLINE-96057511; PubMed-7562969; 

RA Ruiz J.c, Cuppens H., Legius E., Fryns J. -P., Glover T., Marynen P., 

RA Cassiman J. -J.; 

RT "Mutations in Ll-CAM in two families with X linked complicated 

RT spastic paraplegia, MASA syndrome, and HSAS."; 

RL J, Med, Genet. 32:549*552(1995). 

RN [16] 

RP VARIANTS HSAS CYS'194 AND LEU-240. 

•MEDLINE*97083370; PubMed-8929944; 

. Gu S.-M., Orth u., Veske A., Enders H,, Kluender K., Schloesser N,, 
Engel W., Schwinger E., Gal A.; 

RT "Five novel mutations in the LlCAM gene in families with X linked 

RT hydrocephalus."; 

RL J. Med. Genet. 33:103-106(1996). 

RN [17] 

RP VARIANTS HSAS Q-184; V-439--T-443 DEL; 0784 AND L-936--L-948 DEL. 

RX MEDLINE=97338664; PubMed=9195224;

RA Macfarlane J.R., Du J.-S., Pepys M.E., Ramsden S., Donnai D.,

RA Charlton R., Garrett C., Tolmie J., Yates J.R.W., Berry C., Goudie D.,

RA Moncla A., Lunt P., Hodgson S., Jouet M., Kenwrick S.;

RT "Nine novel L1CAM mutations in families with X-linked

RT hydrocephalus.";

RL Hum. Mutat. 9:512-518(1997).

RN [18] 

RP VARIANTS HSAS/MASA ASP-691; ARG-698 AND PRO-935.

RX MEDLINE=98180721; PubMed=9521424;

RA Du Y.-Z., Srivastava A.K., Schwartz C.E.;

RT "Multiple exon screening using restriction endonuclease

RT fingerprinting (REF): detection of six novel mutations in the L1 cell

RT adhesion molecule (L1CAM) gene.";

RL Hum. Mutat. 11:222-230(1998).

RN [19] 

RP VARIANT CRASH PRO-632.

RX MEDLINE=98112489; PubMed=9452110;

RA Vits L., Chitayat D., van Camp G., Holden J.J.A., Fransen E.,

RA Willems P.J.;

RT "Evidence for somatic and germline mosaicism in CRASH syndrome.";
RL Hum. Mutat. Suppl. 1:S284-S287(1998).
RN [20]

RP VARIANTS HSAS/MASA THR-219; ARG-335; CYS-386; CYS-473 AND LEU-1224. 

RX MEDLINE=98415726; PubMed=9744477;

RA Saugier-Veber P., Martin C., le Meur N., Lyonnet S., Munnich A.,

RA David A., Henocq A., Heron D., Jonveaux P., Odent S., Manouvrier S.,

RA Moncla A., Morichon N., Philip N., Satge D., Tosi M., Frebourg T.;

RT "Identification of novel L1CAM mutations using fluorescence-assisted

RT mismatch analysis.";

RL Hum. Mutat. 12:259-266(1998).

CC -!- FUNCTION: CELL ADHESION MOLECULE WITH AN IMPORTANT ROLE IN THE
CC DEVELOPMENT OF THE NERVOUS SYSTEM. INVOLVED IN NEURON-NEURON
CC ADHESION, NEURITE FASCICULATION, OUTGROWTH OF NEURITES, ETC. BINDS
CC TO AXONIN ON NEURONS.

CC -!- SUBCELLULAR LOCATION: TYPE I MEMBRANE PROTEIN.

CC -!- ALTERNATIVE PRODUCTS: TWO ISOFORMS IN THE CYTOPLASMIC REGION ARE

CC PRODUCED BY DIFFERENTIAL SPLICING.

CC -!- DISEASE: DEFECTS IN L1CAM ARE THE CAUSE OF THREE X-LINKED

CC SYNDROMES. 1: HYDROCEPHALUS OWING TO STENOSIS OF THE AQUEDUCT OF 

CC SYLVIUS (HSAS) CHARACTERIZED BY MENTAL RETARDATION AND ENLARGED 

CC BRAIN VENTRICLES. 2: MASA SYNDROME WHICH IS CHARACTERIZED BY 

CC MENTAL RETARDATION, APHASIA, SHUFFLING GAIT, AND ADDUCTED THUMBS. 

CC HAS AN OVERLAPPING PROFILE OF CLINICAL SIGNS WITH HSAS, BUT WITH A 

CC MILDER PRESENTATION AND A LONGER LIFE EXPECTANCY. 3: SPASTIC 

CC PARAPLEGIA TYPE 1 (SPG1). COLLECTIVELY THESE SYNDROMES ARE ALSO 

CC KNOWN AS CRASH SYNDROME, AN ACRONYM WHICH STANDS FOR CORPUS 

CC CALLOSUM HYPOPLASIA, PSYCHOMOTOR RETARDATION, ADDUCTED THUMBS, 



CC SPASTIC PARAPARESIS, AND HYDROCEPHALUS. 

CC -!- DISEASE: DEFECTS IN L1CAM ARE THE CAUSE OF HIRSCHSPRUNG DISEASE
CC (HSCR).

CC -!- SIMILARITY: CONTAINS 6 IMMUNOGLOBULIN-LIKE C2-TYPE DOMAINS.

CC -!- SIMILARITY: CONTAINS 5 FIBRONECTIN TYPE III-LIKE DOMAINS.

CC -!- DATABASE: NAME=L1CAM; NOTE=L1CAM mutation Web Page;
CC WWW="http://hgins.uia.ac.be/dnalab/11".

CC - 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration

CC between the Swiss Institute of Bioinformatics and the EMBL outstation - 

CC the European Bioinformatics Institute. There are no restrictions on its 

CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 

CC or send an email to license@isb-sib.ch). 

CC -

DR EMBL; X59847; CAA42508.1; -.

DR EMBL; Z29373; CAA82564.1; -.

DR EMBL; M74387; AAA59476.1; -.

DR EMBL; X58775; CAA41576.1; -.

Query Match 8.5%; Score 633.5; DB 1; Length 1257;
Best Local Similarity 23.7%; Pred. No. 1.8e-24;
Matches 243; Conservative 142; Mismatches 395; Indels 245; Gaps


Qy 35 WLLLVLVASNGLPAVRGQYQSPRI IE HPTDLWKKNEPATLNCKVEGKPEPTI 87
Mil: i : :|: ::[ I III : :| |: Mil
Db 9 WPLLLCSPCLLIQIPEEYEGHHVMEPPVITEQSPRRLWFPTDDISLKCEASGKPEVQF 67

Qy 88 EWFKDGEPVSTNEKKSHRVQFKDGALFFYRTMQGKK- - -EQDGGEYWCVAKNRVGQAVS 143
Mil 1: 1 : 1 |: 1 :: 1 1 1 1 |::| |:l
Db 68 RWTRDGVHFKPKEELGVTVYQSPHSGSF--TITGNNSNFAQRFQGIYRCFASNKLGTAMS 125

Qy 144 RHASLQIAVLRDDFRVEPKDT RVAKGETALLECGPPKGIPEPTLIWIKDGVPLDDL 199
:l :: • 1 1 : 1 1 :||: :| 1 II : |:
Db 126 174

Qy 200 KAMSFGASSRVRIVDGGNLLISNV EPI 226
1 : if-: III :|| III
Db 175 KILHIKQDERVTMGQNGNLYFANVLTSDNHSDYICHAHFPGTRTIIQKEPIDLRVKATNS 234

Qy 227 226
Db 235 MIDRKPRLLFPTNSSSHLVALQGQPLVLECIAEGFPTPTIKWLRPSGPMPADRVTYQNHN 294

Qy 227 DEGNYKCIAQNLVGTRESSYAKLIVQVKPYFMKEPKDQVMLYGQTATFH 275
■j:| l:|:|:| :|: :| : h II:: :|: : hll
Db 295 KTLQLLKVGEEDDGEYRCLAENSLGSARHAY-YVTVEAAPYWLHKPQSHLYGPGETARLD 353

Qy 276 CSVGGDPPPKVLWKKEEGNIPVSRARILHDEK SLEISNITPTDEGTYVCEAHNN 329
MM hl'l: III : hi :| :||: |:| III 1
Db 354 CQVQGRPQPEVTWR- -INGIPVE- -ELAKDQKYRIQRGALILSNVQPSDTMVTQCEARNR 409

Qy 330 VGQISARASL-IVHAPPNFTKRPSNKKVGLNG-WQLPCMASGNPPPSVFWTKEGVSTLM 387
MM:: 1 : : : | I I 1 1 1 III 1 | :|::
Db 410 HGLLLANAYIYVVQLPAKILTADNQTYMAVQGSTAYLLCKAFGAPVPSVQWLDBDGTTVL 469

Qy 388 FPNSSHGRQYVAADGTLQITDVRQEDEGYYVCSAFSWDSSTVRVFLQVSSVDERPPPII 447
1 :'l:lll ||:: | | | 1 M :M: M : 1
Db 470 - - - -QDERFFPYANGTLGIRDLQANDTGRYFCLAANDQNNVTIMANLKVKDATQ I 520

Qy 448 QIGPANQTLPKGSVATLPCRATGNPS- -PRIKWFHDGHAVQA- - - GNRYS I IQG SSLRVD 502
II : III 1 M :|| 1 II II : ::| |: | :
Db 521 TQGPRSTIEKSGSRVTFTCQASFDPSLOPSITWRGDGRDLQELGDSDKY-FIEDGRLVIH 579

Qy 503 DLQLSDSGTYTCTASGERGET-SWAATLTVEKPGSTSLHRAADPSTYPAPPGTPKVLN-V 560
1 II 1 Mil 1 1 1 1 1 II II 1: :
Db 580 SLDYSDQGNY5CVASTELDWESRAQLLWGSPG PVPRLVLSDLHLL 626

Qy 561 SRTSISLRWAKSQEKPGAVGPIIGYTVEYFSPDL-QTGWIVAAHRVGDTQVTISGLTPGT 619
::: : : M<::: || | :|: :: | |: | |:|
Db 627 TQSQVRVSWSPJVEDHN- - -APIEKYDIEFEDKEMAPEKWYSLGKVPGNQTSTTLKLSPYV 683
Qy 620 SYVFLVRAENTQGISVPSGLSNVIKTIEADFDAASANDL SAARTLLTGKSVEL 672

I I I I I I II :| : I II I I : ::| I : 

Db 684 HYTFRVTAINKYGPGEPSPVSETWTPEA— APEKNPVDVKGEGNETTNMVITWKPLRW 740 

Qy 673 IDASAINASAVRLBWMLHVSADEKYYEGLRIHYKDASVPSAQYHSITVMDASAESPWGN 732 

:| :| l::l :l I I I II I 

Db 741 MDWNAPQVQ-YRVQWR PQGTRGPWQEQIV — ' SDPFLWSN 777 

Qy 733 LKKYTKYEFFLTPFFETIEGQPSNSKTALTY — EDVPSAPPDNIQIGMYNQTAGWVRW 788 

: II : : : I : :| 1 1 I I I : I : I : I I : I 
Db 778 TSTFVPYEIKV — QAVNSQGKGPEPQVTIGYSGEDYPQAIPELEGIEILNSSAVLVKW 833 

Qy 789 TPPPSQHHNGNLYGYKIEVSAGNTMKVLA NMTLNATTTSVLLNNLTTGAVYSV 841 

I |:| II : : : : :: : I |l||:|: I : I : 
Db 834 RPVDLAQVKGHLRGYNVTYWREGSQRKHSKRHIHKDHVVVPANTTSVILSGLRPYSSYHL 893 



Qy 842 RLNSFTKAGDGPYSKPISLFMDPTH-HVHPPRAH 

: :| I II I: I I III I Ml I 

Db 894 EVQAFNGRGSGPASE--FTFSTPEGVPGHPEALHLECQSNTSLLLRWQPPLSHNGVLTGY 951 

Qy 887 DLTYH 891

m

Db 952 VLSYH 956



RESULT 12 
CAML_RAT

ID CAML_RAT STANDARD; PRT; 1259 AA.

AC Q05695;

DT 01-FEB-1994 (Rel. 28, Created)

DT 01-OCT-1994 (Rel. 30, Last sequence update)

DT 01-OCT-2000 (Rel. 40, Last annotation update)

DE NEURAL CELL ADHESION MOLECULE L1 PRECURSOR (N-CAM L1).

GN L1CAM OR CAML1. 

OS Rattus norvegicus (Rat). 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Rattus. 

RN [1] 

RP SEQUENCE FROM N.A.

RX MEDLINE=91372414; PubMed=1894011;

RA Miura M., Kobayashi M., Asou H., Uyemura K.;

RT "Molecular cloning of cDNA encoding the rat neural cell adhesion

RT molecule L1. Two L1 isoforms in the cytoplasmic region are produced

RT by differential splicing."; 

RL FEBS Lett. 289:91-95(1991). 

CC -!- FUNCTION: CELL ADHESION MOLECULE WITH AN IMPORTANT ROLE IN THE 
CC DEVELOPMENT OF THE NERVOUS SYSTEM. INVOLVED IN NEURON-NEURON 
CC ADHESION, NEURITE FASCICULATION, OUTGROWTH OF NEURITES, ETC. BINDS 
CC TO AXONIN ON NEURONS. 

CC -!- SUBCELLULAR LOCATION: TYPE I MEMBRANE PROTEIN.
CC -!- ALTERNATIVE PRODUCTS: TWO ISOFORMS IN THE CYTOPLASMIC REGION ARE
CC PRODUCED BY DIFFERENTIAL SPLICING.

CC -!- TISSUE SPECIFICITY: THE SHORTER ISOFORM IS PREDOMINANTLY FOUND IN 
CC THE BRAIN, WHILE THE LONGER ISOFORM IS FOUND IN THE PERIPHERAL 
CC NERVOUS SYSTEM. 

CC -!- SIMILARITY: CONTAINS 6 IMMUNOGLOBULIN-LIKE C2-TYPE DOMAINS. 

CC -!- SIMILARITY: CONTAINS 5 FIBRONECTIN TYPE III-LIKE DOMAINS. 

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its 

CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 

CC or send an email to license@isb-sib.ch).

CC

DR EMBL; X59149; CAA41860.1; -. 

DR PIR; S17655; S17655. 

DR HSSP; P20241; 1CFB. 

DR INTERPRO; IPR001777; -. 

DR INTERPRO; IPR003006; -. 

DR PFAM; PF00041; fn3; 4. 

DR PFAM; PF00047; ig; 6. 



DR PRINTS; PR00014; FNTYPEIII.

KW Cell adhesion; Glycoprotein; Transmembrane; Repeat; Brain;
KW Immunoglobulin domain; Signal; Alternative splicing.

FT SIGNAL 1 19 BY SIMILARITY.
FT CHAIN 20 1259 NEURAL CELL ADHESION MOLECULE L1.
FT DOMAIN 20 1122 EXTRACELLULAR (POTENTIAL).
FT TRANSMEM 1123 1145 POTENTIAL.
FT DOMAIN 1146 1259 CYTOPLASMIC (POTENTIAL).
FT DOMAIN 50 120 IG-LIKE C2-TYPE DOMAIN.
FT DOMAIN 150 215 IG-LIKE C2-TYPE DOMAIN.
FT DOMAIN 256 318 IG-LIKE C2-TYPE DOMAIN.
FT DOMAIN 346 410 IG-LIKE C2-TYPE DOMAIN.
FT DOMAIN 440 503 IG-LIKE C2-TYPE DOMAIN.
FT DOMAIN 531 599 IG-LIKE C2-TYPE DOMAIN.
FT DOMAIN 827 896 FIBRONECTIN TYPE-III.
FT DOMAIN 932 994 FIBRONECTIN TYPE-III.
FT DOMAIN 1032 1093 FIBRONECTIN TYPE-III.
FT SITE 553 555 CELL ATTACHMENT SITE (POTENTIAL).
FT SITE 562 564 CELL ATTACHMENT SITE (POTENTIAL).
FT CARBOHYD 100 100 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 202 202 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 246 246 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 293 293 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 432 432 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 489 489 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 504 504 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 670 670 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 725 725 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 776 776 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 824 824 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 846 848 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 875 875 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 968 968 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 978 978 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 1021 1021 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 1029 1029 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 1072 1072 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 1106 1106 N-LINKED (GLCNAC...) (POTENTIAL).
FT VARSPLIC 1179 1182 MISSING (IN SHORT ISOFORM).

SQ SEQUENCE 1259 AA; 140934 MW; 19681B022D8F24AB CRC64;



Query Match 8.4%; Score 627; DB 1; Length 1259; 

Best Local Similarity 23.0%; Pred. No. 3.9e-24; 

Matches 248; Conservative 147; Mismatches 394; Indels 288; Gaps 42; 
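
The summary counts above appear to determine the reported Best Local Similarity: dividing Matches by the total number of alignment columns (Matches + Conservative + Mismatches + Indels) reproduces the printed percentage for this hit. The sketch below works through that arithmetic. The formula is an inference from the numbers in this report, not a documented definition, and the function name `best_local_similarity` is hypothetical.

```python
def best_local_similarity(matches, conservative, mismatches, indels):
    """Percent of alignment columns that are exact matches.

    Assumed definition, inferred from the summary lines in this report:
    every aligned column is counted once as a match, a conservative
    substitution, a mismatch, or an indel.
    """
    columns = matches + conservative + mismatches + indels
    return 100.0 * matches / columns


# Counts copied from the RESULT 12 summary line.
pct = best_local_similarity(matches=248, conservative=147,
                            mismatches=394, indels=288)
print(f"{pct:.1f}%")  # prints 23.0%
```

The same ratio applied to the RESULT 14 counts (237, 125, 407, 204) gives 24.4%, which also agrees with the value printed for that hit.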

Qy 35 WLLLVLVASNG - -LPAVRGQYQSPRIIE HPTDLWKKNEPATLNCKVEGKPEP 85 

I :| |: :,' I : :|: ::| I III : :l 1= hh 

Db 6 WYVLPLLLCSPCLLIQIPDEYKGHHVLEPPVITEQSPRRLWFPTDDISLKCEARGRPQV 65 

Qy 86 TIEWFKDGEPVSTNEKKS--HRVQFKDGALFFYRTMQGKK— EQDGGEYWCVAKNRVG 139 

Mil l: I : I : |::| :: I I I I I :| 
Db 66 EFRWTKDGIHFKPKEELGVWHEAPY-SGSF TIEGNNSFAQRFQGIYRCYASNNLG 120 

Qy 140 QAVSRHASL QIAVLRD 155 

hi M :l 
Db 121 TAMSHEIQLVAEGAPKWPKETVKPVEVEEGESWLPCNPPPSAAPLRIYWMNSKILHIKQ 180 

Qy 156 DFRV : EPKDTRV 166 

Ml IN II 

Db 181 DERVSMGQNGDLYFANVLTSDNHSDYICNAHFPGTRTIIQKEPIDLRVKPTNSMIDRKPR 240 

Qy 167 AKGETALLECGPPKGIPEPTLIWI--KDGVPLDDLKAMSFGASSRVR 211 

. :|:: Ml :| I ||: |: MM I 
Db 241 LLFPTNSSSHLVALQGQSLILEC - IAEGFPTPT IKWLHPSDPMPTD R 286 

Qy 212 IV— DGGNLLISNVEPIDEGNYKCIAQNLVGTRESSYAKLIVQVKPYFMKEPKDQVMLY 268 

:: Mil hi I hhl :h :l : h l|::::|: : 
Db 287 VI YQNHNKTLQLLNVGEEDDGEYTCLAENSLGSARHAY ■ YVT VEAAPYWLQKPQSHLYGP 345 

Qy 269 GQTATFHCSVGGDPPPKVLWKKEEGNIPVSRARILHDEK SLEISNITPTDEGTY 322 

1:11 I I II hi h :l :: hi II :lh hi 
Db 346 GETARLDCQVQGRPQPEVTWRIN — GMSIEKVNKDQKYRIEQGSLILSNVQPSDTMVT 401 
Qy 323 VCEAHNNVGQISARASL* IVHAPPNFTRRPSNRRVGLNG • WQLPCMASGNPPPSVFWTK 380

III I I : I I : :M : : : : | I I I I III 
Db 402 QCEARNQHGLLLANAYIYWQLPARILTKDNQTYMAVEGSTAYLLCRAFGAPVPSVQWLD 461 

Qy 381 EGVSTLMFPNSSHGRQYVAALX5TLQITDVRQEDEGYWCSAFSVVDSSTVRVFLQVSSVD 440 

I :|:: I : hi I I I:: Mill: :: |: III 
Db 462 EEGTTVL — QDERFFPYANGHLGIRDLQANDTGRYFCQAANDQNNVTILANLQVKEAT 517 

Qy 441 ERPPPIIQIGPANQTLPKGSVATLPCRATGNPS • -PRIKWFHDGHAVQA- * -GNRYSIIQ 495 

: I II : II: I |:|: :|| I I II :| ::| |: 
Db 518 Q ITQGPRSTIEKRGARVTFTCQASFDPSLQASITWRGDGRDWERGDSDKY-FIE 571 

Qy 496 GSSLRVDDLQLSDSGTYTCTASGERGET-SWAATLTVEKPGSTSLHRAADPSTYPAPPGT 554 

I : I II I 1:1 II I I I II I II :l 
Db 572 DGQLVIKSLDYSDQGDYSCVASTELDEVESRAQLLVVGSPGPVPHLELSDRHL 624 



Qy 555 PRVLNVSRTSISLRWARSQEKPGAVGPIIGYTVEYFSPDL-QTGWIVAAHRVGDTQVTIS 613

: :: : I I: ::: II I :|: :: I |: I
Db 625 LKQSQVHLSWS PAEDHN - - -SPIERYDIEFEDKEMAPERWFSLGRVPGNQTSTTL 676

Qy 614 GLT PGTS Y VFLVRAENTQG ISVPSGLSNVI KT I EADFDAASANDLSAA RTLLT 666

1:1 I I I III II :| : I II I I : ., ::|
Db 677 KLSPYVHYTFRVTAINKYGPGEPSPVSETWTPEA- ■ -APEKNPVDVRGEGNETNNMVIT 733



Qy 667 GKSVELIDASAINASAVRLEWMLHVSADEKYVEGLRIHYKDASVPSAQYHSITVMDASAE 726 

t I : :| :l |::| I : : I 

Db 734 WKPLRWMDWNAPQIQ • YRVQWR PLG KQETWKEQT VSDP 770 

Qy 727 SFWGNLKKYTKYEFFLTPFFETIEGQPSNSKTALTY— -EDVPSAPPDNIQIGMYNQT 782 

II I : II : : : I : :| II I I: I ::| : 
Db 771 FLWSNTSTFVPYEIRV — QAVNNQGKGPEPQVTIGYSGEDYPQVSPELEDITIFNSS 826 

Qy 783 AGWVRWTPPPSQHHNGNLYGYRI EVSAGNTMKVLANMTLNATTTSVLLNNL 833 

III I hi II : : I : | ::| ; III :|: I 

Db 827 TVLVRWRP VTMQVKGHLRGYNVT YWWKGSQRRHSKRHVHK ■ ■ SHMWPANTTSAILSGL 884 

Qy 834 TTGAVYSVRLNSFTKAGDGP YSKPISLFMDPTH HVHPPR 872 

: I I : :| I II :M : I I III 
Db 885 RPYSSYHVEVQAFNGRGLGPASEWTFSTPEGV- ■ -PGHPEALHLECQSDTSLLLHWQPPL 941 

Qy 873 AHPS G1H-DGRHEGQ DLTYHNNGNIPPG DINPTTHK 907 

:l III : I :l III: : III: 

Db 942 SHNGVITGYLLSYHPLDGESREQLFFNLSDPELRTHNLTNLNPDLQYRFQLQATTHQ 998 



RESULT 13
AXO1_CHICK

ID AXO1_CHICK STANDARD; PRT; 1036 AA.
AC P28685;
DT 01-DEC-1992 (Rel. 24, Created)
DT 01-DEC-1992 (Rel. 24, Last sequence update)
DT 15-JUL-1999 (Rel. 38, Last annotation update)
DE AXONIN-1 PRECURSOR.
OS Gallus gallus (Chicken).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Archosauria; Aves; Neognathae; Galliformes; Phasianidae; Phasianinae;
OC Gallus.
RN [1]
RP SEQUENCE FROM N.A., AND PARTIAL SEQUENCE.
RC TISSUE=BRAIN;
RX MEDLINE=92174898; PubMed=1311675;
RA Zuellig R.A., Rader C., Schroeder A., Kalousek M.B.,
RA von Bohlen und Halbach F., Osterwalder T., Inan C., Stoeckli E.T.,
RA Affolter H.-U., Fritz A., Hafen E., Sonderegger P.;
RT "The axonally secreted cell adhesion molecule, axonin-1. Primary
RT structure, immunoglobulin-like and fibronectin-type-III-like domains
RT and glycosyl-phosphatidylinositol anchorage.";
RL Eur. J. Biochem. 204:453-463(1992).
CC -!- FUNCTION: AXON-ASSOCIATED CELL ADHESION MOLECULE (AXCAM) WHICH
CC PROMOTES NEURITE OUTGROWTH BY INTERACTION WITH THE AXCAM L1 (G4)
CC OF NEURITIC MEMBRANE.
CC -!- SUBCELLULAR LOCATION: ATTACHED TO THE NEURONAL MEMBRANE BY A
CC GPI-ANCHOR.
CC -!- SIMILARITY: CONTAINS 6 IMMUNOGLOBULIN-LIKE C2-TYPE DOMAINS.
CC -!- SIMILARITY: CONTAINS 4 FIBRONECTIN TYPE III-LIKE DOMAINS.
CC
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC
DR EMBL; X63101; CAA44815.1; -.
DR PIR; S22128; S22128. 
DR PIR; S22383; S22383. 
DR HSSP; P56276; 1TLK.
DR INTERPRO; IPR001777; -. 
DR INTERPRO; IPR003006; -. 
DR PFAM; PF00041; fn3; 3. 
DR PFAM; PF00047; ig; 6. 

KW Immunoglobulin domain; Glycoprotein; Signal; GPI-anchor; 
KW Cell adhesion; Repeat.
FT SIGNAL 1 23 OR 25 (POTENTIAL).
FT CHAIN 24 1036 AXONIN-1.
FT PROPEP ? 1036 REMOVED IN MATURE FORM.
FT DOMAIN 49 113 IG-LIKE C2-TYPE DOMAIN.
FT DOMAIN 143 211 IG-LIKE C2-TYPE DOMAIN.
FT DOMAIN 249 308 IG-LIKE C2-TYPE DOMAIN.
FT DOMAIN 336 397 IG-LIKE C2-TYPE DOMAIN.
FT DOMAIN 428 490 IG-LIKE C2-TYPE DOMAIN.
FT DOMAIN 518 589 IG-LIKE C2-TYPE DOMAIN.
FT DOMAIN 599 608 HINGE (POTENTIAL).
FT DOMAIN 601 607 GLY/PRO-RICH.
FT DOMAIN 608 709 FIBRONECTIN TYPE-III.
FT DOMAIN 710 811 FIBRONECTIN TYPE-III.
FT DOMAIN 812 912 FIBRONECTIN TYPE-III.
FT DOMAIN 913 1009 FIBRONECTIN TYPE-III.
FT MOD_RES 724 724 BLOCKED.
FT CARBOHYD 71 71 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 199 199 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 456 456 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 472 472 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 493 493 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 520 520 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 770 770 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 900 900 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 914 914 N-LINKED (GLCNAC...) (POTENTIAL).

SQ SEQUENCE 1036 AA; 113301 MW; 08B80143BE779794 CRC64;

Query Match 8.4%; Score 626.5; DB 1; Length 1036;

Best Local Similarity 24.2%; Pred. No. 3.2e-24;

Matches 241; Conservative 130; Mismatches 390; Indels 233; Gaps 35;

Qy 71 EPATLNCKVEGKPEPTIEWFKDGEPVSTNEKKSHRVQFKDGALFFYRTMQGK K 123

I II h .: I II I : : I I II : I I

Db 50 EKVTLTCRARANPPATYRW RMNGTELRMGPDSRYRLVAGDLVISNPVR 97

Qy 124 EQDGGEYWCVAKNRVGQAVSRHASLQIAVLRDDFRVEPKD-TRVAKGETALLECGPPKGI 182

:| I I IN; I I III III: I: :| I :| :: :| : I II
Db 98 AKDAGSYQCVATNARGTWSREASLRFGFLQ-EFSAEERDPVKITEGWGVMFTCSPPPHY 156

Qy 183 PEPTLIWIKDGVPLDDLKAMSFGASSRVRIVD--GGNLLISNVEPIDEGNYKCIAQNLVG 240

I : h : I :| : I I 1 1 1 h I I 1 1 1 I I : :
Db 157 PALSYRWLLNEFP NFIPADGRRFVSQTTGNLYIAKTEASDLGNYSCFATSHID 209

Qy 241 --TRE-SSYAKLIV QVKPYF-MKEPKDQVMLYGQTATFHCSVGGDPPPKVLWK 289

^| h I I'::! : II I I I I 1 1 I I I : I h : h
Db 210 FITKSVFSKFSQLSLAAEDARQYAPSIKAKFPADTYALTGQMVTLECFAFGNPVPQIKWR 269

Qy 290 REEGNIPVSRARILHDERSLEISNITPTDEGTYVCEAHNNVGQISARASLIVHAPPNFTK 349

I :|: •:: I I I I h lllll III I h : : :|:|| h:
Db 270 RLDGS---QT3RWLSSEPLLHIQNVDFEDEGTYECEAENIKGRDTYQGRIIIHAQPDWLD 326

Qy 350 RPSNRRVGLNGWQLPCMASGNPPPSVFWTREG VSTLMFPNS- 391

:: : : :: hill I hi I "I I h :| 

Db 327 VITDTEADIGSDLRWSCVASGKPRPAVRWLRDGQPLASQNRIEVSGGELRFSKLVLEDSG 386 

Qy 392 SHGRQY 397 

II I 

Db 387 MYQCVAENKHGTVYASAELTVQALAPDFRLNPVKRLIPAMSGKVIIPCQPRMPKATVL 446 

Qy 398 VAADGTLQITDVRQEDEGYYVCSAFSWDSSTVRVFLQVSSVDER 442 

: Mill : :: : III I I I : : : I I 
Db 447 WTKGTELLTNSSRVTITADGTLILQNISKSDEGKYTCFAENFMGKANSTGILSV R 501 



Qy 443 PPPIIQ IGPANQTLPKGSVATLPCRATGNPSPRI - - KWFHD GHAVQAGNR 490

I : h: : I II I h :|: : II II :| :

Db 502 DATRITLAPSSADINVGENLTLQCHASHDPTMDLTFTWSLDDFPIDLDKSEGHYRRASVR 561

Qy 491 YSIIQGSSLRVDDLQLSDSGTYTCTASGERGETSWAATLTVERPGSTSLHRAADPSTYPA 550

I : : II II Nil I :||||| I I
Db 562 EAV- - -GDLAIVNAQLKHSGRYTCTAQTWDSTSESATLTVRGP PG 604

Qy 551 PPGTPKVLNVSRTSISLRWAKSQEKPGAVGPIIGYTVEYFSPDIiQTGWIVAAHRVGDTQV 610

III I ••: h: I h: : II |::| : I I :: I
Db 605 PPGGVWRDIGDTTVQLSWSRGFDNH— SPIARYSIEARTL-LSNKW KQMRTNPV 656

Qy 611 TISG LTPGTSYVFLVRAENTQGISVPSGLSNVIKTIEADFDAASANDLSAA 661

II I I I I I I I h II h hi II : : I

Db 657 NIEGNAETAQWNLIPWMDYEFRVLASNILGVGEPSLPSSKIRTKEA-APTVAPSGLGGG 715

Qy 662 RTLLTGRSVELIDASAINASAVRLEWM LHVSADEKYVEGLRIHYKDASVPSAQY 715

I III II : :: :| :| :| : I II |:

Db 716 — GGAPNELP^ • ■ INWTPTLRDYQNGDGFGYILSFRKRGTQG — WLTARVPHAE ■ 762

Qy 716 HSITVMDASAESFVVGN--LRRYTKYEFFLTPFFETIEGQPSNSRTALTY--EDVPSAPP 771

: :| I : II :| : : :|: I lh I h I I
Db 763 SLHYVYRNESIGPYTPFEVRIRAY-NRRGEGPESLTAIVYSAEEEPRVAP 811



Qy 772 DNIQIGMYNQTAGWVRWTPPPSQHHNGNLYGYRIEV-SAGNTMRVIANMTLNATTTSVLL 830
: : 1 1 1 II lh! h : : II :
Db 812 FRVTAKAVLSSEMDVSWEPVEQGDMTGVLLGYEIRYWKDGDKEEAADRVRTAGLVTSAHV 871

Qy 831 NNLTTGAVYSVRLNSFTRAGDGPYSRPISLFMDPTHHVHPPMPSGTHDGRHEGQDLTY 890
1 1 : :: :ll II :: II -II 1 :|
Db 872 TGLNPNTKYHVSVRAYNRAGAGPPSPSTNITT TKPPPRRPPGNISWTLTGSTVTI 926

Qy 891 HNNGNIPPGDINPTT HKRTIDYLS 914
: : 1 : 1 1 1 lh
Db 927 KWDPWAQADESAVTGYKMLYRQDSHSAPTLYLA 960


RESULT 14
NRCA_CHICK

ID NRCA_CHICK STANDARD; PRT; 1284 AA.
AC P35331;
DT 01-FEB-1994 (Rel. 28, Created)
DT 01-FEB-1994 (Rel. 28, Last sequence update)
DT 15-JUL-1999 (Rel. 38, Last annotation update)
DE NG-CAM RELATED CELL ADHESION MOLECULE PRECURSOR (NR-CAM) (BRAVO).
OS Gallus gallus (Chicken).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Archosauria; Aves; Neognathae; Galliformes; Phasianidae; Phasianinae;
OC Gallus.
RN [1]
RP SEQUENCE FROM N.A., AND SEQUENCE OF 25-52; 178-184 AND 581-594.
RC STRAIN=WHITE LEGHORN; TISSUE=EMBRYONIC BRAIN;
RX MEDLINE=91258407; PubMed=2045418;
RA Grumet M., Mauro V., Burgoon M.P., Edelman G.M., Cunningham B.A.;
RT "Structure of a new nervous system glycoprotein, Nr-CAM, and its
RT relationship to subgroups of neural cell adhesion molecules.";
RL J. Cell Biol. 113:1399-1412(1991).
RN [2]
RP SEQUENCE OF 25-1284 FROM N.A., AND PARTIAL SEQUENCE.
RC TISSUE=EMBRYONIC BRAIN, AND RETINA;
RX MEDLINE=92381110; PubMed=1512296;
RA Kayyem J.F., Roman J.M., de la Rosa E.J., Schwarz U., Dreyer W.J.;
RT "Bravo/Nr-CAM is closely related to the cell adhesion molecules L1
RT and Ng-CAM and has a similar heterodimer structure.";
RL J. Cell Biol. 118:1259-1270(1992).
CC -!- FUNCTION: THIS PROTEIN IS A CELL ADHESION MOLECULE INVOLVED IN
CC NEURON-NEURON ADHESION, NEURITE FASCICULATION, OUTGROWTH OF
CC NEURITES, ETC. SPECIFICALLY INVOLVED IN THE DEVELOPMENT OF OPTIC
CC FIBRES IN THE RETINA.
CC -!- SUBUNIT: HETERODIMER, COMPOSED OF AN ALPHA AND A BETA CHAIN.
CC -!- SUBCELLULAR LOCATION: TYPE I MEMBRANE PROTEIN.
CC -!- ALTERNATIVE PRODUCTS: AT LEAST 5 ISOFORMS ARE PRODUCED BY
CC ALTERNATIVE SPLICING.
CC -!- TISSUE SPECIFICITY: RETINA AND DEVELOPING BRAIN.
CC -!- DEVELOPMENTAL STAGE: EXPRESSED IN DEVELOPING NEURAL RETINA AND
CC EMBRYONIC BRAIN TISSUE.
CC -!- SIMILARITY: CONTAINS 6 IMMUNOGLOBULIN-LIKE C2-TYPE DOMAINS.
CC -!- SIMILARITY: CONTAINS 5 FIBRONECTIN TYPE III-LIKE DOMAINS.
CC
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC
DR EMBL; X58482; CAA41391.1; -.
DR EMBL; L08960; AAA48632.1; -.
DR HSSP; P20241; 1CFB.
DR INTERPRO; IPR001777; -.
DR INTERPRO; IPR003006; -.
DR PFAM; PF00041; fn3; 5.
DR PFAM; PF00047; ig; 6.
DR PRINTS; PR00014; FNTYPEIII.
KW Immunoglobulin domain; Glycoprotein; Signal; Cell adhesion; Repeat;
KW Transmembrane; Alternative splicing.
FT SIGNAL 1 24
FT CHAIN 25 1284 NG-CAM RELATED CELL ADHESION MOLECULE.
FT DOMAIN 25 1143 EXTRACELLULAR (POTENTIAL).
FT TRANSMEM 1144 1166 POTENTIAL.
FT DOMAIN 1167 1284 CYTOPLASMIC (POTENTIAL).
FT DOMAIN 56 125 IG-LIKE C2-TYPE DOMAIN.
FT DOMAIN 155 220 IG-LIKE C2-TYPE DOMAIN.
FT DOMAIN 261 323 IG-LIKE C2-TYPE DOMAIN.
FT DOMAIN 351 415 IG-LIKE C2-TYPE DOMAIN.
FT DOMAIN 445 508 IG-LIKE C2-TYPE DOMAIN.
FT DOMAIN 536 599 IG-LIKE C2-TYPE DOMAIN.
FT DOMAIN 638 699 FIBRONECTIN TYPE-III.
FT DOMAIN 738 799 FIBRONECTIN TYPE-III.
FT DOMAIN 837 906 FIBRONECTIN TYPE-III.
FT DOMAIN 943 1006 FIBRONECTIN TYPE-III.
FT DOMAIN 1057 1114 FIBRONECTIN TYPE-III.
FT DISULFID 63 118 POTENTIAL.
FT DISULFID 162 213 POTENTIAL.
FT DISULFID 268 316 POTENTIAL.
FT DISULFID 358 408 POTENTIAL.
FT DISULFID 452 501 POTENTIAL.
FT DISULFID 543 592 POTENTIAL.
FT CARBOHYD 78 78 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 218 218 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 290 290 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 409 409 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 483 483 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 576 576 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 581 581 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 595 595 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 692 692 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 778 778 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 834 834 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 885 885 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 969 969 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 985 985 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 995 995 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 1048 1048 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 1059 1059 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 1091 1091 N-LINKED (GLCNAC...) (POTENTIAL).
FT VARSPLIC 612 621 MISSING (IN ISOFORM AS10).
FT VARSPLIC 1027 1038 MISSING (IN ISOFORM AS12).
FT VARSPLIC 1039 1131 MISSING (IN ISOFORM AS93).
FT VARSPLIC 1202 1205 MISSING (IN ISOFORM AS-CYT2).
FT CONFLICT 209 209 V -> E (IN REF. 2).
FT CONFLICT 680 680 H -> 0 (IN REF. 2).
SQ SEQUENCE 1284 AA; 141851 MW; A3570BF9C3D47A0F CRC64;



Query Match 8.4%; Score 624; DB 1; Length 1284;

Best Local Similarity 24.4%; Pred. No. 5.7e-24;

Matches 237; Conservative 125; Mismatches 407; Indels 204; Gaps 36;



Qy 54 QSPRIIEH-PTDLWKKNEPATLNCRVEGKPEPTIEWFKDGEPVSTNEKKSHRVQFKDGA 112

II I : I I :| I ; I: :||| |: I ::| ::
Db 39 QPPTITQQSPKDYIVDPRENIVIQCEAKGKPPPSFSWTRNGTHFDIDKDAQVTMKPNSGT 98



Qy 113 LFFYRTMQG-KKEQDGGEYWCVAKNRVGQAVSRHASLQ-IAVLRDDFRVEPKDTRVAKG 169 

I I I I I I I I |;| I |:| : :: : I ::|| | :| 
Db 99 L-WNIMNGVKAEAYEGVYQCTARNERGAAISNNIVIRPSRSPLWTKEKLEPNHVR-EG 155 

Qy 170 ETALLECGPPKGIPEPTLIWIKDGVPLDDLKAMSFGASSRVRIVDGGNLLISNVEPID-E 228 

:: :| I II |:| I : |: I I II hi Ml: I 

Db 156 DSLVLNCRPPVGLPPPIIFWM DNAFQRLPQSERVSQGLNGDLYFSNVQPEDTR 208 

Qy 229 GNYKCIAQ-NLVGT--RESSYAKLIVQVKPYFMKEP KDQVMLYGQTATFHC 276 

M I I: I I " : :' II : I ::| I I I 

Db 209 VDYICYARFNHTQTIQQKQPISVKVFSTKPVTERPPVLLTPMGSTSNKVELRGNVLLLEC 268 

Qy 277 SVGGDPPPKVLWKKEEGNIPVSRARILHDEKSLEISNITPTDEGTYVCEAHNNVGQISAR 336 

II I : I II I :| :| : :|:hl ::: MINIM 
Db 269 IAAGLPTPVIRWIKEGGELPANRTFFENFKKTLKIIDVSEADSGNYKCTARNTLGSTHHV 328 

Qy 337 ASLIVHAPPNFTKRPSNKKVGLNGWQLPCMASGNPPPSVFWTKEGVSTLMFPNSSHGRQ 396 

I: I I I : II: II MM M I II M : 

Db 329 ISVTVKAAPYWITAPRNLVLSPGEDGTLICRANGNPKPSISWLTNGVPIAIAPEDPSRK- 387 

Qy 397 YVAADG-TLQITDVRQEDEGYYVCSAFSWDSSTVRVFLQVSSVDERPPPIIQIGPANQ- 454 

II I: : h: I M : Ml : II h III: 

Db 388 — VDGDTIIFSAVQERSSAVYQCNASNEYGYLLANAFVNVLA---EPPRILT--PANRL 439 



Qy 455 -TLPKGSVATLPCRATGNPSPRIKWFHD-GHAVQAGNRYSIIQGSSLRVDDLQLSDSGTY 512

: I I : I hi I Ml :: II I :M I Mil
Db 440 YQVIADSPALIDCAYFGSPKPEIEWFRGVKGSILRGNEYVFHDNGTLEIPVAQRDSTGTY 499



Qy 513 TCTASGERGETSWAA TLTVEKPGSTSLHRAA DPSTYP 549 

II I : IM I: ::M : IM ||: I 

Db 500 TCVARNKLGKTQNEVQLEVKDPTMIIKQPQYKVIQRSAQASFECVIKHDPTLIPTVIWLR 559 

Qy 550 APPGTPKV- 557 

MIM 

Db 560 DNNELPDDERFLVGKDNLTIMNVTDKDDGTYTCIVNTTLDSVSASAVLTWAAPPTPAII 619 

Qy 558 LNVSRT SISLRWAKSQEKPGAVGPIIGYTVEYFSPDLQTG-WIVAAHR 604 

I:: I III I M II : Ml Ml 
Db 620 YARPNPPLDLELTGQLERSIELSWVPGEENN— SPITNFVIEYEDGLHEPGVWHYQTEV 676 

Qy 605 VGDTQVTISGLTPGTSYVFLVRAENTQGISVPSGLSNVIKTIEADFDAASANDLSAARTL 664 

I IM M II I I I I II I MM : 
Db 677 PGSHTTVQLKLSPYVNYSFRVIAVNEIGRSQPSEPSEQYLTKSANPDENPSN 728 



Qy 665 LTGKSVELIDASAINASAVRLEWMLHVSADEKYVEGLRIHYKDASVPSAQYH-SITVMDA 723

I: M I : M : :M : =M II I I
Db 729 VQGIGSBPDN— LVITW ESLKGFQ SNGPGLQYKVSWRQKDV 767

Qy 724 SAE--SFWGNLKKYTKYEFFLTPFFETIE GQPSNSKTALTY - - EDVPSAP 770

I I III: II II I I I : : MM

Db 768 DDEWTSVWANVSKYIVSG—TPTFVPYEIKVQALNDLGYAPEPSEVIGHSGEDLPMVA 824



Qy 771 PDNIQIGMYNQTAGWVRWTPPPSQHHNGNLYGYKI EVSAGNTMKVLAN-MTLNA 823 

I IM: Ml MIM |:| |||: :| : | :| 

Db 825 PGNVQVHVINSTLAKVHWDPVPLKSVRGHLQGYKVYYWKVQSLSRRSKRHVEKKILTFRG 884 

Qy 824 TTTSVLLNNLTTGAVYSVRLNSFTKAGDGPYSKPISLFMDPTHHVHPPR 872 

I -M I. : M : IMI II M I II 
Db 885 NKTFGMLPGLEPYSSYKLNVRWNGKGEGPAS-PDKVFKTPEGVPSPPSFLKITNPTLDS 943 



Qy 873 AHPSG 877

IIM

Db 944 LTLEWGSPTHPNG 956



RESULT 15
CONT_HUMAN
ID CONT_HUMAN STANDARD; PRT; 1018 AA.
AC Q12860; Q12861; Q14030;
DT 01-NOV-1997 (Rel. 35, Created)
DT 01-NOV-1997 (Rel. 35, Last sequence update)
DT 01-OCT-2000 (Rel. 40, Last annotation update)
DE CONTACTIN PRECURSOR (GLYCOPROTEIN GP135).
GN CNTN1.
OS Homo sapiens (Human).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
RN [1]
RP SEQUENCE FROM N.A., AND PARTIAL SEQUENCE.
RC TISSUE=BRAIN;
RX MEDLINE=95048335; PubMed=7959734;
RA Berglund E.O., Ranscht B.;
RT "Molecular cloning and in situ localization of the human contactin
RT gene (CNTN1) on chromosome 12q11-q12.";
RL Genomics 21:571-582(1994).
RN [2]
RP SEQUENCE FROM N.A., AND CHARACTERIZATION.
RX MEDLINE=94217459; PubMed=8164510;
RA Reid R.A., Hemperly J.J.;
RT "Identification and characterization of the human cell adhesion
RT molecule contactin.";
RL Brain Res. Mol. Brain Res. 21:1-8(1994).
CC -!- FUNCTION: MEDIATES CELL SURFACE INTERACTIONS DURING NERVOUS
CC SYSTEM DEVELOPMENT.
CC -!- SUBCELLULAR LOCATION: ATTACHED TO THE MEMBRANE BY A GPI-ANCHOR.
CC -!- ALTERNATIVE PRODUCTS: 2 ISOFORMS; 1 (SHOWN HERE) AND 2; ARE
CC PRODUCED BY ALTERNATIVE SPLICING.
CC -!- SIMILARITY: CONTAINS 6 IMMUNOGLOBULIN-LIKE C2-TYPE DOMAINS.
CC -!- SIMILARITY: CONTAINS 4 FIBRONECTIN TYPE III-LIKE DOMAINS.
CC
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC
DR EMBL; U07819; AAA67920.1; -.
DR EMBL; U07820; AAA67921.1; -.
DR EMBL; Z21488; CAA79696.1; -.
DR HSSP; P40189; 1BQU.
DR MIM; 600016; -.
DR INTERPRO; IPR001777; -.
DR INTERPRO; IPR003006; -.
DR PFAM; PF00041; fn3; 4.
DR PFAM; PF00047; ig; 6.
KW Immunoglobulin domain; Glycoprotein; Signal; GPI-anchor;
KW Cell adhesion; Repeat; Alternative splicing.



SIGNAL 


r. 


20 




CHAIN 


21* 


? 


CONTACTIN. 


PROPEP 


?, 


1018 


REMOVED IN MATURE FORM. 


DOMAIN 


58 


121 


IG-LIKE C2-TYPE DOMAIN. 


DOMAIN 


151 


218 


IG-LIKE C2-TYPE DOMAIN. 


DOMAIN 


256 ■ 


317 


IG-LIKE C2-TYPE DOMAIN. 
FT DOMAIN

FT DOMAIN 

FT DOMAIN 

FT DOMAIN 

FT DOMAIN 

FT DOMAIN 

FT DOMAIN 

FT DOMAIN 

FT DISULFID 

FT DISULFID 

FT DISULFID 

FT DISULFID 

FT DISULFID 

FT DISULFID 

FT CARBOHYD 

FT CARBOHYD 

FT CARBOHYD 

FT CARBOHYD 

FT CARBOHYD 

FT CARBOHYD 

FT CARBOHYD 

•CARBOHYD 
CARBOHYD 
VARSPLIC 

FT CONFLICT 



Qy 



398 
491 
590 
609 
710 
812 
908 



909 1004 
65 114 



158 
263 
352 
436 
526 



211 
310 
391 
484 
583 
208 
258 
338 
457 
473 
494 
521 
591 
933 
31 
798 

1018 AA; 113320 



258 
338 
457 
473 
494 
521 
591 
933 
21 



IG-LIKE C2-TYPE DOMAIN, 
IG-LIKE C2-TYPE DOMAIN. 
IG-LIKE C2-TYPE DOMAIN. 
GLY/PRO-RICH. 
FIBRONECTIN TYPE-III. 
FIBRONECTIN TYPE-III. 
FIBRONECTIN TYPE-III. 
FIBRONECTIN TYPE-III. 
BY SIMILARITY. 
BY SIMILARITY. 
BY SIMILARITY, 
BY SIMILARITY. 
BY SIMILARITY. 
BY SIMILARITY. 
N-LINKED (GLCNAC...)
N-LINKED (GLCNAC...)
N-LINKED (GLCNAC...)
N-LINKED (GLCNAC...)
N-LINKED (GLCNAC...)
N-LINKED (GLCNAC...)
N-LINKED (GLCNAC...)
N-LINKED (GLCNAC...)
N-LINKED (GLCNAC...)
MISSING (IN ISOFORM 2) 
V -> L (IN REF. 2). 
MW; 4B8FDC5BFD434ED5 CRC64 ; 



(POTENTIAL) . 
(POTENTIAL) . 
(POTENTIAL) . 
(POTENTIAL) . 
(POTENTIAL) . 
(POTENTIAL) . 
(POTENTIAL) , 
(POTENTIAL) . 
(POTENTIAL) , 



Query Match 8.3%; Score 614.5; DB 1; Length 1018; 

Best Local Similarity 22.8%; Pred. No. 1.3e-23; 

Matches 234; Conservative 143; Mismatches 404; Indels 245; Gaps 35; 

28 RMWLLPAWLLLVLVAS NGLPAVRGQYQSPRIIEHPTDLWKKNE" • -P 72 

:IHI : I::: : : :|: : : I I I : : : 

2 KMWLLVSHLVIISITTCLAEFTWYRRYGHGV-SEEDKGFGPIFEEQPINTIYPEESLEGK 60 

73 ATLNCKVEGKPEPTIEWFKDGEPVSTNEKKSHRVQFKDGALFFYRTMQGKKEQDGGEYWC 132 

:MI: I :| : I : I I I |::| I |:| 

61 VSLNCRARASPFPVYKWRMNNGDV- - -DLTSDRYSMVGGNLVINNP- - -DKQKDAGIYYC 114 



Qy 133 VAKNRVGQAVSRHASLQIAVLRDDFRVEPK-DTRVAKGETALLECGPPKGIPEP-TLIWI 190

:| I I I hi I ! I I : : II :h =1 I I! I: : |: 
Db 115 LASNNYGMVRSTEATLSFGYL-DPFPPEERPEVRVKEGKGMVLLCDPPYHFPDDLSYRWL 173 

Qy 191 KDGVPLDDLKAMSFGASSRVRIVD--GGNLLISNVEPIDEGNYKCIAQNLVGTRESSYAK 248 

: I: I : II III hill hill I : h I ::| 

Db 174 LNEFPV FITMDKRRFVSQTNGNLYIANVEASDKGNYSCFVSSPSITK-SVFSK 225 

Qy 249 LIVQV KPY---FMKEPKDQVMLYGQTATFHCSVGGDPPPKVLWKKEEGNIPVS 298 

»l : III : : II I II I I hi I : hi :| I 
226 FIPLIPIPERTTKPYPADIWQFKDVYALMGQNVTLECFALGNPVPDIRWRKVLEPMP-S 284 

Qy 299 RARILHDEKSLEISNITPTDEGTYVCEAHNNVGQISARASLIVHAPPNFTKRPSNKKVGL 358 

I I hi II III I III I h :| : II I : : :: :l : 
Db 285 TAEISTSGAVLKIFNIQLEDEGIYECEAENIRGKDKHQARIYVQAFPEWVEHINDTEVDI 344 

Qy 359 NGWQLPCMASGNPPPSVFWTKEGVSTLMFPNSSHGRQYVAADGTLQITDVRQEDEGYYV 418 

: 1 1 : 1 : 1 I h= I! I I I h: II h I I 

Db 345 GSDLYWPCVATGKPIPTIRWLKNG YAYHKGELRLYDVTFENAGMYQ 390 

Qy 419 CSAFSWDSSTVRVFLQVSSVDERPPPIIQIGPANQTL-PKGSVATLPCRATGNPSPRI 476 

II:: h: :: I :: I : : || : h I h 
Db 391 CIAENTYGAIYANAELKILAL — APTFEMNPMKKKILAAKGGRVIIECKPKAAPKPKF 446 

Qy 477 KWFHDGHAVQAGNRYSIIQGSSLRVDDLQLSDSGTYTCTASGERGETSWAATLTVEKP- 534 

I : :| I : || :::: :| I III I ||: i II : I 
Db 447 SWSKGTEWLVNSSRILIWEDGSLEINNITRNDGGIYTCFAENNRGKANSTGTLVITDPTR 506 

Qy 535 GSTSLHPAA— DPS— TY 548 

I : : I lh h 
Db 507 IILAPINADITVGENATMQCAASFDPALDLTFVWSFNGYVIDFNKENIHYQRNFMLDSNG 566 



I III :: :: ll::| |:: 
Db 567 ELLIRNAQLKHAGRYTCTAQTIVDNSSASADLWRGPPGPPGGLRIEDIRATSVALTWSR 626 

Qy 572 SQEKPGAVGPIIGYTVE YFSPDLQTGWIVAAHRVGDTQVTISGLTPGT 619 

: I: lh: I : I : II I I I 

Db 627 GSDNH---SPISKYTIQTKTILSDDWKDAKTDPPIIEGNMEAARAV DLIPWM 675 

Qy 620 SYVFLVRAENTQGISVPSGLSNVIKTIEADFDAASAN DLSAARTLLTGKSVELIDA 675 

I I I I II I II II III 11:1 h h: II 

Db 676 EYEFRWATNTLGRGEPSIPSNRIKT DGAAPNVAPSDVGGG GGRNREL- - - 723 

Qy 676 SAINASAVRLEWMLHVSADEKYVEGLRIHYKDASVP--SAQYHSITVMDASAESFWGN- 732 

:'l :| I -I I I :: :|| : :| : 

Db 724 TITW--APLSREYHYGNNFGYIVAFKPFDGEEWKKVTVTNPDTGRYVHKDE 772 

Qy 733 -LKKYTKYEFFLTPFFETIEGQPSNSKTALTYEDVPSAPPDNIQIGMYNQTAGWVRWTPP 791 

: I :: ': I =11 : : I 1 1 I : : : : : I I 
Db 773 TMSPSTAFQVKVKAFNNKGDGPYSLVAVINSAQDAPSEAPTEVGVKVLSSSEISVHW- - ■ 829 

Qy 792 PSQHHNGNLY-GYKIEVSAGNTMKV1AN-MTLNATTTSVLLNNLTTGAVYSVRLNSFTKA 849 

:| : ; |:| I : : || : : : I I II I : : : I 
Db 830 "EHVLEKIVESYQIRYWAAHDKEEAANRVQVTSQEYSARLENLLPDTQYFIEVGACNSA 887 

Qy 850 GDGPYSKPISLFMDPTHHVHPPRA HPS 876 

I II I I III I 

Db 888 GCGPPSDMIEAFTKKAPPSQPPRIISSVRSGSRYIITWDHWALSNESTVTGYKVLYRPD 947 

Qy 877 GTHDGR 882 ' 

I III: 
Db 948 GQHDGK 953 



Search completed: January 22, 2001, 12:27:] 
Job time: 1139 sec.



Qy '549 PAPPGTPKVLNVSRTSISLRWAK 571 
GenCore version 4.5
Copyright (c) 1993 - 2000 Compugen Ltd.



OM protein - protein search, using sw model

Run on: January 22, 2001, 12:50:23 ; Search time 559.88 Seconds
(without alignments)
292.036 Million cell updates/sec

Title: US-09-540-245A-15
Perfect score: 7427
Sequence: 1 MHPMHPENHAIARSTSTTNN..........SCLYAEAGEPAPRQMTAKNT 1395

Scoring table: BLOSUM62
Gapop 10.0 , Gapext 0.5

Searched: 374700 seqs, 117207915 residues

Total number of hits satisfying chosen parameters: 374700

Minimum DB seq length: 0 
Maximum DB seq length: 2000000000 

Post-processing: Minimum Match 0% 
Maximum Match 100% 
Listing first 45 summaries 

Database : SPTREMBL_15:*
1: sp_archea:*
2: sp_bacteria:*
3: sp_fungi:*
4: sp_human:*
5: sp_invertebrate:*
6: sp_mammal:*
7: sp_mhc:*
8: sp_organelle:*
9: sp_phage:*
10: sp_plant:*
11: sp_rodent:*
12: sp_virus:*
13: sp_vertebrate:*
14: sp_unclassified:*



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 



                  %
                Query
 No.   Score    Match  Length  DB  ID      Description

   1   7427     100.0    1395   5  O44924  O44924 drosophila
   2   7407      99.7    1395   5  Q9W213  Q9w213 drosophila
   3   1620      21.8     859   5  Q9VPZ6  Q9vpz6 drosophila
   4   1609.5    21.7    1273   5  O44928  O44928 caenorhabdi
   5   1607.5    21.6    1612  11  O89026  O89026 mus musculu
   6   1592      21.4    1651   4  Q9Y6N7  Q9y6n7 homo sapien
   7   1585      21.3    1651  11  O55005  O55005 rattus norv
   8   1499      20.2    1060  11  Q9QZI3  Q9qzi3 rattus norv
   9   1430      19.3     823   5  Q9VQ10  Q9vq10 drosophila
  10   1395.5    18.8    1344  11  Q9Z2I4  Q9z2i4 mus musculu
  11    826.5    11.1     423   5  P91572  P91572 caenorhabdi
  12    790      10.6     874   5  O01632  O01632 caenorhabdi
  13    759.5    10.2    1377  11  P97603  P97603 rattus norv
  14    731.5     9.8    1493  11  P97798  P97798 mus musculu
  15    712.5     9.6    1461   4  O00340  O00340 homo sapien
  16    710.5     9.6    1461   4  Q92859  Q92859 homo sapien
  17    686       9.2    1277  13  Q98902  Q98902 fugu rubrip
  18    677       9.1    1443  13  Q90610  Q90610 gallus gall
  19    667.5     9.0    1822   4  Q9ULT7  Q9ult7 homo sapien
  20    663.5     8.9    1026   4  O94780  O94780 homo sapien
  21    659.5     8.9    1100   4  O94779  O94779 homo sapien
  22    652       8.8    2016   5  Q9V4J9  Q9v4j9 drosophila
  23    651       8.8    1026  11  Q62845  Q62845 rattus norv
  24    651       8.8    2221   5  Q9U1M1  Q9u1m1 drosophila
  25    649       8.7    2016   5  Q9NBA1  Q9nba1 drosophila
  26    647       8.7    1028  11  Q62682  Q62682 rattus norv
  27    645.5     8.7    1445  11  Q63155  Q63155 rattus norv
  28    645       8.7    1028  11  Q07409  Q07409 mus musculu
  29    644       8.7    1272  13  Q90924  Q90924 gallus gall
  30    643       8.7    1217  11  P97685  P97685 rattus norv
  31    636       8.6    2222   5  O97394  O97394 drosophila
  32    635       8.5    1099  11  P97527  P97527 rattus norv
  33    632       8.5    1264   5  P91767  P91767 manduca sex
  34    631       8.5    1369  13  O42414  O42414 gallus gall
  35    629       8.5    1259  11  Q9QY38  Q9qy38 mus musculu
  36    628       8.5    1028  11  P97528  P97528 rattus norv
  37    626.5     8.4    1248   6  Q9XT41  Q9xt41 cercopithec
  38    624       8.4    1028  11  Q9JMB8  Q9jmb8 mus musculu
  39    622.5     8.4    1427  13  Q91562  Q91562 xenopus lae
  40    619.5     8.3    1018   6  Q28106  Q28106 bos taurus
  41    616.5     8.3    1302   5  O61542  O61542 drosophila
  42    616       8.3    1166  11  Q9QVN4  Q9qvn4 rattus sp.
  43    612.5     8.2    1151  11  Q9QVN5  Q9qvn5 rattus sp.
  44    612.5     8.2    1215  11  P97686  P97686 rattus norv
  45    611       8.2    1239   5  Q9V3X0  Q9v3x0 drosophila



RESULT 1
O44924
ID O44924 PRELIMINARY; PRT; 1395 AA.
AC O44924;
DT 01-JUN-1998 (TrEMBLrel. 06, Created)
DT 01-JUN-1998 (TrEMBLrel. 06, Last sequence update)
DT 01-JUN-2000 (TrEMBLrel. 14, Last annotation update)
- P 1. ■
OS Drosophila melanogaster (Fruit fly).
OC Eukaryota; Metazoa; Arthropoda; Tracheata; Hexapoda; Insecta;
OC Pterygota; Neoptera; Endopterygota; Diptera; Brachycera; Muscomorpha;
OC Ephydroidea; Drosophilidae; Drosophila.
OX NCBI_TaxID=7227;
RN [1]
RP SEQUENCE FROM N.A.
RX MEDLINE=98117249; PubMed=9458045;
RA Kidd T., Brose K., Mitchell K.J., Fetter R.D., Tessier-Lavigne M.,
RA Goodman C.S., Tear G.;
RT "Roundabout controls axon crossing of the CNS midline and defines a
RT novel subfamily of evolutionarily conserved guidance receptors.";
RL Cell 92:205-215(1998).
DR EMBL; AF040989; AAC38849.1; -.
DR HSSP; P56276; 1TLK.
DR FLYBASE; FBgn0005631; robo.
DR INTERPRO; IPR001777; -.
DR INTERPRO; IPR003006; -.
DR PFAM; PF00041; fn3; 3.
DR PFAM; PF00047; ig; 5.
DR PRINTS; PR00014; FNTYPEIII.
SQ SEQUENCE 1395 AA; 151778 MW; B820E234A5218983 CRC64;



Query Match 100.0%; Score 7427; DB 5; Length 1395;
Best Local Similarity 100.0%; Pred. No. 0;
Matches 1395; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy 1 MHPMHPENHAIARSTSTTNNPSRSRSSRMWLLPAWLLLVLVASNGLPAVRGQYQSPRIIE 60
     ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1 MHPMHPENHAIARSTSTTNNPSRSRSSRMWLLPAWLLLVLVASNGLPAVRGQYQSPRIIE 60 

Qy 61 HPTDLVVKKNEPATLNCKVEGKPEPTIEWFKDGEPVSTNEKKSHRVQFKDGALFFYRTMQ 120

Db 61 HPTDLVVKKNEPATLNCKVEGKPEPTIEWFKDGEPVSTNEKKSHRVQFKDGALFFYRTMQ 120

Qy 121 GKKEQDGGEYWCVAKNRVGQAVSRHASLQIAVLRDDFRVEPKDTRVAKGETALLECGPPK 180 



Db 121 GKKEQDGGEYWCVAKNRVGQAVSRHASLQIAVLRDDFRVEPKDTRVAKGETALLECGPPK 180

Qy 181 GIPEPTLIWIKDGVPLDDLKAMSFGASSRVRIVDGGNLLISNVEPIDEGNYKCIAQNLVG 240

Db 181 GIPEPTLIWIKDGVPLDDLKAMSFGASSRVRIVDGGNLLISNVEPIDEGNYKCIAQNLVG 240

Qy 241 TRESSYAKLIVQVKPYFMKEPKDQVMLYGQTATFHCSVGGDPPPKVLWKKEEGNIPVSRA 300

Db 241 TRESSYAKLIVQVKPYFMKEPKDQVMLYGQTATFHCSVGGDPPPKVLWKKEEGNIPVSRA 300

Qy 301 RILHDEKSLEISNITPTDEGTYVCEAHNNVGQISARASLIVHAPPNFTKRPSNKKVGLNG 360

Db 301 RILHDEKSLEISNITPTDEGTYVCEAHNNVGQISARASLIVHAPPNFTKRPSNKKVGLNG 360 

Qy 361 VVQLPCMASGNPPPSVFWTKEGVSTLMFPNSSHGRQYVAADGTLQITDVRQEDEGYYVCS 420 

Db 361 VVQLPCMASGNPPPSVFWTKEGVSTLMFPNSSHGRQYVAADGTLQITDVRQEDEGYYVCS 420

Qy 421 AFSVVDSSTVRVFLQVSSVDERPPPIIQIGPANQTLPKGSVATLPCRATGNPSPRIKWFH 480

Db 421 AFSVVDSSTVRVFLQVSSVDERPPPIIQIGPANQTLPKGSVATLPCRATGNPSPRIKWFH 480

Qy 481 DGHAVQAGNRYSIIQGSSLRVDDLQLSDSGTYTCTASGERGETSWAATLTVEKPGSTSLH 540

Db 481 DGHAVQAGNRYSIIQGSSLRVDDLQLSDSGTYTCTASGERGETSWAATLTVEKPGSTSLH 540

Qy 541 RAADPSTYPAPPGTPKVLNVSRTSISLRWAKSQEKPGAVGPIIGYTVEYFSPDLQTGWIV 600 

Db 541 RAADPSTYPAPPGTPKVLNVSRTSISLRWAKSQEKPGAVGPIIGYTVEYFSPDLQTGWIV 600 

Qy 601 AAHRVGDTQVTISGLTPGTSYVFLVRAENTQGISVPSGLSNVIKTIEADFDAASANDLSA 660 

Db 601 AAHRVGDTQVTISGLTPGTSYVFLVRAENTQGISVPSGLSNVIKTIEADFDAASANDLSA 660 

Qy 661 ARTLLTGKSVELIDASAINASAVRLEWMLHVSADEKYVEGLRIHYKDASVPSAQYHSITV 720

Db 661 ARTLLTGKSVELIDASAINASAVRLEWMLHVSADEKYVEGLRIHYKDASVPSAQYHSITV 720 

Qy 721 MDASAESFWGNLKKYTKYEFFLTPFFETIEGQPSNSRTALTYEDVPSAPPDNIQIGMYN 780 

Db 721 MDASAESFWGNLKKYTKYEFFLTPFFETIEGQPSNSKTALTYEDVPSAPPDNIQIGMYN 780 

Qy 781 QTAGWVRWTPPPSQHHNGNLYGYKIEVSAGNTMKVLANMTLNATTTSVLLNNLTTGAVYS 840

Db 781 QTAGWVRWTPPPSQHHNGNLYGYKIEVSAGNTMKVLANMTLNATTTSVLLNNLTTGAVYS 840

Qy 841 VRLNSFTKAGDGPYSKPISLFMDPTHHVHPPRAHPSGTHDGRHEGQDLTYHNNGNIPPGD 900

Db 841 VRLNSFTKAGDGPYSKPISLFMDPTHHVHPPRAHPSGTHDGRHEGQDLTYHNNGNIPPGD 900 

Qy 901 INPTTHKKTTDYLSGPWLMVLVCIVLLVLVISAAISMVYFKRKHQMTKELGHLSWSDNE 960 

Db 901 INPTTHKKTTDYLSGPWLMVLVCIVLLVLVISAAISMVYFKRKHQMTKELGHLSWSDNE 960 

Qy 961 ITALNINSKESLWIDHHRGWRTADTDKDSGLSESKLLSHVNSSQSNYNNSDGGTDYAEVD 1020 

Db 961 ITALNINSKESLWIDHHRGWRTADTDKDSGLSESKLLSHVNSSQSNYNNSDGGTDYAEVD 1020 

Qy 1021 TRNLTTFYNCRKSPDNPTPYATTMIIGTSSSETCTKTTSISADRDSGTHSPYSDAFAGQV 1080 

Db 1021 TRNLTTFYNCRKSPDNPTPYATTMIIGTSSSETCTKTTSISADKDSGTHSPYSDAPAGQV 1080 

Qy 1081 PAVPWRSNYLQYPVEPINWSEFLPPPPEHPPPSSTYGYAQGSPESSRKSSKSAGSGIST 1140 

Db 1081 PAVPWRSNYLQYPVEPINWSEFLPPPPEHPPPSSTYGYAQGSPESSRKSSRSAGSGIST 1140 

Qy 1141 NQSILNASIHSSSSGGFSAWGVSPQYAVACPPENVYSNPLSAVAGGTQNRYQITPTNQHP 1200 



Db 1141 NQSILNASIHSSSSGGFSAWGVSPQYAVACPPENVYSNPLSAVAGGTQNRYQITPTNQHP 1200 

Qy 1201 PQLPAYFATTGPGGAVPPNHLPFATQRHAASEYQAGLNAARCAQSRACNSCDALATPSPM 1260 
Db 1201 PQLPAYFATTGPGGAVPPNHLPFATQRHAASEYQAGLNAARCAQSRACNSCDALATPSPM 1260

Qy 1261 QPPPPVPVPEGWYQPVHPNSHPMHPTSSNHQIYQCSSECSDHSRSSQSHKRQLQLEEHGS 1320 

Db 1261 QPPPPVPVPEGWYQPVHPNSHPMHPTSSNHQIYQCSSECSDHSRSSQSHKRQLQLEEHGS 1320 

Qy 1321 SMQRGGHHRRRAPWQPCMESENENMLAEYEQRQYISDCCNSSREGDTCSCSEGSCLYA 1380 

Db 1321 SAKQRGGHHRRRAPVVQPCMESENENMLAEYEQRQYTSDCCNSSREGDTCSCSEGSCLYA 1380 

Qy 1381 EAGEPAPRQMTAKNT 1395 

Db 1381 EAGEPAPRQMTAKNT 1395 



RESULT 2 
Q9W213 

ID Q9W213 PRELIMINARY; PRT; 1395 AA.
AC Q9W213;
DT 01-MAY-2000 (TrEMBLrel. 13, Created)
DT 01-MAY-2000 (TrEMBLrel. 13, Last sequence update)
DT 01-OCT-2000 (TrEMBLrel. 15, Last annotation update)
DE ROBO PROTEIN.
GN ROBO.
OS Drosophila melanogaster (Fruit fly).
OC Eukaryota; Metazoa; Arthropoda; Tracheata; Hexapoda; Insecta;
OC Pterygota; Neoptera; Endopterygota; Diptera; Brachycera; Muscomorpha;
OC Ephydroidea; Drosophilidae; Drosophila.
OX NCBI_TaxID=7227;
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=BERKELEY;
RX MEDLINE=20196006; PubMed=10731132;
RA Adams M.D., Celniker S.E., Holt R.A., Evans C.A., Gocayne J.D.,
RA Amanatides P.G., Scherer S.E., Li P.W., Hoskins R.A., Galle R.F.,
RA George R.A., Lewis S.E., Richards S., Ashburner M., Henderson S.N.,
RA Sutton G.G., Wortman J.R., Yandell M.D., Zhang Q., Chen L.X.,
RA Brandon R.C., Rogers Y.-H.C., Blazej R.G., Champe M., Pfeiffer B.D.,
RA Wan K.H., Doyle C., Baxter E.G., Helt G., Nelson C.R., Miklos G.L.G.,
RA Abril J.F., Agbayani A., An H.-J., Andrews-Pfannkoch C., Baldwin D.,
RA Ballew R.M., Basu A., Baxendale J., Bayraktaroglu L., Beasley E.M.,
RA Beeson K.Y., Benos P.V., Berman B.P., Bhandari D., Bolshakov S.,
RA Borkova D., Botchan M.R., Bouck J., Brokstein P., Brottier P.,
RA Burtis K.C., Busam D.A., Butler H., Cadieu E., Center A., Chandra I.,
RA Cherry J.M., Cawley S., Dahlke C., Davenport L.B., Davies P.,
RA de Pablos B., Delcher A., Deng Z., Mays A.D., Dew I., Dietz S.M.,
RA Dodson K., Doup L.E., Downes M., Dugan-Rocha S., Dunkov B.C., Dunn P.,
RA Durbin K.J., Evangelista C.C., Ferraz C., Ferriera S., Fleischmann W.,
RA Fosler C., Gabrielian A.E., Garg N.S., Gelbart W.M., Glasser K.,
RA Glodek A., Gong F., Gorrell J.H., Gu Z., Guan P., Harris M.,
RA Harris N.L., Harvey D., Heiman T.J., Hernandez J.R., Houck J.,
RA Hostin D., Houston K.A., Howland T.J., Wei M.-H., Ibegwam C.,
RA Jalali M., Kalush F., Karpen G.H., Ke Z., Kennison J.A., Ketchum K.A.,
RA Kimmel B.E., Kodira C.D., Kraft C., Kravitz S., Kulp D., Lai Z.,
RA Lasko P., Lei Y., Levitsky A.A., Li J., Li Z., Liang Y., Lin X.,
RA Liu X., Mattei B., Mcintosh T.C., McLeod M.P., McPherson D.,
RA Merkulov G., Milshina N.V., Mobarry C., Morris J., Moshrefi A.,
RA Mount S.M., Moy M., Murphy B., Murphy L., Muzny D.M., Nelson D.L.,
RA Nelson D.R., Nelson K.A., Nixon K., Nusskern D.R., Pacleb J.M.,
RA Palazzolo M., Pittman G.S., Pan S., Pollard J., Puri V., Reese M.G.,
RA Reinert K., Remington K., Saunders R.D.C., Scheeler F., Shen H.,
RA Shue B.C., Siden-Kiamos I., Simpson M., Skupski M.P., Smith T.,
RA Spier E., Spradling A.C., Stapleton M., Strong R., Sun E.,
RA Svirskas R., Tector C., Turner R., Venter E., Wang A.H., Wang X.,
RA Wang Z.-Y., Wassarman D.A., Weinstock G.M., Weissenbach J.,
RA Williams S.M., Woodage T., Worley K.C., Wu D., Yang S., Yao Q.A.,
RA Ye J., Yeh R.-F., Zaveri J.S., Zhan M., Zhang G., Zhao Q., Zheng L.,
RA Zheng X.H., Zhong F.N., Zhong W., Zhou X., Zhu S., Zhu X., Smith H.O.,
RA Gibbs R.A., Myers E.W., Rubin G.M., Venter J.C.;
RT "The genome sequence of Drosophila melanogaster.";
RL Science 287:2185-2195(2000).
DR EMBL; AE003458; AAF46887.1; -.
DR HSSP; P56276; 1TLK.
DR FLYBASE; FBgn0005631; robo.
DR INTERPRO; IPR001777; -.
DR INTERPRO; IPR003006; -.
DR PFAM; PF00041; fn3; 3.
DR PFAM; PF00047; ig; 5.
DR PRINTS; PR00014; FNTYPEIII.
SQ SEQUENCE 1395 AA; 151759 MW; 25CED7DEB44F13F0 CRC64;



Query Match 99.7%; Score 7407; DB 5; Length 1395;
Best Local Similarity 99.7%; Pred. No. 0;
Matches 1391; Conservative 2; Mismatches 2; Indels 0; Gaps 0;

Qy 1 MHPMHPENHAIARSTSTTNNPSRSRSSRMWLLPAWLLLVLVASNGLPAVRGQYQSPRIIE 60

Db 1 MHPMHPENHAIARSTSTTNNPSRSRSSRMWLLPAWLLLVLVASNGLPAVRGQYQSPRIIE 60

Oy 61 hptdIwkrnepatlnckvegkpeptiewfkdgepvstnekkshrvqfkdgalffyrtmq 120 

Db 61 hptdlwkknepatlnckvegkpeptiewfkdgepvstnekkshrvqfkdgalffyrtmq 120 

Qy 121 gkkeqdggeywcvaknrvgqavsrhaslqiavlrddfrveprdtrvakgetallecgppk 180 

Db 121 gkkeodggeywcvaknrvgqavsrhaslqiavlrddfrvepkdtrvakgetallecgppk 180 

Qy 181 GIPEPTLIWIKDGVPLDDLRAMSFGASSRVRIVDGGNLLISNVEPIDEGNYKCIAQNLVG 240 

Db 181 GIPEPTLIWIKDGVPLDDLKAMSFGASSRVRIVDGGNLLISNVEPIDEGNYKCIAQNLVG 240 

Qy 241 TRESSYARLIVQVKPYFMREPKDQVMLYGQTATFHCSVGGDPPPKVLWKKEEGNIPVSRA 300 

Db 241 TRESSYAKLIVQVKPYFMKEPKDQVMLYGQTATFHCSVGGDPPPKVLWKKEEGNIPVSRA 300 

Qy 301 RILHDEKSLEISNITPTDEGTYVCEAHNNVGQISARASLIVHAPPNFTKRPSNKKVGLNG 360 

Db 301 RILHDEKSLEISNITPTDEGTYVCEAHNNVGQISARASLIVHAPPNFTKRPSNKRVGLNG 360 

Qy 361 VVQLPCMASGNPPPSVTWTKEGVSTLMFPNSSHGRQYVAADGTLQITDVRQEDEGYyVCS 420 

Db 361 WQLPCMASGNPPPSVFWTKEGVSTLMFPNSSHGRQHVAADGTLQITDVRQEDEGYYVCS 420 

Qy 421 AFSWDSSTVRVFLQVSSVDERPPPIIQIGPANQTLPKGSVATLPCRATGNPSPRIKWFH 480

Db 421 AFSWDSSTVRVFLQVSSLDERPPPIIQIGPANQTLPKGSVATLPCRATGNPSPRIKWFH 480

Qy 481 DGHAVQAGNRYSIIQGSSLRVDDLQLSDSGTYTCTASGERGETSWAATLTVEKPGSTSLH 540 

Db 481 DGHAVQAGNRYSIIQGSSLRVDDLQLSDSGTYTCTASGERGETSWAATLTVEKPGSTSLH 540 

Qy 541 RAADPSTYPAPPGTPKVLNVSRTSISLRWAKSQEKPGAVGPIIGYTVEYFSPDLQTGWIV 600 

Db 541 RAADPSTYPAPPGTPKVLNVSRTSISLRWAKSQEKPGAVGPIIGYTVEYFSPDLQTGWIV 600 

Qy 601 AAHRVGDTQVTISGLTPGTSYVFLVRAENTQGISVPSGLSNVIKTIEADFDAASANDLSA 660 

Db 601 AAQRVGDTQVTISGLTPGTSYVFLVRAENTQGISVPSGLSNTIRTIEADFDAASANDLSA 660 

Qy 661 ARTLLTGKSVELIDASAINASAVRLEWMLHVSADEKYVEGLRIHYKDASVPSAQYHSITV 720 

Db 661 ARTLLTGKSVELIDASAINASAVRLEWMLHVSADEKYVEGLRIHYKDASVPSAQYHSITV 720

Qy 721 MDASAESFWGNLKKYTKYEFFLTPFFETIEGQPSNSRTALTYEDVPSAPPDNIQIGMYN 780 

Db 721 MDASAESFWGNLKKYTKYEFFLTPFFETIEGQPSNSKTALTYEDVPSAPPDNIQIGMYN 780 

Qy 781 QTAGWVRWTPPPSQHHNGNLYGYKIEVSAGNTMKVLANMTLNATTTSVLLNNLTTGAVYS 840 

Db 781 QTAGWTOWTPPPSQHHNGNLYGYKIEVSAGNTMKVUMLNATTTSVlLNmTTGAVYS 840 



Qy 841 VRLNSFTKAGDGPYSKPISLFMDPTHHVHPPRAHPSGTHDGRHEGQDLTYHNNGNIPPGD 900 
Db 841 VRLNSFTKAGDGPYSKPISLFMDPTHHVHPPRAHPSGTHDGRHEGQDLTYHNNGNIPPGD 900 
Qy 901 INPTTHKRTTDYLSGPWLMVLVCIVLLVLVISAAISMVYFKRRHQMTRELGHLSWSDNE 96& 
Db 901 INPTTHKRTTDYLSGPWLMVLVCIVLLVLVISAAISMVYFKRKHQMTRELGHLSWSDi;5 3SC 
Qy 961 ITALNINSKESLWIDHHRGWRTADTDKDSGLSESKLLSHVNSSQSNYNNSDGGTDYAEVD 1C20 
Db 961 ITALNINSRESLWIDHHRGWRTADTDKDSGLSESKLLSHVNSSQSNYNNSDGGTDYAEVD 1020 

Qy 1021 TRNLTTFYNCRKSPDNPTPYATTMIIGTSSSETCTKTTSISADKDSGTHSPYSDAFAGQV 1080 

Db 1021 TRNLTTFYNCRRSPDNPTPYATTMIIGTSSSETCTRTTSISADKDSGTHSPYSDAFAGQV 1080 

Qy 1081 PAVPWKSNYLQYPVEPINWSEFLPPPPEHPPPSSTYGYAQGSPESSRKSSKSAGSGIST 1140 

Db 1081 PAVPWRSNYLQYPVEPINWSEFLPPPPEHPPPSSTYGYAQGSPESSRKSSRSAGSGIST 1140 

Qy 1141 NQSILNASIHSSSSGGFSAWGVSPQYAVACPPENVYSNPLSAVAGGTQNRYQITPTNQHP 1200 

Db 1141 NQSILNASIHSSSSGGFSAWGVSPQYAVACPPENVYSNPLSAVAGGTQNRYQITPTNQHP 1200 

Qy 1201 PQLPAYFATTGPGGAVPPNHLPFATQRHAASEYQAGLNAARCAQSRACNSCDALATPSPM 1260 

Db 1201 PQLPAYFATTGPGGAVPPNHLPFATQRHAASEYQAGLNAARCAQSRACNSCDALATPSPM 1260 

Qy 1261 QPPPPVPVPEGWYQPVHPNSHPMHPTSSNHQIYQCSSECSDHSRSSQSHRRQLQLEEHGS .1320 

Db 1261 QPPPPVPVPEGWYQPVHPNSHPMHPTSSNHQIYQCSSECSDHSRSSQSHRRQLQLEEHGS 1320 

Qy 1321 SARQRGGHHRRRAPWQPCMESENENMLAEYEQRQYTSDCCNSSREGDTCSCSEGSCLYA 1380 

Db 1321 SAKQRGGHHRRRAPWQPCMESENENMLAEYEQRQYTSDCCNSSREGDTCSCSEGSCLYA 1380 

Qy 1381 EAGEPAPRQMTAKNT 1395 

Db 1381 EAGEPAPRQMTAKNT 1395 

RESULT 3 
Q9VPZ6 

ID Q9VPZ6 PRELIMINARY; PRT; 859 AA.
AC Q9VPZ6;
DT 01-MAY-2000 (TrEMBLrel. 13, Created)
DT 01-MAY-2000 (TrEMBLrel. 13, Last sequence update)
DT 01-OCT-2000 (TrEMBLrel. 15, Last annotation update)
DE CG5423 PROTEIN (FRAGMENT).
GN CG5423.
OS Drosophila melanogaster (Fruit fly).
OC Eukaryota; Metazoa; Arthropoda; Tracheata; Hexapoda; Insecta;
OC Pterygota; Neoptera; Endopterygota; Diptera; Brachycera; Muscomorpha;
OC Ephydroidea; Drosophilidae; Drosophila.
OX NCBI_TaxID=7227;
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=BERKELEY;
RX MEDLINE=20196006; PubMed=10731132;
RA Adams M.D., Celniker S.E., Holt R.A., Evans C.A., Gocayne J.D.,
RA Amanatides P.G., Scherer S.E., Li P.W., Hoskins R.A., Galle R.F.,
RA George R.A., Lewis S.E., Richards S., Ashburner M., Henderson S.N.,
RA Sutton G.G., Wortman J.R., Yandell M.D., Zhang Q., Chen L.X.,
RA Brandon R.C., Rogers Y.-H.C., Blazej R.G., Champe M., Pfeiffer B.D.,
RA Wan K.H., Doyle C., Baxter E.G., Helt G., Nelson C.R., Miklos G.L.G.,
RA Abril J.F., Agbayani A., An H.-J., Andrews-Pfannkoch C., Baldwin D.,
RA Ballew R.M., Basu A., Baxendale J., Bayraktaroglu L., Beasley E.M.,
RA Beeson K.Y., Benos P.V., Berman B.P., Bhandari D., Bolshakov S.,
RA Borkova D., Botchan M.R., Bouck J., Brokstein P., Brottier P.,
RA Burtis K.C., Busam D.A., Butler H., Cadieu E., Center A., Chandra I.,
RA Cherry J.M., Cawley S., Dahlke C., Davenport L.B., Davies P.,
RA de Pablos B., Delcher A., Deng Z., Mays A.D., Dew I., Dietz S.M.,
RA Dodson K., Doup L.E., Downes M., Dugan-Rocha S., Dunkov B.C., Dunn P.,






RA Durbin K.J., Evangelista C.C., Ferraz C., Ferriera S., Fleischmann W.,
RA Fosler C., Gabrielian A.E., Garg N.S., Gelbart W.M., Glasser K.,
RA Glodek A., Gong F., Gorrell J.H., Gu Z., Guan P., Harris M.,
RA Harris N.L., Harvey D., Heiman T.J., Hernandez J.R., Houck J.,
RA Hostin D., Houston K.A., Howland T.J., Wei M.-H., Ibegwam C.,
RA Jalali M., Kalush F., Karpen G.H., Ke Z., Kennison J.A., Ketchum K.A.,
RA Kimmel B.E., Kodira C.D., Kraft C., Kravitz S., Kulp D., Lai Z.,
RA Lasko P., Lei Y., Levitsky A.A., Li J., Li Z., Liang Y., Lin X.,
RA Liu X., Mattei B., Mcintosh T.C., McLeod M.P., McPherson D.,
RA Merkulov G., Milshina N.V., Mobarry C., Morris J., Moshrefi A.,
RA Mount S.M., Moy M., Murphy B., Murphy L., Muzny D.M., Nelson D.L.,
RA Nelson D.R., Nelson K.A., Nixon K., Nusskern D.R., Pacleb J.M.,
RA Palazzolo M., Pittman G.S., Pan S., Pollard J., Puri V., Reese M.G.,
RA Reinert K., Remington K., Saunders R.D.C., Scheeler F., Shen H.,
RA Shue B.C., Siden-Kiamos I., Simpson M., Skupski M.P., Smith T.,
RA Spier E., Spradling A.C., Stapleton M., Strong R., Sun E.,
RA Svirskas R., Tector C., Turner R., Venter E., Wang A.H., Wang X.,
RA Wang Z.-Y., Wassarman D.A., Weinstock G.M., Weissenbach J.,
RA Williams S.M., Woodage T., Worley K.C., Wu D., Yang S., Yao Q.A.,
RA Ye J., Yeh R.-F., Zaveri J.S., Zhan M., Zhang G., Zhao Q., Zheng L.,
RA Zheng X.H., Zhong F.N., Zhong W., Zhou X., Zhu S., Zhu X., Smith H.O.,
RA Gibbs R.A., Myers E.W., Rubin G.M., Venter J.C.;
RT "The genome sequence of Drosophila melanogaster.";
RL Science 287:2185-2195(2000).
DR EMBL; AE003586; AAF51388.1; -.
DR HSSP; P56276; 1TLK.
DR FLYBASE; FBgn0031328; CG5423.
DR INTERPRO; IPR001777; -.
DR INTERPRO; IPR003006; -.
DR PFAM; PF00041; fn3; 3.
DR PFAM; PF00047; ig; 5.
FT NON_TER 1 1
SQ SEQUENCE 859 AA; 93916 MW; 5CFD69D9841Q1BF8 CRC64;



Query Match 21.8%; Score 1620; DB 5; Length 859; 

Best Local Similarity 38.6%; Pred. No. 8.6e-101; 

Matches 357; Conservative 160; Mismatches 312; Indels 96; Gaps 23; 

Qy 56 PRIIEHPTDLWKKNEPATLNCKVEGKPEPTIEWFKDGEPVSTNERKSHRVQFKDGALFF 115 

111:111 I I II I llhhlll h llh I III 

Db 1 PRIVEHPIDTTVPRHEPATLNCRAEGSPTPTIQWYRDGVPLRI-LPGSHRITLPAGGLFF 59 

Qy 116 YRTMQGRREQDGGEYWCVAKNRVGQAVSRHASLQIAVLRDDFRVEPKDTRVAKGETALLE 175 

: h: 1 1 1 1 1 : 1 1 : 1 1 :: 1 1 : 1 : 1 : 1 1 1 1 1 

Db 60 LKVSDGRR CAVLRDEFRLEPQNTRIAQGDTALLE 93 

Qy 176 CGPPKGIPEPTLIWIKDGVPLDDLKAMSFGASSRVRIVDGGNLLISNVEPIDEGNYKCIA 235 
I 1:111111: I I I II I IIIIIIIMI I : III Mil 

94 CAAPRGIPEPTVTWRRGGQRLD LEGSRRVRIVDGGNLAIQDARQTDEGQYQCIA 147 



,236 QNLVGTRESSYAKLIVQVKPYFMKEPRDQVMLYGQTATFHCSVGGDPPPKVLWKK--EEG 293 
:| II Mil I I I Mil :: I II :| I : II I INI III : I 
Db 148 RNPVGVRESSLAfLRVHVKPYIIRGPHDQTVLEGASVTFPCRVGGDPMPDVLWLRTASGG 207 

Qy 294 NIPVSRARILHDERSLEISNITPTDEGTYVCEAHNNVGQISARASLIVHAPPNPTRRPSN 353 

1:1: I :| I :ll : :| III I III I II hi :| hill I :||" 
Db 208 NMPLDRVSVLED-RSLRLERVTIADEGEYSCEADNVVGAITAMGTLTVYAPPRFIQRPAS 266 

Qy 354 KRVGLNGWQLPCMASGNPPPSVFWTREGVSTLMFPNSS HGRQYVAADGTLQITD 408 

II I I I III h:|ll : llhll : I I :| 

Db 267 RSVELGADTSFECRAIGNPRPTIFWTIKNNSTLIFPGAPPLDRFHSLNTEEGHSILILTR 326 

Qy 409 VRQEDEGYYV-CSAFSWDSSTVRVFLQVSSVDERPPPIIQIGPANQTLPRGSVATLPCR 467 

:: I: : I : I : I I I 1 1 I : I ::|ll!ll II Mill hill h 
Db 327 FQRTDKDLVILCNAMNEVASITSRVQLSLDSQEDRPPPIIISGPVNQTLPIRSLATLQCR 386 

Qy 468 ATGNPSPRIRWFHDGHAVQAGNRYSIIQGSSLRVDDL-QLSDSGTYTCIASGERGETSWA 526 

I I III I h II II :: :l I : II : I I III II |:::|: 
Db 387 AIGLPSPTISWYRDGIPVQPSSRLNITTSGDLIISDLDRQQDQGLYTCVASSRAGKSTWS 446 

Qy 527 ATLTVERPGSTSL-HRAADPSTYPAPPGTPKVLNVSRTSISLRWARSQERPGAVGPIIG 584 

I :| I : :: :|| : : |: II Ihll : ::::: I Mil :| 



Db 447 GFLRIELPTNPNIRFYRAPEQTRCPSAPGQPKILNATASALTIVWPTS-DRAGA-SSFLG 504 

Qy 585 YTVEYFSPDLQTGWIVAAHRVGDTQVTISGLTPGTSYVFLVRAENTQGISVPSGLSNVIK 644 

1:11 : : ; II I h : hill : 1 : 1 : 1 1 1 1 1 : I I II :| I 
Db 505 YSVEMYCTNQGRTWIPIASRLSEPIFTVESLTQGAAYMFIVRAENSLGFSPPSPISEPIT 564 

Qy 645 TIE ADFDAASANDLSAARTLLTGRS-VELIDASAINASAVRLEWMLHVSADER 696 

: : : I II III lll::hl ::: II I : : 

Db 565 AGKLVGVRDGSESTGTSQLLLSDVETLLQANDWELLEANASDSTTARLSWDID---SGQ 621 

Qy 697 YVEGLRIHYRDASVPSAQYHSITVMD--ASAESFWGNLRKYTKYEFFLIPFFETIEGQP 754 

hll :: :: : h:l :|::: MM: Hill lh::l hi 
Db 622 YIEGFYLYARE-LHSSEYKMVTLLNKGQGLSSCTVPGLAKASTYEFFLVPFYKSIVGKP 679 

Qy 755 SNSKTALTYEDVPSAPPDNIQIGMYNQTAGWVRWTPP-PSQHHNGNLYGYKIEV--SAG 810 

III: I III! Ill :: :hh :::| II h: II I I : I 
Db 680 SNSRRMRTLEDVPEAPPYGMEAIQFNRTSVFLKWLPPQPNRTRNGILTSYNWVKGLDVH 739 

Qy 811 NTMKVLANMTLNATTTSVLLNNLTTGAVYSVRLNSFTKAGDGPYSKPISLFMDPTHHVHP 87C 

II :: II:: ::|| Mil I : : : 1 : 1 Ihlll I : 
Db 740 NTTRIFRNMTiDAAAPTLLLANLTTGVTYYIAVAAATRVGVGPFSRPAVLRI 791 

Qy 871 PRAHPSGTHDGRHEGQDLTYHNNGNIPPGDINPTTHRKTTDYLSGPWLMVLVCIVLLVLV 530 

I I M I I : hh I :lh :: ::| 

Db 792 DARTQSLDTGYTR YPISRDIADDFLTQTWFIVLLGSIIAIIV 833 

Qy 931 ISAAI SMVYFKRKHQMTKE - - LGHL 953 

::| III :l h II I 
Db 834 FLLG-ALVLFRR-YQFIRQTSLGSL 856 



RESULT 4 
O44928
ID O44928 PRELIMINARY; PRT; 1273 AA.
AC O44928;
DT 01-JUN-1998 (TrEMBLrel. 06, Created)
DT 01-JUN-1998 (TrEMBLrel. 06, Last sequence update)
DT 01-JUN-2000 (TrEMBLrel. 14, Last annotation update)
DE SAX-3.
GN SAX-3.
OS Caenorhabditis elegans.
OC Eukaryota; Metazoa; Nematoda; Chromadorea; Rhabditida; Rhabditoidea;
OC Rhabditidae; Peloderinae; Caenorhabditis.
OX NCBI_TaxID=6239;
RN [1]
RP SEQUENCE FROM N.A.
RX MEDLINE=98117250; PubMed=9458046;
RA Zallen J.A., Yi B.A., Bargmann C.I.;
RT "The conserved immunoglobulin superfamily member SAX-3/Robo directs
RT multiple aspects of axon guidance in C. elegans.";
RL Cell 92:217-227(1998).
DR EMBL; AF041053; AAC38848.1; -.
DR HSSP; P56276; 1TLK.
DR INTERPRO; IPR001777; -.
DR INTERPRO; IPR003006; -.
DR PFAM; PF00041; fn3; 3.
DR PFAM; PF00047; ig; 5.
DR PRINTS; PR00014; FNTYPEIII.
SQ SEQUENCE 1273 AA; 139427 MW; 013E766B51A7BAD7 CRC64;

Query Match 21.7%; Score 1609.5; DB 5; Length 1273;

Best Local Similarity 31.6%; Pred. No. 7.8e-100; 

Matches 422; Conservative 195; Mismatches 526; Indels 191; Gaps 

Qy 55 SPRIIEHPTDLWKKNEPATLNCKVEGKPEPT-IEWFRDGEPVSTNEKK--SHRVQFKDG 111 

:| lllll Ml : Mil II I MM M: III: I 
Db 30 APVIIEHPIDVWSRGSPATLNC--GAKPSTAKITWYRDGQPVITNKEQVNSHRIVLDTG 87 

Qy 112 ALFFYRTMQGK--KEQDGGEYWCVAKNRVGQAVSRHASLQIAVLRDDFRVEPKDTRVAKG 169 

M : M |:|| hill I |: I ll::hll:llll |: : I 
Db 88 SLFLLRVNSGKNGRDSDAGAYYCVASNEHGEVRSNEGSLRLAMLREDFRVRPRTVQALGG 147 
Qy 170 ETALLECGPPKGIPEPTLIWIKDGVPLDDLKAMSFGASSRVRIVDGGNLLISNVEPIDEG 229 

I hill 11:1 IIIUII I : I : |||:| |: | 
Db # 148 EMAVLECSPPRGFPEPWSWRRDD KELRIQDMPRYTLHSDGNLIIDPVDRSDSG 201 

Qy 230 NYKCIAQNLVGHESSYMLIVQVKPyFMKEPKDQVMLYGOTATFHCSVGGDPPPKVLWK 289 

hhl 1:11 I |: hi I II I :MII : I I I I III h: II 
Db 202 IYQCVANNMVGERVSNPARLSVFEKPKFEQEPKDMTVDVGAAVLFDCRVTGDPQPQITWK 261 

Qy 290 KEEGNIPVSRARILHDEKSLEISNITPTDEGTYVCEAHNNVGQISARASLIVHAPPNFTK 349 

:ll:ll I I : I I : hill III I I I : I I II MM 
Db 262 RKNEPMPVTRAYIARDNRGLRIERVQPSDEGEYVCYARNPAGTLEASAHLRVQAPPSFQT 321 

Qy 350 RPSNKKVGLNGWQLPC^SGNPPPSVMKEGVSTLMFPN-SSHGKQYVAADGTLQIT 407 

:h:: II I I I h Ihlll hlh h I h III I 
Db 322 RPADQSVPAGGTATFECTLVGQPSPAYFWSKEGQQDLLFPSYVSADGRTRVSPTGTLTIE 381 

Qy 408 DVRQEDEGYYVCSAFSWDSSTVRVPLGVS SVDERPPPI IQIGPANQTLPKGSV 461 

»:IM III III: : II : hh : :lll h: I Mil II 
382 EVRQVDEGAYVCAGMNSAGSSLSKAALKVTTKAVTGNTPAKPPPTIEHGHQNQTLMVGSS 441 

Qy 462 ATLPCRATGNPSPRIKWFHDGHAVQ-AGNRYSIIQGSSLRVDDLQLSfGTYTCIASGER 520 

I Mhhl hi Mil: :| I II : lh 1 1 III I I 

Db 442 AILPCQASGRPTPGISWLRDGLPIDITDSRISQHSTGSLHIADLRRP±GVYTCIARNED 501 

Qy 521 GETSWAATLTVEKPGSTS - LHRAADPSTYPAPPGTPKVLNVSRTSISLRWAKSQEKPGAV 579 

lh:hhllll I : I III :|: I I ::||: I : } I : 
Db 502 GESTWSASLTVEDHTSNAQFVRMPDPSNFPSSPTQPIIVNVTDTEVELHW--NAPSTSGA 559 

I 

Qy 580 GPIIGYTVEYFSPDLQTGWIVAAHRVGDTQVTISGLTPGTSYVFLVRAENTQGISVPSGL 639 

Ml II -hill! I I h I II I lhl::|i||| :|| I 
Db 560 GPITGYIIQYYSPDLGQTWFNIPDYVASTEYRIKGLKPSHSYMFVIRAENEKGIGTPSVS 619 

Qy 640 SNVIRTIEADFDAASAN— -DLSAARTLLIGKS-VELIDASAINASAVRLEWMLHVSAD 694 

I " I : I :: h: I II : ::| : ll::HII I 
Db 620 SALVTTSKPAAQVALSDKNKMDMAIAEKRLTSEQLIKLEEVKTINSTAVRLFWKKRKL*- 677 

Qy 695 EKYVEGLRIHYR-DASVPSAQYHSITVMDASAESFWGSLRKYTKYEFFLTPF— FETI 750 

I: ::l I :: II : I I |::ll II :| Ml: |: :l 

Db 678 EELIDGYYIKWRGPPRTNDNQY-'VNVTSPSTENYWSNLMPFTNYEFFVIPYHSGVHSI 735 

Qy 751 EGQPSNSKTALTYEDVPSAPPDNIQIGMYNQTAGWVRWTPPPSQHHNGNLYGYRIEVSAG 810 

I MM II I II M::::l III :l I !l ! I:: I I 
Db 736 HGAPSNSMDVLTAEAPPSLPPEDVRIRMLNLTTLRISWKAPKADGINGILKGFQI-VIVG 794 

Qy 811 NTMKVLANMTLNATTTSVLLNNLTTGAVYSVRLNSFTKAGDGPYSKPJSLFMD-PTHHVH 869 
^ hi I II I :| II I :h : : I I 1:1:11 

m 795 QAPNNNRNITTNERAASVTLFHLVTGMTYKIRVAARSNGGVGVSHGTSEVIMNQDTLEKH 854 

Qy 870 PPRAHPSGTHDGRHEGQDLTYH-NNGNIPPGDINPTTHKRTTDYLSGPWLMVLVCIVLL 927 
: I : I I ::| hi : :| 

Db 855 LA AQQENESFLYGLINRSHVP VIVIVAIL 883 

Qy 928 VLVISAAISMVYFRRRHQMTKELGHLSWSDNEITALNINSKESLW IDHHRGWRT 982 

:: : h h: : ::| : : | || : : : | 
Db 884 IIFWniAYCYWRNSRNSDGKDRSFIKINDGSVHMASNN — LWDVAQNPNQNPMYNT 939 

Qy 983 ADTDRDSGLSESRLLSHVNSSQSNYNNSD--GGT DYAEV— DIRNLTTF 1027 

I : : I I ::| :|| II lh: ::|| 

Db 940 AGROTMNNRNGQALYSLTPNAQDFFNNCDDYSGTMHRPGSEHHYHYAQLTGGPGNAMSTF 999 

Qy 1028 YNCRRSPDNPTPYATTMIIGTSSSETCTRTTSISADRDSGTHSPYSDAFAGQVPAVPWR 1087 

I : hhlllll :: : I II : 

Db 1000 YG-NQYHDDPSPYATTTLV LSNQQPA--WLN 1027 

Qy 1088 SNYLQYPVEPINWSEFLPPPPEHPPPSSTYGYAQGSPESSRKSSKSAGSGISTNQSILNA 1147 

hill I III lh : I II I I I I II 
Db 1028 DKMLRAPAMPTN PVPPE - - PPARYADHTAG • • RRSRSSRASDGRG TLNG 1072 

Qy 1148 SIHSSSSGGFSAWGVSPQYAVACPPENVYSNPLSAVAGGTQNRYQITPTNQHPPQL 1203 

:| :|l : II I, : II : : I : II 

Db 1073 GLHHRTSGSQRS DSPPHTDVSrVQLHSSDGTGSSRERIGERRTPPNKTLMD 1123 

Qy 1204 --PAYFATTGPGGAVPPNHLPFATQRHAASEYQAGLNAARCAQSRACNSCDALATPSPM 1260 



I III I : Ihl 

Db 1124 FIPPPPSNPPPPGGHVYDDIFQTATRRQ- • 



■ 1165 



Qy 1261 ( 

I :l : I I: III I 
Db 1166 VSDGAFARVDVNA— RPTSRNRNL-- 

Qy 1321 SARQRGGHHRRRAP 1334 

h: I : I 
Db 1213 SSEADGENSEGDVP 1226 



I I : I ::: I 



RESULT 5 
O89026 
ID O89026 PRELIMINARY; PRT; 1612 AA. 
AC O89026; 
DT 01-NOV-1998 (TrEMBLrel. 08, Created) 
DT 01-NOV-1998 (TrEMBLrel. 08, Last sequence update) 
DT 01-OCT-2000 (TrEMBLrel. 15, Last annotation update) 
DE DUTT1 PROTEIN. 
GN ROBO1 OR DUTT1. 
OS Mus musculus (Mouse). 
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus. 
OX NCBI_TaxID=10090; 
RN [1] 
RP SEQUENCE FROM N.A. 
RC TISSUE=BRAIN; 
RA Wu M.C., Lowe N., Fordham R., Babbitts P.; 
RT "The mouse homologue of human DUTT1/H-robo1 gene: protein sequence and 
RT chromosomal location."; 
RL Submitted (JUL-1998) to the EMBL/GenBank/DDBJ databases. 
DR EMBL; Y17793; CAA76850.1; -. 
DR HSSP; P56276; 1TLK. 
DR MGD; MGI:1274781; Robo1. 
DR INTERPRO; IPR001777; -. 
DR INTERPRO; IPR003006; -. 
DR PFAM; PF00041; fn3; 3. 
DR PFAM; PF00047; ig; 5. 
SQ SEQUENCE 1612 AA; 176406 MW; 5F2988C544796B4B CRC64; 



Query Match 21.6%; Score 1607.5; DB 11; Length 1612; 

Best Local Similarity 30.9%; Pred. No. 1.5e-99; 

Matches 425; Conservative 189; Mismatches 485; Indels 277; Gaps 

Qy 56 PRIIEHPTDLVVKKNEPATLNCRVEGRPEPTIEWFRDGEPVST--NERRSHRVQFKDGAL 113 

111:111:11:1 I MINIM Ihl llllhl II I I :: :llh hi 
Db 29 PRIVEHPSDLIVSKGEPATLNCRAEGRPTPTIEWYRGGERVETDKDDPRSHRMLLPSGSL 85 

Qy 114 FFYRTMQGRREQ-DGGEYWCVARNRVGQAVSRHASLQIAVLRDDFRVEPKDTRVARGETA 172 

II I : hi' : II I Hi: : I : I M :|||::hMIIII II II II I 
Db 89 FFLRIVHGRKSRPDEGVYICVARNYLGEAVSHNASLEVAILRDDFRQNPSDVMVAVGEPA 148 

Qy 173 LLECGPPRGIPEPTLIWIRDGVPLDDLKAMSFGASSRVRIVDGGNLLISNVEPIDEGNYR 232 

::M Ihl Mlh I III Mil h I II hh I I I 

Db 149 VMECQPPRGHPEPT ISWRRDGSPLDD KDERITI - RGGKLMITYTRKSDAGKYV 200 

Qy 233 CIAQNLVGTRESSYAKLIVQVKPYFMREPRDQVMLYGQTATFHCSVGGDPPPKVLWKKEE 292 

h hll Ml hi I :| hi I : : MM MM I hh: 
Db 201 CVGTNMVGERESEVAELTVLERPSFVRRPSNLAVTVDDSAEFKCEARGDPVPTVRWRKDD 260 

Qy 293 GNIPVSRARILHDEKSLEISNITPTDEGTYVCEAHNNVGQISARASLIVHAPPNFTKRPS 352 

I : || I- I: MM M I hi I II lh I hi I Ihl I 
Db 251 GELPKSRYEI-RDDHTLKIRKVTAGDMGSYTCVAENMVGKAEASATLTVQEPPHFWKPR 319 

Qy 353 NRKVGLNGWQLPCMASGNPPPSVFWTKEGVSTLMF- ■ -PNSSHGRQYVAADGTLQITDV 409 
' :: II I'; I hill ! : : I E Ml hi I I I h II Ihl 

Db 320 DQWALGRTVIFQCEATGNPQPAIFPREGSQNLLFSYQPPQSSSRFSVSQTGDLTITNV 379 

Qy 410 RQEDEGYYVCSAFSWDSSTVRVFLQVSSV-DERPPPIIQIGPANQTLPRGSVATLPCRA 468 

:: I llhl M I : :hh I Mllhh II III: I I I 
Db 380 QRSDVGYYICQTLNVAGSIITKAYLEVTDVIADRPPPVIRQGPVNQTVAVDGTLILSCVA 439 
Qy 469 TGNPSPRIKWFHDGHAVQA-GNRYSIIQGSSLRVDDLQLSDSGTYTCTASGERGETSWAA 527 

||:|:| M II I :| :: M '\ hi MM II 
440 TGSPAPTILWRKDGVLVSTQDSRIKQLESGVLQIRYAKLGDTGRYTCTASTPSGEATWSA 499 

528 TLTVEKPG-STSLHRAADPSTYPAPPGTPKVLNVSRTSISLRWAKSQEKPGAVGPIIGYT 585 

: |:: I I ||: |: I hi :||: ■■■\ I I : I 
500 YIEVQEFGVPVQPPRPTDPNLIPSAPSKPEVTDVSKNTVTLSW- - -QPNLNSGATPTSYI 556 

587 VEYFSPDLQTGWIVAAHRVGDTQVTISGLTPGTSYVFLVRAENTQGISVPSGLSNVIKTI 646 

:| II : I II I I II I hlllll I III II :h :ll 
557 IEAFSHASGSSWQTAAENVKTETFAIKGLRPNAIYLFLVRAANAYGISDPSQISDPVRTQ 616 

647 EADFDAASANDLSAARTLLTGRSV-ELIDASAINASAVRLEWMLHVSADEKYVEGLRIHY 705 

: : : Mill:: :;:|:| : I I :|::| :| I 
617 DVPPTSQGVDHKQVQREL—GNWLHLHNPTILSSSSVEVHWT-'VDQQSQYIOGYKILY 672 


706 R - * DASVPS AQY#S ITVMDASAESFWGNLRKYTKYEFFLTPFFET IEGQPSNSKTALTY 763 

: II :::' I : I h :hl II III -\ I III 
673 RPSGASHGESEWLVFEVRTPTKNSWIPDLRKGVNYEIKARPFFNEFQGADSEIKFAKTL 732 

764 EDVPSAPPDNIQIGMY—NQTAGWVRWTPPPSQHHNGNLYGYKIEVSAGNTMKVLANMTL 821 

h Mill : I II I I III II : lh III I h 
733 EEAPSAPPRSVTVSKNDGNGTAILVTWQPPPEDTQNGM7QEYKV-WCLGNETKYHINKIV 791 

822 NATTTSVLLNNLTTGAVYSVRLNSFTKAGDGPYSRPISLFMDPTHHVHPPRAHPSGTHDG 881 

: :| lh: :l I 111 : M III hi : :l 
792 DGSTFSWIPSLVPGIRYSVEVAASTGAGPGVKSEPQFIQLD 833 

882 RHEGQDLTYHNNGN-IPPGDINPTTHKKTTDYLSGP WLMVLVCIVLLVLV 930 

::|| : I I : :: :| : I |::::| : I 

834 SHGNPVSPED-QVSLAQQISDWRQPAFIAGIGMCWIILMVFSIWL-- 879 

931 ISAAISMVYFRRKHQ MTKELGHLSWSDNEITALNINSK 969 

I II': : : .| :| I III: 

880 YRHRKKRNGLTSTYAGIRRVPSFTFTPTVTYQRGGEAVSSGGRPGLLNISEP 931 

970 ESL-KI DHHRGWRTADTDKDSGLSESKLLSHVNSSQ ■ ■ SNYNNS 1010 

: |: ::l :| hi I :: = =1111 

932 ATQPWI^TWPNTGNNHNDCSINCCTAGNGNSDSNLTTYSRPADCIANYNNQLDNKQTNL 991 



1011 -DGGTDYAEVDTRNLTTFYNCRRSPD- 

I Mil I lh 



—NPTPYATTMII 1046 

MLPESTVYGDVDLSNKINEMKTFNSPNLKDGRFVNPSGQPTPYATTQLIQANLSNNMNNG 1051 

1047 -GTSS SETCTKTTSISADKDSGTHSPY 1072 

I || ' ::| I : I I I 

1052 AGDSSEKIWKPPGQQKPEVAPIQYNIMEQNKLNRDYRANDTIPPTIPYNQSYDQNTGGSY 1111 



K1073 SDAFAGQVPAVPWKSNYLQYPVEP INWSEFLPPPPEHPPPSSTYGYAQGSPESSR 1128 
: : I : Ml :||:: Mil IM I 
1112 NSSDRGSSTSGSQGHKKGARTPKAPKQGGMNWADLLPPPPAHPPPHS 1158 

Qy 1129 RSSKSAGSGISTNQSILNASIHSSSSGGFSAWGVSPQYAVACPPENVYSNPLSAVAGGTQ 1188 

I I h I : II :| I 

Db 1159 NSEEYNMSVDES YDQEMPCPVPPAPMYLQ Q 1188 

Qy 1189 NRYQITPTNQHPPQLPAYFATTGPGGAVPPNHLPFATQRHAASEYQAGLNAARCAQSRAC 1248 

: I :: II M I II :| II 
Db 1189 DELQ-EEEDERGPTPPVRGAASSP-AAVSYSHQSTAT 1223 

Qy 1249 NSCDALATPSP—MQP PPPVPVPEGWYQPVHPNSHPMHPTSSNH 1290 

Ml :|l II h III I I II I 

Db 1224 LT PSPQEELQPMLQDCPEDLGHMPHP - • ■ PDRRRQPVSP ■ PPPPRPISPPH 1270 



RESULT 6 
Q9Y6N7 

ID Q9Y6N7 PRELIMINARY; PRT; 1651 AA. 
AC Q9Y6N7; 

DT 01-NOV-1999 (TrEMBLrel. 12, Created) 

DT 01-NOV-1999 (TrEMBLrel. 12, Last sequence update) 

DT 01-OCT-2000 (TrEMBLrel. 15, Last annotation update) 



1. 



OS Homo sapiens (Human). 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606; 

RN [1] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=98117249; PubMed=9458045; 

RA Kidd T., Brose K., Mitchell K.J., Fetter R.D., Tessier-Lavigne M., 

RA Goodman C.S., Tear G.; 

RT "Roundabout controls axon crossing of the CNS midline and defines a 

RT novel subfamily of evolutionarily conserved guidance receptors."; 

RL Cell 92:205-215(1998). 

DR EMBL; AF040990; AAC39575.1; -. 

DR HSSP; P56276; 1TLK. 

DR INTERPRO; IPR001777; -. 

DR INTERPRO; IPR003006; -. 

DR PFAM; PF00041; fn3; 3. 

DR PFAM; PF00047; ig; 5. 

SQ SEQUENCE 1651 AA; 180928 MW; 9D98CD7CAB73074D CRC64; 



Query Match 21.4%; Score 1592; DB 4; Length 1651; 

Best Local Similarity 30.2%; Pred. No. 1.7e-98; 

Matches 419; Conservative 188; Mismatches 492; Indels 290; Gaps 39; 

Qy 56 PRIIEHPTDLWKRNEPATLNCRVEGRPEPTIEWFRDGEPVST--NEKKSHRVQFRDGAL 113 

Hhllhlhl I llllllll 11:1 MM III I :: Hlh hi 
Db 68 PRIVEHPSDLIVSKGEPATLNCRAEGRPTPTIEWYKGGERVETDKDDPRSHRMLLPSGSL 127 

Qy 114 FFYRTMQGKKEQ-DGGEYWCVAKNRVGQAVSRHASLQIAVLRDDFRVEPKDTRVAKGETA 172 

II I : hi': I I I III: Mil MhMlllill M Mill 
Db 128 FFLRIVHGRKSRPDEGVYVCVARNYLGEAVSHNASLEVAILRDDFRQNPSDVMVAVGEPA 187 

Qy 173 LLECGPPKGIPEPTLIWIKDGVPLDDLRAMSFGASSRVRIVDGGNLLISNVEPIDEGNYK 232 

::|l I: Ml: I III Ml h I II IM I I I 

Db 188 VMECQPPRGHPEPTISWKKDGSPLDD KDERITI - RGGKLMITYTRKSDAGKYV 239 

Qy 233 CIAQNLVGTRESSYAKLIVQVKPYFMREPRDQVMLYGQTATFHCSVGGDPPPKVLWKKEE 292 

I: hll III hi I :| 1:1 I : : :| II III I I hi" 
Db 240 CVGTNMVGERESEVAELTVLERPSFVRRPSNLAVTVDDSAEFRCEARGDPVPTVRWRKDD 299 

Qy 293 GNIPVSRARILHDERSLEISNITPTDEGTYVCEAHNNVGQISARASLIVHAPPNFTKRPS 352 

Ml II I: :hl :| I hi I M lh I hi I 1 1 : 1 = 1 
Db 300 GELPKSRYEI-RDDHTLKIRKVTAGDMGSYTCVAENMVGKAEASATLTVQEPPHFWKPR 358 

Qy 353 NKKVGLNGWQLPCMASGNPPPSVFWTREGVSTLMF— PNSSHGRQYVAADGTLQITDV 409 

:: | I I' I hill Mil :|| hi I II h M Ml 
Db 359 DQWALGRTVTFQCEATGNPQPAIFWRREGSQNLLFSYQPPQSSSRFSVSQTGDLTITNV 418 

Qy 410 RQEDEGYYVCSAFSWDSSTVRVFLQVSSV-DERPPPIIQIGPANQTLPKGSVATLPCRA 468 

:: | I!:! :| I : :|:h I : 1 1 M : I : II IM I I I 
Db 419 QRSDVGYYICQTLNVAGSIITRAYLEVTDVIADRPPPVIRQGPVNQTVAVDGTFVLSCVA 478 

Qy 469 TGNPSPRIKWFHDGHAVQA-GNRYSIIQGSSLRVDDLQLSDSGTYTCTASGERGETSWAA 527 

1 1 : 1 I I h II I :| :: M :l hi M II II Ml 
Db 479 TGSPVPTILWRKDGVLVSTQDSRIKQLENGVLQIRYAKLGDTGRYTCIASTPSGEATWSA 538 

Qy 528 TLTVEKPG-STSLHRAADPSTYPAPPGTPKVLNVSRTSISLRWAKSQEKPGAVGPIIGYT 586 

: |::| I II: h I hi Ml :M I I : I 
Db 539 YIEVQEFGVPVQPPRPTDPNLIPSAPSKPEVTDVSRNTVTLSW— QPNLNSGATPTSYI 5S5 

Qy 587 VEYFSPDLQTGWIVAAHRVGDTQVTISGLTPGTSYVFLVRAENTQGISVPSGLSNVIKTI 645 

:| || : | II Ml I hlllll I III II M M 
Db 596 IEAFSHASGSSWQTVAENVKTETSAIRGLRPNAIYLFLVRAANAYGISDPSQISDPVKTQ 655 

Qy 647 EADFDAASANDLSAARTLLTGKSVELIDASAINASAVRLEWMLHVSADEKYVEGLRIHYR 706 

: : .| : | : | : : :::|:: : I I :|::| :| h 
Db 656 DV-LPTSQGVDHKQVQRELGNAVLHLHNPTVLSSSSIEVHWT--VDQQSQYIQGYKILYR 712 

Qy 707 DASVPSAQYHS ITVMDASAESFWGNLRRYTKYEFFLTPFFETIEGQPSNSKTA 760 

|| f: I : I h :hl II III HIM 
Db 713 — PSGANHGESDWLVFEVRTPAKNS WI PDLRKGVNYEI KARPFFNEFQGADSE IKFA 768 

Qy 761 LTYEDVPSAPPDNIQIGMY--NQTAGWVRWTPPPSQHHNGNLYGYRIEVSAGNTMKVIAN 818 

I I: Nil! : : I I! I I III II : II: II : I 
Db 769 KTLEEAPSAPPQGVTVSKNDGNGTAILVSWQPPPEDTQNGMVQEYKV-WCLGNETRYHIN 827 

Qy 819 MTLNATTTSVLLNNLTTGAYYSVRLNSFTKAGDGPYSKPISLFMDPTHHVHPPRAHPSGT 878 

I- :| II- I I III : : I II I hi : :l II 
Db 828 KTVDGSTFSWIPFLVPGIRYSVEVAASTGAGSGVKSEPQFIQLD AH 874 

Qy 879 HDGRHEGQDLTYHNNGN - 1 PPGDINPTTHKKTTDYLSGP WLMVLVCIVLL 927 

II : I I : :: :| : I |::::| : I 

Db 875 GNPVSPED-QVSLAOQISDWKOPAFIAGIGAACWIILMVFSIWL 918 

Qy 928 VLVI SAAI SMVYFKRKHQ MTKELGHLSWSDNEITALNI 966 

Ml: :!:!:! Ill 

Db 919 YRHRKKRNGLTSTYAG I RKVPSFTFTPTVT yQRGGEAVS SGGRPGLLNI 967 

Qy 967 NSKESL-WI DHHRGWRTADTDKDSGLSESKLLSHVNSSQ - - SNYNNS 1010 
: = I: ::| : :| |:| I :: : :|||l 
Db 968 SEPAAQPWLADTWPNTGNNHNDCSISCCTAGNGNSDSNLTTYSRPADC'IANYNNQLDNKQ 1027 

Qy 1011 DGGTDYAEVDTRNLTTFYNCRKSPD NPTPYA§MIIGTSSSETC 1054 

I I :ll I II: llllll :| I 

Db 1028 TNLMLPESTVYGDVDLSNKINEMKTFNSPNLKDGRFVNPSGQPTPYATE QLIQSNLSNNM 1087 

Qy 1055 TKTTSISADK DSGTH 1069 

: I :| II 
Db 1088 NNGSGDSGERHWKPLGQQKQEVAPVQYNIVEQNRLNKDYRANDTVPPTIPYNQSYDQNTG 1147 

1070 SPYSDAFAGQVPAVPWRSNYLQYPVEP- • - ■ INWSEFLPPPPEHPPPSSTYGYAQGSPE 1125 

I: : I : : II :ll:: I ! 1 1 1 MM I 
1148 G SY NS SDRGS ST SGSQGHKKG ART PKVPKQGGMNWADLLPP PP AHPP PHS 1197 


1126 SSRKSSKSAGSGISTNQSILNASIHSSSSGGFSAWGVSPQYAVACPPENVY 1176 

'111:1 : 11:1 

1198 7j-NSEEYNISVDES YDQEMPCPVPPARMYLQQDELEEE 1233 

1177 [sNPLSAVAGGTQNKYQITPTNQH - • PPQLPAYFATTGPGGAVPPNH 1220 

I:! :lh I: :||: I I I II I 
1234 EDERGPTPPVRGA$SSP-AAVSYSHQSTATLTPSPQEELQPMLQDCPEETG H 1284 

1221 LPFATQRHAASEYQAGtiNAARCAQSRACNSCDALATPSPMQPPPPVPVPEGWYQPVHPNS 1280 

: I " I: llll II II 

1285 MQHQPDRRR ;; QPVSPPPP-PRP— ISPPHTYG 1312 

1281 HPMHPTSSN 1289 
: I I: ■ 
1313 YISGPLVSD 132L* 

RESULT 7 
O55005 
ID O55005 PRELIMINARY; PRT; 1651 AA. 
AC O55005; 
DT 01-JUN-1998 (TrEMBLrel. 06, Created) 
DT 01-JUN-1998 (TrEMBLrel. 06, Last sequence update) 
DT 01-OCT-2000 (TrEMBLrel. 15, Last annotation update) 
DE TRANSMEMBRANE RECEPTOR ROBO1. 
OS Rattus norvegicus (Rat). 
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Rattus. 
OX NCBI_TaxID=10116; 
RN [1] 
RP SEQUENCE FROM N.A. 
RC TISSUE=SPINAL CORD; 
RX MEDLINE=98117249; PubMed=9458045; 
RA Kidd T., Brose K., Mitchell K.J., Fetter R.D., Tessier-Lavigne M., 
RA Goodman C.S., Tear G.; 
RT "Roundabout controls axon crossing of the CNS midline and defines a 
RT novel subfamily of evolutionarily conserved guidance receptors."; 
RL Cell 92:205-215(1998). 
DR EMBL; AF041082; AAC39960.1; -. 
DR HSSP; P56276; 1TLK. 
DR INTERPRO; IPR001777; -. 
DR INTERPRO; IPR003006; -. 
DR PFAM; PF00041; fn3; 3. 
DR PFAM; PF00047; ig; 5. 
KW Transmembrane. 
SQ SEQUENCE 1651 AA; 180746 MW; FA2452DD46E186B7 CRC64; 



Query Match 21.3%; Score 1585; DB 11; Length 1651; 

Best Local Similarity 29.0%; Pred. No. 5.1e-98; 

Matches 456; Conservative 207; Mismatches 538; Indels 374; Gaps 

Qy 44 NGLPA '-. VRGQYQSPRIIEHPTDLWKRNEPAMCRVEGRPEPTI 87 

II II : :l : 111:111:11:1 I llllllll M : i III 

Db 40 NGTPAPTSDNDDNSLGYTGSRLRQEDFPPRIVEHPSDLIVSRGEPATLNCKAEGRPTPTI 35 

Qy 88 EWFKDGEPVST • • NEKKSHRVQFKDGALFF YRTMQGKKEQ - DGGEYWCVAKNRVGQAVSR 14 i 

M:l II I. I' :: :|||: 1 : 1 1 1 | : |:| : | | | |||:| :|:||| 
Db 100 EWYKGGERVETDRDDPRSHRMLLPSGSLFFLRIVHGRKSRPDEGVYICVARNYLGEAVSH 159 

Qy 145 HASLQIAVLRDDFRVEPKDTRVAKGETALLECGPPKGIPEPTLIWIKDGVPLDDLKAMSF 204 

:|||::|:||.llll II llll l::|l 1 1 : 1 llll: I III llll 
Db 160 NASLEVAILRDDFRQNPSDVMVAVGEPAVMECQPPRGHPEPTISWKRDGSPLDD 213 

Qy 205 GASSRVRIVDGGNLLISNVEPIDEGNYKCIAQNLVGTRESSYAKLIVQVKPYFMKEPKDQ 264 

I: I II hi: Mil: Ml III I : I :l hi I : 
Db 214 -KDERITI-R3GKLMITYTRKSDAGKYVCVGTNMVGERESKVADVTVLERPSFVKRPSNL 271 

Qy 265 VMLYGQTATFHCSVGGDPPPKVLWKKEEGNIPVSRARILHDEKSLEISNITPTDEGTYVC 324 

: :l LI III I hl::l :l II I I: :hl :l I hi I 
Db 272 AVTVDDSAEFKCEARGDPVPTFGWRKDDGELPKSRYEI ■ RDDHTLKIRKVTAGDMGSYTC 330 

Qy 325 EAHNNVGQISARASLIVHAPPNFTKRPSNKKVGLNGWQLPCMASGNPPPSVFWTKEGVS 384 

I I II: I' hi I Ihl :| :: I I I I hill |::|l :ll 
Db 331 VAENMVGKAEASATLTVQEPPHFWKPRDQWALGRTVTFQCEATGNPQPAIFWRREGSQ 390 

Qy 385 TLMF- - -PNSSHGRQYVAADGTLQITDVRQEDEGYYVCSAFSWDSSTVRVFLQVSSV-D 440 

hi lllhll :hh: I IM :| I : :hh I 
Db 391 NLLFSYQPPQSSSRFSVSQTGDLTVTPQRSDVGYYICQTLNVAGSIITRAYLEVTDVIA 450 

Qy 441 ERPPPI IQIGPANQT LPKGSVATLPCRATGNPSPRI KWFHDGHAVQA- GNRYS I IQGSSL 499 

: 1 1 1 1 : 1 : II. Ilh II I llhl llllll :| :: I 
Db 451 DRPPPVIRQGPVNQTVAVDGTLTLSCVATGSPVPTILWRKDGVLVSTQDSRIKQLESGVL 510 

Qy 500 RVDDLQLSDSGTYTCTASGERGETSWAATLTVEKPG-STSLHRAADPSTYPAPPGTPKVL 558 

:: M hi llllll II :hl : |:: I I II: h I hi 
Db 511 QIRYAKLGDTGRYTCTASTPSGEATWSAYIEVQEFGVPVQPPRPTDPNL1PSAPSKPEVT 570 

Qy 559 NVSRTSISLRWAKSQEKPGAVGPIIGYTVEYFSPDLQTGWIVAAHRVGDTQVTISGLTPG 618 

:!!: ::H hi: I :| II : I I I Ml I 
Db 571 DVSKNTVTLLW---QPNLNSGATPTSYIIEAFSHASGSSWQTVAENVRTETFAIKGLKPN 627 

Qy 619 TSYVFLVRAENTQGISVPSGLSNVIRTIEADFDAASANDLSAARTLLTGKSV-ELIDASA 677 

hlllll I III II :h :H : : Mill:: 

Db 628 AIYLFLVRAANAYGISDPSQISDPVKTQDVPPTTQGVDHRQVQREL-GNWLHLHNPTI 685 

Qy 678 INASAVRLEWMLHVSADERYVEGLRIHYK--DASVPSAQYHSITVMDASAESFWGNLRR 735 

:::|:| : l"- I :|::l :| h II ::: I : I h :|:| 
Db 686 LSSSSVEVHWT--VDQQSQYIQGYKILYRPSGASHGESEWLVFEVRTPTRNSWIPDLRR 743 

Qy 736 YTRYEFFLTPFFETIEGQPSNSRTALTYEDVPSAPPDNIQIGMY--NQTAGWVRWTPPPS 793 

II III :| I Mil: Mil :: : Ml I III 
Db 744 GVNYEIRARPFFNEFQGADSEIKFAKTLEERPSAPPRSVTVSKNDGNGTAILVTWQPPPE 803 

Qy 794 QHHNGNLYGYXIEVSAGNTMRVLAMLNATTTSVLLNNLTTGAVYSVRLNSFTRAGDGP 853 

II : li: II : I I" :l lh: I I III : : I III 
Db 804 DTQNGMVQEYXV-WCLGNETRYHINKTVDGSTFSWIPFLVPGIRYSVEVAASTGAGPGV 862 

Qy 854 YSKPISLFMDPTHHVHPPRAHPSGTHDGRHEGQDLTYHNNGN-IPPGDINPTTHRRTTDY 912 

hi : :h ::ll Ml : :: :| 

Db 863 RSEPQFIQLD SHGNPVSPED-QVSLAQQISDV 893 
Qy 913 LSGP- 



-WLMVLVCIVLLVLVISAAISMVYFKRKHQ 945 

: I I::::| : I III: 
Db 894 VKQPAFIAGIGAACWIILMVFSIWL YRHRKKRNGLSSTYAGIRKVPSFT 942 

Qy 946 — MTKELGHLSWSDNEITALNINSKESL-WIDHHRGW-RTADTDKD SG 990 

:| : I :l I III: : I: I I :: I :| 
Db 943 FTPTVTYQRGGEAVSSGGRPGLLNISEPATQPWLAD--TWPNTGNSHNDCSINCCTASNG 1000 

Qy 991 LSESKLLSHVNSSQ- -SNYNNS DGGTDYAEVDTRNLTTFYNCRKSPD— 1035 

hi I :: : :llll I I HI I II: 

Db 1001 NSDSKLTTYSRPADCIANYNNQLDNKQTtMLPESTVYGDVDLSilKINEMKTPNSPNLKD 1060 

Qy 1036 NPTPYATTMII GTSSSE 1052 

IIMIII :! I III 

Db 1061 GRFVNPSGQPTPYATTQLIQANLINNMNNGGGDSSEKHWKPPGQQKQEVAPIQYNIMEQN 1120 

Qy 1053 TCTRTTSISADKDSGTHSPYSDAFAGQVPAVPVVRSNYLQYPVEP 1 1098 

I I : I I I: : I : : I I : 

Db 1121 KLNKDYRANDTILPTIPYNHSYDQNTGGSYNSSDRGSSTSGSQGHRRGARTPRAPRQGGM 1180 

Qy 1099 NWSEFLPPPPEHPPPSST YGYAQGSP ESSRKSSK 1132 

m lh: Mill llll I ■ III I : : 

Db 1181 NWADLLPPPPAHPPPHSNSEEYSMSVDESYDQEMPCPVPPARMYLQQDELEEEEAERGPT 1240 

Qy 1133 SAGSGISTNQSILNASIHSSSSGGFSAWGVSPQYAVA CPPENVYSNPLSAVAGGT 1187 

I :::::: I I I: : III : II : I : 
Db 1241 PPVRGAASSPAAVSYS - HQST — ATLTPS PQEELQPMLQDCPED LGHMPHPP 1289 

Qy 1188 QNRYQITPTNQHPPQLP AYFATTGP 1212 

111:111 I :|| 
Db 1290 DRRRQ--PVSPPPPPRPISPPHTYGYISGPLVSDMDTDAPEEEEDEADMEVARMQTRRLL 1347 

Qy 1213 GGAVPPNHL PFATQR 1227 

I I ::: II 
Db 1348 LRGLEQTPASSVGDLESSVTGSMINGWGSASEEDNISSGRSSVSSSDGSFFTDADFAQAV 1407 

Qy 1228 HAASEYQAGLNAARCAQSRACNSCDALATPSPMQPPPPVPVPEGWYQPVHPNSHPMHPTS 1287 

11:11 III II I hi :| II I 

Db 1408 AAAAEY-AGLKVARRQMQDAAGRRHFHASQCP-RPTSPV STD 1447 



Qy 1288 £ 

II h :| :: II 
Db 1448 SN MSAAVIQRARPTRKQKHQ- 

Qy 1340 MESENENMLAEYEQR 1354 

::| : h I I 
Db 1490 IKSPSVQSKAQLEAR 1504 



•■••PWQPC 1339 

II III II I 

■ PGHLRREAYTDDLPPPPVPPPA 1489 



RESULT 8 
Q9QZI3 
ID Q9QZI3 PRELIMINARY; PRT; 1060 AA. 
AC Q9QZI3; 
DT 01-MAY-2000 (TrEMBLrel. 13, Created) 
DT 01-MAY-2000 (TrEMBLrel. 13, Last sequence update) 
DT 01-OCT-2000 (TrEMBLrel. 15, Last annotation update) 
DE REPULSIVE GUIDANCE RECEPTOR (FRAGMENT). 
OS Rattus norvegicus (Rat). 
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Rattus. 
OX NCBI_TaxID=10116; 
RN [1] 
RP SEQUENCE FROM N.A. 
RX MEDLINE=99200391; PubMed=10102268; 
RA Brose K., Bland K.S., Wang K.H., Arnott D., Henzel W., Goodman C.S., 
RA Tessier-Lavigne M., Kidd T.; 
RT "Slit proteins bind Robo receptors and have an evolutionarily 
RT conserved role in repulsive axon guidance."; 
RL Cell 96:795-806(1999). 
DR EMBL; AF182037; AAF04558.1; -. 
DR HSSP; P56276; 1TLK. 
DR INTERPRO; IPR001547; -. 
DR INTERPRO; IPR001777; -. 
DR INTERPRO; IPR003006; -. 
DR PFAM; PF00041; fn3; 3. 
DR PFAM; PF00047; ig; 5. 
DR PRINTS; PR00014; FNTYPEIII. 
DR PROSITE; PS00659; GLYCOSYL_HYDROL_F5; UNKNOWN_1. 
KW Receptor. 
FT NON_TER 1060 1060 
SQ SEQUENCE 1060 AA; 116790 MW; C4BC8C11E8542DA4 CRC64; 



Query Match 20.2%; Score 1499; DB 11; Length 1060; 

Best Local Similarity 34.7%; Pred. No. 1.7e-92; 

Matches 361; Conservative 175; Mismatches 398; Indels 106; Gaps 28; 

Qy 56 PRIIEHPTDLWKKNEPATLNCKVEGKPEPTI EWFKDG - EPVSTNEKKSHRVQFKD 110 

h :! h:::l I :| I I I :hl III I I : : I : 
Db 31 PKXVEQPSEVIVSKGRPNTPNWKQKGRPFPTIGRVQRMVKPGWDKTKDDSKVTQGCLLPS 90 

Qy 111 GALFFYRTMQGKKEQ - DGGEYWCVAKNRVGQAVS RHASLQ I AVLRDDFRVEPKDTRVAKG 169 

hill I : h: : I I I llhl : h 1 1 1 h 1 1 1 : : 1 : 1 1 1 1 1 1 I I II I 
Db 91 GSLFFLRIVHGRRSKPDEGTYVCVARNYLGEAVSRNASLEVALLRDDFRQNPTDVWAAG 150 

Qy 170 ETALLECGPPKGIPEPTLIWIKDGVPLDDLKAMSFGASSRVRIVDGGNLLISNVEPIDEG 229 

I hill Ihl lllh I II I :h h I II hill I I 

Db 151 EPAILECQPPRGHPEPTIYWKKDKVRIDE KEERISI-RGGKLMISNTRKSDAG 2?2 

Qy 230 NYKCIAQNLVGTRESSYAKLIVQVKPYFMKEPRDQVMLYGQTATFHCSVGGDPPPKVLWK 269 

I h hi! hi hi I :l h: I :lhl : I I I I III I I II 
Db 203 MYTCVGTNMVGERDSDPAELTVFERPTFLRRPINQWLEDEPAEFRCQVQGDPQPTVRWK 262 

Qy 290 KEEGNIPVSRARILHDE KSLEISNIT PTDEGT Y VCEAHNNVGQ I SARASLIVHAPPNFTK 349 

h: ::| I I h :l I IIMIII I I lh: I hi I III I 
Db 263 KDDADLPRGRYDI - RDDYTLRIKKAISADEGTYVCIAENRVGRVEASATLTVRAPPQFW 321 

Qy 350 RPSNKKVGLNGWQLPCMASGNPPPSVFWTREGVSTLMFPN- - -SSHGRQYVAADGTLQI 406 

II :: I I II III hill III hill : I h III 

Db 322 RPRDQIVAQGRTVTFPCETRGNPQPAVFWQREGSQNLLFPNQPQQPNSRCSVSPTGDLTI 381 

Qy 407 TDVKQEDEGYYVCSAFSWDSSTVRVFLQVSSV-DERPPPIIQIGPANQTLPKGSVATLP 465 

I:::: I llhl Ml I : |:|: I :|||||| II Mil I I 
Db 382 TNIQRSDAGYYICQALTVAGSILARAQLEVTDVLTDRPPPIILQGPINQTLAVDGTALLR 441 

Qy 466 CRATGNPSPRIRWFHDGHAVQAGNRYSIIQG-SSLRVDDLQLSDSGTYTCTASGERGETS 524 

hill Mil: : : II :|:: :|::lhlllll h llll 
Db 442 CRATG-PLPVISWLREGFTFLGRDPRATIQDQGTLQIRNLRISDTGTYTCVATSSSGETS 500 

Qy 525 WAATLTVEKPGSTSLHRAADPSTYPAPPGTPKVLNVSRTSISLRWARSQEKPGAVGPIIG 584 

hi I I : l.:l : : I : I II hi :h: h:l I II : I 
Db 501 WSAVLDVTESGAT-ISKNYDTNDLPGPPSRPQVTDVTRNSVTLSWQTG--TPGVL-PASA 556 

Qy 585 YTVEYFSPDLQTGWIVAAHRVGDTQVTISGLTPGTSYVFLVRAENTQGISVPSGLSNVIR 644 

I :| II : I I: Mill I Ihl II M :: 

Db 557 YIIEAFSQSVSNSWQTVANHVKTTLYTVRGLRPNTIYLFMVRAINPQGLSDPSPMSDPVR 616 
i 

Qy 645 TIEADFDAAS/iNDLSAARTLLTGRSVELIDASAINASAVRLEWMLHVSADERYVEGLRIH 704 

I : Ml : | :| | : : : |;M I ::::| h 
Db 617 TQDIS-PPAQGVDHRQVQKELGDVTVRLHNPWLTPTTVQVTWT--VDRQPQFIQGYRVM 673 

Qy 705 YRDAS-VPSAQYHSITVMDASAESFWGNLRRYTKYEFFLTPFFETIEGQPSNSRTALT 762 

h I IM ;: : | |; Ml II : hi :| I III I 
Db 674 YRQTSGLQASTVWQNLDARVPTERSAVLVNLRRGVTYEIRVRPYFNEFQGMDSESRTIRT 733 

Qy 763 YEDVPSAPPDNI - ■ -QIGMYNQTAGWVRWTPPPSQHHNGNLYG YRIEVSAGNTMKVLANM 819 

MINI:: : I : I I : I I 1 1 1 : I 1 1 : III II : I 
Db 734 TEEAPSAPPQSVTVLTVGSHNSTSISVSWDPPPADHQNGIIQEYKI-WCLGNETRFHINK 792 

Qy 820 TLNATTTSVLLNNLTTGAVYSVRLNSFTRAGDGPYSRPISLFMDPTHHVHPPRAHPSGTH 879 

h:H lh:' I I II : : I II I hi : : 
Db 793 TVDATIRSW IGGLFPG IQYRVEVAASTSAGVGVKSEPQPI II 835 

Qy 880 DGRHEGQDL1VHNNGNIPPGDINPTTHKKTTDYLSGPWLMVLVCIVLLVLVISAAISMVY 939 

Ihl :|':H : :: || : | : ; |::: :| :| 

Db 836 GGRNE-WITENNN SITEQITDWRQPAFIAGIGGACWVILMGFSI-WLY 883 
Qy 940 FKRK HQMTKELGHLSVVSDNEITA-LNINSKESLWIDHHRGWRTADTDKDSGL 991 

"II : :| : I ::|: II |: || 
Db 884 WRRKKRKGLSNYAVTFQRGDGGLMSNGSRPGLLNTGDPNYPWL ADSWP 931 

Qy 992 SESKLLSHVNSSQSNYNN SDGGTDYAEVDTRNLTTFYNC 1030 

: I ::: II : I III |: :| || || 

Db 932 ATSLPVNNSNSGPNEI6NFGRGDVLPPVP6QGDKTATMLSDGAI -YSSIDFTTKTT -t NS 989 

Qy 1031 RKSPDNPTPYATTMIIGTSS 1050 

MINI I: ::| 
Db 990 SSQITQATPYATTQILHSNS 1009 



RESULT 9 
Q9VQ10 

ID Q9VQ10 PRELIMINARY; PRT; 823 AA. 

AC Q9VQ10; 

DT 01-MAY-2000 (TrEMBLrel. 13, Created) 

DT 01-MAY-2000 (TrEMBLrel. 13, Last sequence update) 

DT 01-OCT-2000 (TrEMBLrel. 15, Last annotation update) 

DE CG5481 PROTEIN. 

GN CG5481. 

OS Drosophila melanogaster (Fruit fly). 

OC Eukaryota; Metazoa; Arthropoda; Tracheata; Hexapoda; Insecta; 

OC Pterygota; Neoptera; Endopterygota; Diptera; Brachycera; Muscomorpha; 

OC Ephydroidea; Drosophilidae; Drosophila. 

OX NCBI_TaxID=7227; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=BERKELEY; 

RX MEDLINE=20196006; PubMed=10731132; 

RA Adams M.D., Celniker S.E., Holt R.A., Evans C.A., Gocayne J.D., 

RA Amanatides P.G., Scherer S.E., Li P.W., Hoskins R.A., Galle R.F., 

RA George R.A., Lewis S.E., Richards S., Ashburner M., Henderson S.N., 

RA Sutton G.G., Wortman J.R., Yandell M.D., Zhang Q., Chen L.X., 

RA Brandon R.C., Rogers Y.-H.C., Blazej R.G., Champe M., Pfeiffer B.D., 

RA Wan K.H., Doyle C., Baxter E.G., Belt G., Nelson C.R., Miklos G.L.G., 

RA Abril J.F., Agbayani A., An H.-J., Andrews-Pfannkoch C., Baldwin D., 

RA Ballew R.M., Basu A., Baxendale J., Bayraktaroglu L., Beasley E.M., 

RA Beeson K.Y., Benos P.V., Berman B.P., Bhandari D., Bolshakov S., 

RA Borkova D., Botchan M.R., Bouck J., Brokstein P., Brottier P., 

RA Burtis K.C., Busam D.A., Butler H., Cadieu E., Center A., Chandra I., 

RA Cherry J.M., Cawley S., Dahlke C., Davenport L.B., Davies P., 

RA de Pablos B., Delcher A., Deng Z., Mays A.D., Dew I., Dietz S.M., 

RA Dodson K., Doup L.E., Downes M., Dugan-Rocha S., Dunkov B.C., Dunn P., 

RA Durbin K.J., Evangelista C.C., Ferraz C., Ferriera S., Fleischmann W., 

RA Fosler C., Gabrielian A.E., Garg N.S., Gelbart W.M., Glasser K., 

RA Glodek A., Gong F., Gorrell J.H., Gu I., Guan P., Harris M., 

RA Harris N.L., Harvey D., Heiman T.J., Hernandez J.R., Houck J., 

RA Hostin D., Houston K.A., Howland T.J., Wei M.-H., Ibegwam C., 

RA Jalali M., Kalush F., Karpen G.H., Ke Z., Kennison J.A., Ketchum K.A., 

RA Kimmel B.E., Kodira C.D., Kraft C., Kravitz S., Kulp D., Lai Z., 

RA Lasko P., Lei Y., Levitsky A.A., Li J., Li Z., Liang Y., Lin X., 

RA Liu X., Mattei B., McIntosh T.C., McLeod M.P., McPherson D., 

RA Merkulov G., Milshina N.V., Mobarry C., Morris J., Moshrefi A., 

RA Mount S.M., Moy M., Murphy B., Murphy L., Muzny D.M., Nelson D.L., 

RA Nelson D.R., Nelson K.A., Nixon K., Nusskern D.R., Pacleb J.M., 

RA Palazzolo M., Pittman G.S., Pan S., Pollard J., Puri V., Reese M.G., 

RA Reinert K., Remington K., Saunders R.D.C., Scheeler P., Shen H., 

RA Shue B.C., Siden-Kiamos I., Simpson M., Skupski M.P., Smith T., 

RA Spier E., Spradling A.C., Stapleton M., Strong R., Sun E., 

RA Svirskas R., Tector C., Turner R., Venter E., Wang A.H., Wang X., 

RA Wang Z.-Y., Wassarman D.A., Weinstock G.M., Weissenbach J., 

RA Williams S.M., Woodage T., Worley K.C., Wu D., Yang S., Yao Q.A., 

RA Ye J., Yeh R.-F., Zaveri J.S., Zhan M., Zhang G., Zhao Q., Zheng L., 

RA Zheng X.H., Zhong F.N., Zhong W., Zhou X., Zhu S., Zhu X., Smith H.O., 

RA Gibbs R.A., Myers E.W., Rubin G.M., Venter J.C.; 

RT "The genome sequence of Drosophila melanogaster."; 

RL Science 287:2185-2195(2000). 

DR EMBL; AE003586; AAF51373.1; -. 

DR HSSP; P56276; 1TLK. 

DR FLYBASE; FBgn0031341; CG5481. 



DR INTERPRO; IPR001412; -. 

DR INTERPRO; IPR001777; -. 

DR INTERPRO; IPR003006; -. 

DR PFAM; PF00041; fn3; 1. 

DR PFAM; PF00047; ig; 5. 

DR PRINTS; PR00014; FNTYPEIII. 

DR PROSITE; PS00178; AA_TRNA_LIGASE_I; UNKNOWN_1. 

SQ SEQUENCE 823 AA; 89715 MW; 36FC0B91F36F2F19 CRC64; 



Query Match 19.3%; Score 1430; DB 5; Length 823; 

Best Local Similarity 36.4%; Pred. No. 5.2e-88; 

Matches 301; Conservative 141; Mismatches 249; Indels 136; Gaps 19; 

Qy 64 DLWKKNEPATLNCKVEGKPEPTIEWFKDGEPVSTNEKKSHRVQFKDGALFFYRTMQGKK 123 

I I MM 'I II: II I Ihllll : I : |||: | ||| : : :: 
Db 2 DT TVPKNDPFTFNCQAEGNPTPT IQWFKDGRELKT - DTGSHRIMLPAGGLFFLKVI HSRR 60 

Qy 124 EQDGGEYWCVAKNRVGQAVSRHASLQIAVLRDDFRVEPKDTRVAKGETALLECGPPKGIP 183 

I I I III III I I ll:|:l!:| lll:||:ll : I ■ 1 1 : 1 1 ||:||| |:| 
Db 61 ESDAGTYWCEAKNEFGVARSRNATLQVAFLRDEFRLEPANTRVAQGEVALMECGAPRGSP 120 

Qy 184 EPTLIWIKDGVPLDDLKAMSFGASSRVRIVDGGNLLISNVEPIDEGNYKCIAQNLVGTRE 243 

f I : I l:l ; : " : hllllllM I hi |:|: :|:||||| 
Db 121 EPQISWRKNG- QTLNLVGNKRIRIVDGGNLAIQEARQSDDGRYQCWKNWGTRE 174 

Qy 244 SSYAKLIVQVKPYFMKEPKDQVMLYGQTATFHCSVGGDPPPKVLWKK--EEGNIPVSRAR 301 

IMII hi: :: |::| : I : | | ;|||t I |||;: ||:| | 
Db 175 SATAFLKVHVRPFLIRGPQNQTAWGSSWFQCRIGGDPLPDVLWRRTASGGNMP • RRVH 233 

Qy 302 ILHDEKSLEISNITPTDEGTYVCEAHNNVGQISARASLIVHAPPNFTKRPSNKKVGLNGV 361 

= 1 I HI:: "I I I I III I II hi I Mill I II |: I : 
Db 234 VLED-RSLKLDDVTLEDMGEYTCEADNAVGGITATGILTVHAPPKFVIRPKNQLVEIGDE 292 

Qy 362 VQLPCMASGNPPPSVFWTKEGVSTLMFPNSSHGRQYVA ADGTLQITDVRQEDEGYY 417 

I I 1:1:1 l:: = l: II hh I II I II : 1 1 I 

Db 293 VLFECQANGHPRPTLYWSVEGNSSLLLPGYRDGRMEVTLT PEGRSVLS I ARFAREDSGKV 352 

Qy 418 V-CSAFSWDSSTVRVFLQVSSVDERPPPIIQIGPANQTLPKGSVATLPCRATGNPSPRI 476 

I hi : I J : I I : I llllh II Mill |: WW I I I:: 
Db 353 VTCNALNAVG3VSSRTWSVDTQFELPPPIIEQGPVNQTLPVKSIWLPCRTLGTPVPQV 412 

Qy 477 KWFHDGHA--VQAGNRYSIIQGSSLRVDDLQL-SDSGTYTCTASGERGETSWAATLTVEK 533 

I: II II I :: :| : III I I III II |::||: I :: 
Db 413 SWYLDGIPIDVQEHERRNLSDAGALTISDLQRHEDEGLYTCVASNRNGKSSWSGYLRLDT 472 

Qy 534 PGSTSL-HRAADPSTYPAPPGTPKVLNVSRTSISLRWAKSQEKPGAVGPIIGYTVEYFS 591 

I : :: II : Mil III |::: |::| I :| : |: ::|| :| | 
Db 473 PTNPNIKFFRAPELSTYPGPPGKPQMVEKGENSVTLSWTRSNKVGGS--SLVGYVIEMFG 530 

Qy 592 PDLQTGWIVAAHRVGDTQVT ISGLT PGTSYVFLVRAENTQGI SVPSGLSN - *VIKT IEAD 649 

: M: i II :| I :|| II :| Ihllll: hhll :| : |: :: 
Db 531 KNETDGWVAVjTRVQNTTFTQTGLLPGVNYFFLIRAENSHGLSLPSPMSEPITVGTVSSE 590 

Qy 650 FDAA SANDLSAAR-TLLTGKSVELIDASAINASAVRLE 686 

! . lh I III II :||:| III :|| :::::::| 

Db 591 NHSFTLMFPSLIHYFDSLFHPQRYFNSGLDLSEARASLLSGDWELSNASWDSTSMKLT 650 

Qy 687 WML :■— HVSA DEKYVEGLRIHYKDASVP 711 

I : I I = Hill :: : I 

Db 651 WQVCNRLTDGS I AAPHS I AHRHLI RS ASFLMQ I I NGKYVEGFY VYARQLPNPI VNNPAPV 710 

Qy 712 SAQYHSIT 719 

: : :l :l 

Db 711 TSNTNPLLGSTSTSASASASASALISTKPNIAAAGKRDGETNQSGGGAPTPLNTKYRMLT 770 

Qy 720 VMD-ASAESFWGNLKKYTKYEFFLTPFFETIEGQPSNSKTALTYED 765 

<.::: II': I Ml lllh 1 1 ::::| I :| 1 1 1 : I I II 
Db 771 ILNGGGASSCTITGLVQYTLYEFFIVPFYKSVEGKPSNSRIARTLED 817 
RESULT 10 
Q9Z2I4 
ID Q9Z2I4 PRELIMINARY; PRT; 1344 AA. 
AC Q9Z2I4; 
DT 01-MAY-1999 (TrEMBLrel. 10, Created) 
DT 01-MAY-1999 (TrEMBLrel. 10, Last sequence update) 
DT 01-OCT-2000 (TrEMBLrel. 15, Last annotation update) 
DE RIG-1 PROTEIN. 
GN RBIG1. 
OS Mus musculus (Mouse). 
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus. 
OX NCBI_TaxID=10090; 
RN [1] 
RP SEQUENCE FROM N.A. 
RA Yuan S.-S.F., Cox L.A., Dasika G.K., Lee E.Y.-H.P.; 
RL Submitted (APR-1998) to the EMBL/GenBank/DDBJ databases. 
DR EMBL; AF060570; AAD11628.1; -. 
DR HSSP; P56276; 1TLK. 
DR MGD; MGI:1343102; Rbig1. 
DR INTERPRO; IPR001777; -. 
DR INTERPRO; IPR003006; -. 
DR PFAM; PF00041; fn3; 3. 
DR PFAM; PF00047; ig; 5. 
SQ SEQUENCE 1344 AA; 143439 MW; 8B0060341C49CFEA CRC64; 



Query Match 18.8%; Score 1395.5; DB 11; Length 1344; 
Best Local Similarity 29.1%; Pred. No. 2.3e-85; 
Matches 415; Conservative 1 10; Mismatches 560; Indels 273; Gaps 



38 LVLVASNGLP- -AVRGQYQ SPRIIEHPTDLWKRNEPATLNCKVEG 81 

II: III I: I : IIM I |||| : JIM |: I 

8 LTLQSKPGLPPVALPGYLELPSSPGSRVGPEDAMPRIVEQPPDLWSRGEPATLPCRAEG 67 

82 KPEPTIEWFKDGEPVST - ■ NEKKSHRVQFKDGALPFYRTMQGKKEQ ■ DGGEYWCVAKNRV 138 

:| I 1 1 1 : 1 : 1 hi : ::||: lllll I : |:: : I I I MIM : 
68 RPRPNIEWYKNGARVATMEDPRAHRLLLPSGALFFPRIVHGRRSRPDEGVYTCVARNYL 127 

139 GQAVSRHASLQIAVLRDDFRVEPRDTRVARGETALLECGPPRGIPEPTIIWIRDGVPLDD 198 

I I I : II II |::|| til Ml : I I : I : 

128 GAMSRNASLEVAVLRDDFRQSPGNVWAVGEPAVMECVPPRGHPEPLVTWRRGRIRLRE 187 

199 LKAMSFGASSRVRIVDGGNLLISNVEPIDEGNYKCIAQNLVGTRESSYAKLIVQVKPYFM 258 

I: I II l::|: I I I |:| |: I III hhl :l h 
188 EEGRITI-RGGKLMMSHTFKSDAGMYMCVASNMAGERESGAAELWLERPSFL 239 

259 KEPKDQVMLYGQTATFHCSVGGDPPPKVLWKKEEGNIPVSRARILHDEKSLEISNITPTD 318 

: I :ll:l I I I III I : hl::| :| I I I HI I :: I 
240 RRPINQWLADAPVNFLCEVQGDPQPNLHWRKDDGELPAGRYEIRSDH-SLWIDQVSSED 298 

319 EGTYVCEAHNNVGQISARASLIVHAPPNFTKRPSNKKVGLNGWQLPCMASGNPPPSVFW 378 

Nil I I |:||: I II II II I :| I: I II lllll::|l 
299 EGTYTCVAENSVGRAEASGSLSVHVPPQFVTRPQNQTVAPGANVSFQCETRGNPPPAIFW 358 

379 TKEGVS1LMFPNSS - - -HGRQYVAADGTLQITDVRQEDEGYYVCSAFSWDSSTVRVFLQ 435 

III |:||: I II I: I I I h I : I lllll I II I : |: 
359 QKEGSQVLLFPSQSLQPMGRLLVSPRGQLNITEVKIGDGGYYVCQAVSVAGSILAKALLE 418 

436 V--SSVDERPPPIIQIGPANQTLPKGSVATLPCRATGNPSPRIKWFHDGHAVQA-GNRYS 492 

: :hl II hi lllllll II IN III I |:| I ;l :::: 
419 IKGASIDGLPPIILQ-GPANQTLVLGSSVWLPCRVIGNPQPNIQWKKDERWLQGDDSQFN 477 

493 IIQGSSLRVDDLQLSDSGTYTCTASGERGETSWAATLTVERPGSTSLHRAADPSTYPAPP 552 

:: :| : :| I I |:| I II :| : I " lllllll 
478 LMDNGTLHIASIQEMDMGFYSCVMSSIGEATWNSWLRRQEDWGASPGPATGPSNPPGPP 537 

553 GTPRVLNVSRTSISLRWARSQEKPGAVGPIIGYTVEYFSPDLQTGWIVAAHRVGDTQVTI 612 

I I I: 11:1 I I : II Ml III II 

538 SQPIVTEVTANSITLTW-KPNPQSGATA--TSYVIEAFSQAAGNTKRTVADGVQLETYTI 594 

613 SGLTPGTSYVFLVRAENTQGISVPSGLSPIKTIEADFDAASANDLSAARTLLTGKSVEL 672 

III I I hlllll hi II : ;:l :: : | | : | :| : 
595 SGLQPNTIYLFLVRAVGAWGLSEPSPVSEPVQTQDSSL-SRPAEDPWRGQRGLAEVAVRM 653 



Qy 673 IDASAINASAVRLEWMLHVSADEKYVEGLRIHYRDASVPSAQYHSITVMDASAESFWGN 732 

: : : | | : |:| |: :: | : : : : :| |: 

Db 654 QEPTVLGPRTLQVSWT - -VDGPVQLVQGFRVSWRIAGLDQGSWTMLDLQSPHRQSTVLRG 711 

Qy 733 LKRYTRYEFFLTPFFETIEGQPSNSRTALTYEDVPSAPPDNIQI--GMYNQTAGWVRWTP 790 

I : : : : I I I hill : : I :: I I I 
Db 712 LPPGAQIQIRVQVQGQEGLGAESPFVTRSIPEEAPSGPPQGVAVALGGDRNSSVTVSWEP 771 

Qy 791 PPSQHHNGNLYGYRIEVSAGNTMRVLANMTLNATTTSVLLNNLTTGAVYSVRLNSFTRAG 850 

I II : : hi II : I : II : I I :| : : I II 
Db 772 PLPSQRNGVITEYQI-WCLGNESRFHLNRSAAGWARSVTFSGLLPGQIYRALVAAATSAG 830 

Qy 851 DGPYSRPISLFMDPTHHVHPPRAHPSGTHDGRHEGQDLTYHNNGNIPPGDINPTTHKRTT 910 

I I I: :-: II I I : = 

Db 831 VGVASAPVLVQLP FPPAAEPG PEVSEGLAERLA 863 

Qy 911 DYLSGPWLMV^ * -LVCIVLLVLVISAAISMVYFRRRHQMTRELGHLSWSDNEITALNIN 967 

II:: I II: :| I: : III I : I l = : 

Db 864 KVLRKPAFLAGSSAACGALLLGFCAA LYRRQKQRRELSHYT-ASFAYTPAVSFP 916 

Qy 968 SRESL WI DHHRGW - ■ • RTADTDKDSGLSES 994 

I I - I: II I I I :|: 

Db 917 HSEGLSGSSSRPPMGLGPAAYPWLADSWPHPPRSPSAQEPRGSCCPSNPDPD'DRYYNEA 975 

Qy 995 RLLSHVNSSQSNYNNSDGGTDYAEVDT-RNLTTFYNCRRSPDNPTPYATTMIIGTSSSE 1052 

: :: : I I I I: : III: : I : II 

Db 976 GISLYLAQTARGANASGEGPVYSTIDPVGEELQTFHGGFPQHSSGDPSTWSQYAPPEWSE 1035 

Qy 1053 TCTRTTSISADRDSGTHSPYSDAFAGQVPAVPWRSNYLQYPVE--PINWSEFLPPPPEH 1110 

: III I II I II: ::l I III 
Db 1036 -,-GDSG ARGGQ GRLLGRPVQMPSLSWPEALPP 1065 

Qy 1111 PPPSSTYGYAQGSPESSRKSSKSAGSGIS--TNQSILNASIHSSSSGGFSAWGVSPQYAV 1168 

llll :! II I I :l I : lllll : 

Db 1066 PPPSCELSCPEG-PEEELRGSSDLEEWCPPVPERSHL---VGSSSSGA CM 1111 

Qy 1169 ACPPENVYSNPLSAVAGGTQNRYQITPTNQHPPQLPAYF ATTGPGGAVPPN 1219 

I :| I: I I: :||: IIM II 
Db 1112 VAPAPRDTPSPTSSY • "GQQSTATLTPSPPDPPQPPTDIPHLHQMPRRVPLGPSS 1164 

Qy 1220 HLPFATQRHAASEYQ— AGLNAARCAQSRACNSCDALATPSPMQ 1261 

I : : I I = II I MM: 
Db 1165 • • PLSVSQPALSSHDGRP VGLGAGPVLS YH ASPSPVPSTASSAPGRTRQVTG 1214 

Qy 1262 •' PPPPVPVPE GWYQPVHPNS 1280 

MUM II MM: 
Db 1215 EMTPPLHGHRARIRRRPRALPYRREHSPGDLPPPPLPPPELRDRLALGSAGSRQHVFPRA 1274 

Qy 1281 HPMHPTSSNHQIYQCSSECSDHSRSSQSHKRQLQLEEHGSSAKQRGGH 1328 

'llll I :: III I 

Db 1275 RA ---QWGEESGAGSAS RGPTSSQRGPH 1299 



RESULT 11 
P91572 

ID P91572 PRELIMINARY; PRT; 423 AA.
AC P91572;
DT 01-MAY-1997 (TrEMBLrel. 03, Created)
DT 01-MAY-1997 (TrEMBLrel. 03, Last sequence update)
DT 01-MAY-2000 (TrEMBLrel. 13, Last annotation update)
DE SIMILAR TO THE IMMUNOGLOBULIN SUPERFAMILY.
GN ZK377.3.
OS Caenorhabditis elegans.
OC Eukaryota; Metazoa; Nematoda; Chromadorea; Rhabditida; Rhabditoidea;
OC Rhabditidae; Peloderinae; Caenorhabditis.
OX NCBI_TaxID=6239;
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=BRISTOL N2;
RX MEDLINE=94150718; PubMed=7906398;
RA Wilson R., Ainscough R., Anderson K., Baynes C., Berks M.,
RA Bonfield J., Burton J., Connell M., Copsey T., Cooper J., Coulson A.,
RA Craxton M., Dear S., Du Z., Durbin R., Favello A., Fulton L.,
RA Gardner A., Green P., Hawkins T., Hillier L., Jier M., Johnston L.,
RA Jones M., Kershaw J., Kirsten J., Laister N., Latreille P.,
RA Lightning J., Lloyd C., Mcmurray A., Mortimore B., O'Callaghan M.,
RA Parsons J., Percy C., Rifken L., Roopra A., Saunders D., Shownkeen R.,
RA Smaldon N., Smith A., Sonnhammer E., Staden R., Sulston J.,
RA Thierry-Mieg J., Thomas K., Vaudin M., Vaughan K., Waterston R.,
RA Watson A., Weinstock L., Wilkinson-Sproat J., Wohldman P.;
RT "2.2 Mb of contiguous nucleotide sequence from chromosome III of C.
RT elegans.";
RL Nature 368:32-38(1994).
RN [2]
RP SEQUENCE FROM N.A.
RC STRAIN=BRISTOL N2;
RA Nhan M., Hawkins J.;
RL Submitted (FEB-1997) to the EMBL/GenBank/DDBJ databases.
RN [3]
RP SEQUENCE FROM N.A.
RC STRAIN=BRISTOL N2;
RA Waterston R.;
RL Submitted (FEB-1997) to the EMBL/GenBank/DDBJ databases.
RN [4]
RP SEQUENCE FROM N.A.
RC STRAIN=BRISTOL N2;
RA Waterston R.;
RL Submitted (APR-1997) to the EMBL/GenBank/DDBJ databases.
DR EMBL; U88183; AAB52658.1; -.
DR HSSP; P56276; 1TLK.
DR INTERPRO; IPR003006; -.
DR PFAM; PF00047; ig; 4.
SQ SEQUENCE 423 AA; 46544 MW; EB4530DB6BD575E5 CRC64;



Query Match 11.1%; Score 826.5; DB 5; Length 423; 

Best Local Similarity 42.8%; Pred. No. 9.5e-48; 

Matches 167; Conservative 61; Mismatches 147; Indels 15; Gaps 6; 

Qy 55 SPRIIEHPTDLWKKNEPATLNCKVEGKPEPT-IEWFKDGEPVSTNEKK-SHRVQFKDG 111

: Nil! |:|| : MINI || I |:|||:|| ||::: II: I
Db 29 APVIIEHPIDVWSRGSPATLNC ■ -GAKPSTAKITWYKDGQPVITNKEQVNSHRIVLDTG 86

Qy 112 ALFFYRTMQGK-REQDGGEYWCVAKNRVGQAVSRHASLQIAVLRDDFRVEPKDTRVARG 169

:|| : 111:11 hill I h I 1 1 : : 1 : 1 1 : 1 1 1 1 |: : I
Db 87 SLFLLKVNSGKNGKDSDAGAYYCVASNEHGEVKSNEGSLKLAMLREDFRVRPRTVQALGG 146

Qy 170 ETALLECGPPKGIPEPTLIWIKDGVPLDDLKAMSFGASSRVRIVDGGNLLISNVEPIDEG 229

I hill 11:1 111:111 I : I : llhl h I I
Db 147 EMAVLECSPPRGFPEPWSWRKDD KELRIQDMPRYTLHSDGNLI IDPVDRSDSG 200

Qy 230 NYKCIAQNLVGTRESSYAKLIVQVKPYFMKEPKDQVMLYGQTATFHCSVGGDPPPKVLWR 289

hhl 1:11 I h M I II I :|lll : I I I I III h: II
Db 201 TYQCVANNMVGERVSNPARLSVFEKPKFEQEPKDMTVDVGAAVLFDCRVTGDPQPQITWK 260

Qy 290 KEEGNIPVSRARILHDEKSLEISNITPTDEGTYVCEAHNNVGQISARASLIVHAPPNFTK 349

:: :lhll I I : II : hill III I I I : I I I I llhl
Db 261 RKNEPMPVTRAYIAKDNRGLRIERVQPSDEGEYVCYARNPAGTLEASABLRVQAPPSFQT 320

Qy 350 RPSNKKVGLNG^QLPCMASGNPPPSVFWTKEGVSTLMFPN-SSHGRQYVAADGTLQIT 407 

II I III: ihlll |:||: h II h III I 
Db 321 KPADQSVPAGGTATFECTLVGQPSPAYFWSKEGQQDLLFPSYVSADGRTKVSPTGTLTIE 380 

Qy 408 DVRQEDEGYYVCSAFSWDSSTVRVFLQVS 437 

:MI III III: : II : h : 
Db 381 EVRQVDEGAYVCAGMNSAGSSLSKAALKAT 410 



RESULT 12 
O01632
ID O01632 PRELIMINARY; PRT; 874 AA.
AC O01632;
DT 01-JUL-1997 (TrEMBLrel. 04, Created)
DT 01-JUL-1997 (TrEMBLrel. 04, Last sequence update)
DT 01-JUN-2000 (TrEMBLrel. 14, Last annotation update)
DE CODED FOR BY C. ELEGANS CDNA CEESC12R.
GN ZK377.2.
OS Caenorhabditis elegans.
OC Eukaryota; Metazoa; Nematoda; Chromadorea; Rhabditida; Rhabditoidea;
OC Rhabditidae; Peloderinae; Caenorhabditis.
OX NCBI_TaxID=6239;
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=BRISTOL N2;
RX MEDLINE=94150718; PubMed=7906398;
RA Wilson R., Ainscough R., Anderson K., Baynes C., Berks M.,
RA Bonfield J., Burton J., Connell M., Copsey T., Cooper J., Coulson A.,
RA Craxton M., Dear S., Du Z., Durbin R., Favello A., Fulton L.,
RA Gardner A., Green P., Hawkins T., Hillier L., Jier M., Johnston L.,
RA Jones M., Kershaw J., Kirsten J., Laister N., Latreille P.,
RA Lightning J., Lloyd C., Mcmurray A., Mortimore B., O'Callaghan M.,
RA Parsons J., Percy C., Rifken L., Roopra A., Saunders D., Shownkeen R.,
RA Smaldon N., Smith A., Sonnhammer E., Staden R., Sulston J.,
RA Thierry-Mieg J., Thomas K., Vaudin M., Vaughan K., Waterston R.,
RA Watson A., Weinstock L., Wilkinson-Sproat J., Wohldman P.;
RT "2.2 Mb of contiguous nucleotide sequence from chromosome III of C.
RT elegans.";
RL Nature 368:32-38(1994).
RN [2]
RP SEQUENCE FROM N.A.
RC STRAIN=BRISTOL N2;
RA Nhan M., Hawkins J.;
RL Submitted (FEB-1997) to the EMBL/GenBank/DDBJ databases.
RN [3]
RP SEQUENCE FROM N.A.
RC STRAIN=BRISTOL N2;
RA Waterston R.;
RL Submitted (FEB-1997) to the EMBL/GenBank/DDBJ databases.
RN [4]
RP SEQUENCE FROM N.A.
RC STRAIN=BRISTOL N2;
RA Waterston R.;
RL Submitted (APR-1997) to the EMBL/GenBank/DDBJ databases.
DR EMBL; U88183; AAB52657.1; -.
DR HSSP; P56276; 1TLK.
DR INTERPRO; IPR001777; -.
DR INTERPRO; IPR003006; -.
DR PFAM; PF00041; fn3; 3.
DR PFAM; PF00047; ig; 1.
DR PRINTS; PR00014; FNTYPEIII.
SQ SEQUENCE 874 AA; 95861 MW; BC72270818D734C9 CRC64;

Query Match 10.6%; Score 790; DB 5; Length 874;

Best Local Similarity 27.3%; Pred. No. 7.9e-45;

Matches 254; Conservative 133; Mismatches 377; Indels 166; Gaps 34;

Qy 442 RPPPIIQIGPANQTLPKGSVATLPCRATGNPSPRIKWFHDGHAVQ-AGNRYSIIQGSSLR 500 

:IM I: I illll II I MM: |: III: :| I II 
Db 27 KPPPTIEHGHQNQTLMVGSSAILPCQASGKPTPGISWLRDGLPIDITDSRISQHSTGSLH 86 

Qy 501 VDDLQLSDSGTYTCTASGERGETSWAATLTVEKPGSTS-LHRAADPSTYPAPPGTPKVLN 559 

: lh hhlll I I lh:hhim I : I III :h I I ::| 
Db 87 IADLKKPDTGVYTCIAKNEDGESTWSASLTVEDHTSNAQFVRMPDPSNFPSSPTQPIIVN 146 

Qy 560 VSRTSISLRWAKSQEKPGAVGPIIGYTVEYFSPDLQTGWIVAAHRVGDTQVTISGLTPGT 619 

I: I : I h : III II ::hll!l I I h I II I 
Db 147 VTDTEVELHWr-NAPSTSGAGPITGYIIQYYSPDLGQTWFNIPDYVASTEYRIKGLKPSH 204 

Qy 620 SYVFLVRAENTQGISVPSGLSNVIKTIEADFDAASAN— -DLSAARTLLTGKS-VELID 674 

lh|::|lll':|l II I :: I : I :: I:: I II : ::| : 
Db 205 SYMFVIRAENEKGIGTPSVSSALVTTSKPAAQVALSDKNKMDMAIAEKRLTSEQLIKLEE 264 

Qy 675 ASAINASAVRLEWMLHVSADEKYVEGLRIHYK-DASVPSAQYHSITVMDASAESFWGNL 733 

l|::|||| I I: ::l I :: II : I I h:ll II 

Db 265 VKTINSTAVRLFWKKRKL- -EELIDGYYIKWRGPPRTNDNQY- -VNVTSPSTENYWSNL 320 

Qy 734 KKYTKYEFFLTPF- • -FETIEGQPSNSKTALTYEDVPSAPPDNIQIGMYNQTAGWVRWTP 790 

:| I M I : ' I : :l I III! II I' II lh:::l III : I 
Db 321 MPFTNYEFFVIPYHSGVHSIHGAPSNSMDVLTAEAPPSLPPEDVRIRMLNLTTLRISWKA 380

Qy 791 PPSQHHNGNLYGYRIEVSAGNTMKVLAMLNATTTSVLLNNLTTGAVYSVRLNSFTRAG 850 

I : III h:l II Ml I I :| I I :|: : : I 
Db 381 PKADGINGILKGFQI-VIVGQAPNNNRNITTNERAASVTLFHLVTGMTYKIRVAARSNGG 439 

Qy 851 DGPYSKPISLFMD-PTHHVHPPRAHPSGTHDGRHEGQDLTYH-NNGNIPPGDINPTTHK 907 

I : h I I : | : I" | ::| 
Db 440 VGVSHGTSEVIMNQDTLERHLA AQQENESFLYGLINRSHVP 480 

Qy 908 KTTDYLSGPWLMVLVCIVLLVLVISAAISMVYFKRKHQMTKELGHLSWSDNEITALNIN 967

1:1 : :|:: : |: |:: : ::| : : I
Db 481 VIVIVAILIIFWIIIAYCYWRNSRNSDGKDRSFIKINDGSVHMASNN 528

Qy 968 SKESLW IDHHRGWRTADTDRDSGLSESRLLSHVNSSQSNYNNSD--GGT 1014

II : : : II : : I I ::| :|| I II
Db 529 — -LWDVAQNPNQNPMYNTAGRMTMNNRNGQALYSiTPNAQDFFNNCDDYSGTMHRPGS 584

Qy 1015 • • • -DYAEV- - "DTRNLTTFYNCRRSPDNPTPYATTMIIGTSSSETCTKTTSISADRDSG 1067

II:: ::lll : 1 : 1 : 1 1 M I ::
Db 585 EHHYHYAQLTGGPGNAMSTFYG-NQYHDDPSPYATTTLV 622

Qy 1068 THSPYSDAFAGQVPAVPVVRSNYLQYPVEPINWSEFLPPPPEHPPPSSTYGYAQGSPESS 1127

: III : hill I llllh : I I
Db 623 LSNQQPA- -WLNDKMLRAPAMPTN PVPPE ■ * PPARYADHT AG * - RRS 663



Qy 1128 RKSSRSAGSGISTNQSILNASIHSSSSGGFSAWGVSPQYAVACPPENVYSNPLSAVAGGT 1187 

I I I I I || :| :|| : I H : II 

Db 664 RSSRASDGRG TLNGGLHHRTSGSQRS DSPPHTDVSYVQLHSSDGI 708 

Qy 1188 QNRYQITPTNQHPPQLPAY-FATTGPGGAVPP-NHL-PFATQRHAASEYQAGLNAARCAQ 1244 

: : I : II llllh Ihl II : 

Db 709 GSSRERTGERRTPPNKTLMDFIPPPPSNPPPPGGHVYDTATRRQ LNRGSTPR 760 

Qy 1245 SRACNSCDALATPSPMQPPPPVPVPEGWYQPVHPNSHPMHPTSSNHQIYQCSSECSDHSR 1304 

:| ' | :| : | |: ||| | :{ | 
Db 761 EDTYDS VSDGAFARVDVNA— RPTSRNRNL— GGRPLRGKR 797 

Qy 1305 SSQSHKRQLQLEEHGSSAKQRGGHHRRRAP 1334 

I : I ::: I I:: I : I 
Db 798 DDDSQRSSLMMDDDGGSSEADGENSEGDVP 827 



RESULT 13
P97603
ID P97603 PRELIMINARY; PRT; 1377 AA.
AC P97603;
DT 01-MAY-1997 (TrEMBLrel. 03, Created)
DT 01-MAY-1997 (TrEMBLrel. 03, Last sequence update)
DT 01-OCT-2000 (TrEMBLrel. 15, Last annotation update)
DE NEOGENIN (FRAGMENT).
OS Rattus norvegicus (Rat).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Rattus.
OX NCBI_TaxID=10116;
RN [1]
RP SEQUENCE FROM N.A.
RX MEDLINE=97015074; PubMed=8861902;
RA Keino-Masu K., Masu M., Hinck L., Leonardo E.D., Chan S.S.Y.,
RA Culotti J.G., Tessier-Lavigne M.;
RT "Deleted in Colorectal Cancer (DCC) encodes a netrin receptor.";
RL Cell 87:175-185(1996).
DR EMBL; U68726; AAB41100.1; -.
DR HSSP; P56276; 1TLK.
DR INTERPRO; IPR000531; -.
DR INTERPRO; IPR001777; -.
DR INTERPRO; IPR003006; -.
DR PFAM; PF00041; fn3; 6.
DR PFAM; PF00047; ig; 4.
DR PRINTS; PR00014; FNTYPEIII.
DR PROSITE; PS00430; TONB_DEPENDENT_REC_1; UNKNOWN_1.
FT NON_TER 1 1
SQ SEQUENCE 1377 AA; 150637 MW; E514ED8ABD1A63A9 CRC64;

Query Match 10.2%; Score 759.5; DB 11; Length 1377;

Best Local Similarity 23.1%; Pred. No. 1.8e-42;

Matches 328; Conservative 179; Mismatches 544; Indels 371; Gaps 55;

Qy 157 FRVEPKDTRVAKGETALLECGPPKGIPEPTLIWIKDGVPLDDLKAMSFGASSRVRIVDGG 216 

I III II |:| : :| I I I : I III :: : I ::: I 

Db 24 FLVEPVDTLSVRGSSVILNCS-AYSEPSPNIEWKRDGT FLNLVSDDRRQLLPDG 76 

Qy 217 NLLISNV— :— EPIDEGNYKCIA-QNLVGTRESSYAKLIVQVKPYFMKEPKDQVMLY 268 

:! Mil ; :! Ill I : I : I II II I III I II :|: : 
Db 77 SLFISNWHSKHNKP-DEGFYQCVATVDNL-GTIVSRTAKLAVAGLPRFTSQPEPSSIYV 134 

Qy 269 GQTATFHCSVGGDPPPRVLWKKEEGNIPVSRARILHDEK SLEISNITPTDEGT 321 

I : :l I I I I !:: :| :| |:: :| III I III 

Db 135 GNSGILNCEVHADLVPFVRWEQ NRQPLLLDDRIVKLPSGTLVISNATEGDEGL 187 

Qy 322 YVC ' EAHNNVGQISARASLIVHAPPNFTRRPSN- -KKVGLNGWQLPCM 367 

II II I I :| I III: I :| : I llh 

Db 188 YRCIVESGGPPKFSDEAELKVLQESEEMLDLV FLMRPSSMIKVIGQSAV-LPCV 240 

Qy 368 ASGNPPPSVFWTREGVSTLMFPNSSHGRQYVAADGTLQITDVRQEDEGYYVCSAFSWDS 427 

III I I : I I : : I II : I hhhll ::| I I I I h 

Db 241 ASGLPAPVIRWMR- - -NEDVLDTESSGRLALLAGGSLEISDVTEDDAGTYFC — VADN 293 

Qy 428 STVRVFLQVSSVDERPPPIIQIGPANQTLPKGSVATLPCRATGNPSPRIKWFHDGHAVQA 487 

: I , : II :; Ml : III hi = 11 :l I 
Db 294 GNKTIEAQAELTVQVPPEFLR-QPANIYARESMDIVFECEVTGKPAPTVRWVKNGDWIP 352 

Qy 488 GNRYSIIQGSSLRVDDLQLSDSGTYTCTASGERGETSWAATLTVEKPGSTSLHRAADPST 547 

: : !:: ;hl I II I I I I : I II:: II 
Db 353 SDYFRIVREHIILQVLGLVRSDEGFYQCIAENDVGNAQAGAQLIILE HAPATTGP 405 

Qy 548 YPAPPGTPKV7jNVSRTSISLRWAKSQERPGAVGPIIGYTVEYFSPDLQTGWIVAAHRVG3 £07 

hi MINI I I : hi I : : : h 
Db 407 LPSAPRDWASLVSTRFIRLTWRTPASDPH--GDNLTYSVFYTKEGVARERVENTSQPGE 45 i 

Qy 608 TQVTISGLTPGTSYVFLVRAENTQGISVPSGLSNVIRTIEADFDAASANDLSAARTLLTG 667 

llll I I I hi I hi I II h :| I 
Db 465 MQVT IQNLMPATVY IFKVMAQNKHG — SGESSAPLRVE TQ 502 

Qy 668 KSVEL •IDASAINASAVRLEWMLHVSADERYVEGLRIHYKDASVPSAQYHSITVM 721 

hi III I :| : : I : I : 

Db 503 PEVQLPGPAPNIRAYATSPTSITVTKETPLSGNGE'IQNYRLYYMEKGTDREQ DV 556 

Qy 722 DASAESFWGNLRRYTRYEFFLTPFFETIEGQPSNSKTALTYEDVPSAPPDNIQIGMYNQ 781 

I h I: : Mlllhl I : : : I : I III I h : : I 
Db 557 DVSSHSYTINGLKKYTEYSFRWAYNKHGPGVSTQDVAVRTLSDVPSAAPQNLSLEVRNS 616 

Qy 782 TAGWVRWTPPPSQHHNGNLYGYRICTSAGNTMKVLANMTLNATTTSVLLNNLTTGAVYSV 841 

: : I ll'l II : llll : : : I I h I I h 
Db 617 KSIVIHWQPPSSATQNGQITGYKIRYRRASRRSDVTETWTGTQLSQLIEGLDRGTEYNF 676 

Qy 842 RLNSFTRAGDGPYSKPIS--LFMDPTHHVHPPRAHPSGTHDGRHEGQDLTYHNNGNIPPG 899 

I: : I I II : :| I llll 
Db 677 RVAALTVNGTGPATDWLSAETFESDLDESRVPEV-PSSLH 715 

Qy 900 DINPTTHKRTTDYLSGPWLMVLVCIVLLVLVISAAISMVYFKRKHQMTK- - ELGHLSWS 957 

: I || | :| : :: : : :|: 
Db 716 -VRP : LVTSIWSKTPPENQNIWRGYAIGY 744 

Qy 958 DNEITALNINSRESLWIDHHRGWRTADTDKDSGLSESKLLSHVNSSQSNYNNSDGG-— 1013 

: ::: :|: : : I : I I 

Db 745 GIGSPHAQTIRVDYKQRYYTIENLDPS SHYVITLKAFNNVGEGIPLY 791 

Qy 1014 TDYAEVDTRNLTTFYNCRRSPDNPTPYATTMI - - IGTSSSETCTRTTSIS -A 1062 

ll.:MI : :1 I I I I: :| : Ihl 

Db 792 ESAVTRPHTDTSEVDLFVI NAP YT PVPDPT PMMPPVGVQAS ILSHDTIRITWA 844 

Qy 1063 DRDSGTHSPYSDA FAGQVPAVPWKS NYLQYPVEPINWSEF 1103 

I I ,:h : :|| h :|| ::| II 
Db 845 DNSLPKHQRITDSRYYTVRWKTNIPANTKYRNANATTLSYLVTGLKPNTLYEFSVMVTKG 904 

Qy 1104 LPPPPEHPPPSSTYGYAQGSPES SRKSSKSAGSGI- -S 1139

INI :| I : I : I I I I 

Db 905 RRSSTWSMTAHGATFELVPTSPPKDVWSKEGKPRTIIVNWQPPSEANGKITGYIIYYS 964 

Qy 1140 TNQSILNASIHSSSSGGFSAWGVSP QYAVACPPENVYSNPLSA 1182 

I: :ll II I : I I II: 

Db 965 TD— VNABIHD WVIEPWGNRLTHQIQELTLDTPYYFKIQARN--SKGMGP 1011 

Qy 1183 VAGGTQNRYQ ITPTNQHPPQLP - - AYFATTGPGGAVP PNHLP — 1222 

I I II ::| : I II :| I: I 

Db 1012 MSEAVQFR— TPRADSSDKMPNDQALGSAGKGGRLPDLGSDYKPPMSGSNSPHGSPTSP 1068 



Qy 1223 FATQRHAASEYQAGINAARCAQSRACNS- - ■ 1250

11:1::: : II I
Db 1069 LDSNMLLVIIVSIGVITIVWVIIAVFCTRRTTSHQKK KRAACKSVNGSHK 1119

Qy 1251 -CDALATPS PMQP— -PPPVPVPEGWYQPVHPNSHPMHPTSSNHQIYQ 1294

I : I ::| II II h II * I
Db 1120 YKGNCKDVKPPDLWIHHERLELKPIDKSPDPNPVMTD--TPIPRNSQDIJPV 1169



Qy 1295 CSSECSDHSRSSQSHRRQLQLEEHGSS AKQRGGHHRRRAPV-VQPCMESENENM 1347 

1:1 I 1:1: I I :|| : I II :| 

Db 1170 DNSMD8NIHQRRNSYRGHESEDSMSTLAGRRGMRPKMMMPFDSQPPQQSVRNTP 1223 

Qy 1348 LAEYEQRQYTSDCCNSSR--EGDTCSCSEGSCLYAEAGEPAP 1387 

: : II : II I I I ::|: I 
Db 1224 STDTMPASSSQTCCTDHQDPEGATSSSYLASSQEEDSGQSLP 1265 



RESULT 14 
P97798
ID P97798 PRELIMINARY; PRT; 1493 AA.
AC P97798;
DT 01-MAY-1997 (TrEMBLrel. 03, Created)
DT 01-MAY-1997 (TrEMBLrel. 03, Last sequence update)
DT 01-OCT-2000 (TrEMBLrel. 15, Last annotation update)
DE NEOGENIN (NEOGENIN PROTEIN).
GN NEO1.
OS Mus musculus (Mouse).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
OX NCBI_TaxID=10090;
RN [1]
RP SEQUENCE FROM N.A.
RX MEDLINE=97407661; PubMed=9264410;
RA Keeling S.L., Gad J.M., Cooper H.M.;
RT "Mouse Neogenin, a DCC-like molecule, has four splice variants and is
RT expressed widely in the adult mouse and during embryogenesis.";
RL Oncogene 15:691-700(1997).
DR EMBL; Y09535; CAA70727.1; -.
DR HSSP; P02751; 1TTF.
DR MGD; MGI:1097159; Neo1.
DR INTERPRO; IPR000531; -.
DR INTERPRO; IPR001777; -.
DR INTERPRO; IPR003006; -.
DR PFAM; PF00041; fn3; 6.
DR PFAM; PF00047; ig; 4.
DR PRINTS; PR00014; FNTYPEIII.
DR PROSITE; PS00430; TONB_DEPENDENT_REC_1; UNKNOWN_1.
SQ SEQUENCE 1493 AA; 163159 MW; 441DE919D5E17C0E CRC64;

Query Match 9.8%; Score 731.5; DB 11; Length 1493;

Best Local Similarity 22.6%; Pred. No. 1.5e-40;

Matches 325; Conservative 181; Mismatches 538; Indels 395; Gaps 57;

Qy 124 EQDGGEYWCVAKNR VGQAVSRHASLQIAVLRDD 156 

I:: I I : :| :|: I |: : I 

Db 4 EREAGRLLCTSSSRRCCPPPPLLLLLPLLLLLGRPASGAAATKSGPRRQSQGASVRTFTP 63 

Qy 157 - - FRVEPKDTRVAKGETALLECGPPKGIPEPTLIWIKDGVPLDDLKAMSFGASSRVRIVD 214 

I III II :| : :| I I I : I III :: : I ::: 



Db 64 FYFLVEPVDTLSVRGSSVILNCS-AYSEPSPNIEWKKDGT FLNLESDDRRQLLP 116 

Qy 215 GGNLLISNV EPIDEGNYKCIA--QNLVGTRESSYAKLIVQVKPYFMKEPKDQVM 266 

1:1 Mil :l III 1:1:1 II II I III I II :|: : 
Db 117 DGSLFISNWHSKHNKP-DEGFYQCVATVDNL-GTIVSRTAKLTVAGLPRFTSQPEPSSV 174 

Qy 267 LYGQTATFHCSVGGDPPPKVLWKKEEGNIPVSRARILHDEK SLEISNITPTDE 319 

1:1 :M I II I:: :| :| |:: :| III I I 

Db 175 YVGNSAILNCEVNADLVPFVRWEQ NRQPLLLDDRIVKLPSGTLVISNATEGDG 227 

Qy 320 GTYVCEAHN-NVGQISARASLIVHAPPN FTKRPSN- - KKVGLNGWQLPCMASG 370 

III : : I I M I I III: I I : I III: II 
Db 228 GLYRCIVESGGPPKFSDEAELKVLQDPEEIVDLVFLMRPSSMMKVTGQSAV--LPCWSG 285 

Qy 371 NPPPSVFWTKEGVSTLMFPNSSHGRQYVAADGTLQITDVRQEDEGYYVCSAFSWDSSTV 430 

I II I I : : I II : I I hhll ::| I I I : I: 

Db 286 LPAPWRWMK? • -NEEVLDTESSGRLVLLAGGCLEISDVTEDDAGTYFC IADNGNK 338 

Qy 431 RVFLQVSSVDERPPPIIQIGPANQTLPKGSVATLPCRATGNPSPRIKWFHDGHAVQAGNR 490 

II : II III : I II hi :N :| I : 

Db 339 TVEAQAELTVQVPPGFLK-QPANIYAHESMDIVFECEVTGKPTPTVKWVKNGDWIPSDN 397 

Qy 491 YSIIQGSSLRVDDLQLSDSGTYTCTASGERGETSWAATLTV EKPGST 537 

: I:: :|:l I II I I II : I I I : :| 
Db 398 FKIVKEHNLQVLGLVKSDEGFYQCIAENDVGNAQAGAQLIILEHDVAIPTLPPTSLTSAT 457 

Qy 538 SLHRAADPSTYPAPPGTPKVLNVSRTS* ■ -ISLRWAKSQEKPGAVGPIIGYTVEYFSPDL 594 

: I I 1:1 I I: : I I I I I I I : |:| I : 
Db 458 TDHLA- -PATTGPLPSAPRDWASLVSTRFIKLTWRTPASDPH- -GDNLTYSVFYTKEGV 513 

Qy 595 QTGWIVAAHRVGDTQVTISGLTPGTSYVFLVRAENTQGISVPSGLSNVIKTIEADFDAAS 654 

: :.l: Mil I I I hi I hi I II h :| 
Db 514 DRERVENTSQPGEMQVTIQNLMPATVYIFKVMAQNKHG — SGESSAPLRVE 562 

Qy 655 ANDLSAARTLLTGKSVEL IDASAINASAVRLEWMLHVSADEKYVEGLRIHYKDA 708 

.1 hi I I I : ::: : I :| : : :: :::| : 
Db 563 jTQPEVQLPGPAPNIRAYATSPTSITVTWETPLSGNGE-IQNYKLYYMEK 610 

Qy 709 SVPSAQYHSITVMDASAESFWGNLKRY.TKYEFFLTPFFETIEGQPSNSKTALTYEDVPS 768 

I . :| h h : llllhl I : : : I : I Mil 

Db 611 GTDREQ -DIDVSSHSYTINGLKKYTEYSFRWAYNKHGPGVSTQDVAVRTLSDVPS 665 

Qy 769 APPDNIQIGMYNQTAGWVRWTPPPSQHHNGNLYGYKIEVSAGNTMKVLANMTLNATTTSV 828 

I M: Ml II : Mil : : : I 

Db 666 AAPQNLSLEVRNSKSIVIHWQPPSSTTQNGQITGYKIRYRKASRKSDVTETLVTGTQLSQ 725 

Qy 829 LLNNLTTGAVYSVRLNSFTKAGDGPYSKPIS--LFMDPTHHVHPPRAHPSGTHDGRHEGQ 886 

I: I I h h : I I II : :l I I III 
Db 726 LIEGLDRGTEYNFRVAALTVNGTGPATDWLSAETFESDLDETRVPEV-PSSLH 777 

Qy 887 DLTYHNNGNIPPGDINPTTHKKTTDYLSGPWLMVLVCIVLLVLVISAAISMVYFKRKHQM 946 

' : I || | :| : :: : 

Db 778 —VRP LVTSIWSWTPPENQNIV 798 

Qy 947 TK-ELGHLSWSDNEITALNINSKESLWIDHHRGWRTADTDKDSGLSESKLLSHVNSSQ 1004 

: :|: ; : ::: :|: : : I : I II : 

Db 799 VRGYAIGY-.^ GIGSPHAQTIKVDYKQRYYTIENLDPS SHYVITL 840 

Qy 1005 SNYNNSDGG- TDYAEVDTRNLTTFYNCRKSPDNPTPYATTMI - - IGTSS 1050 

:M I • II :IM : ;| | I I I: :| : 

Db 841 KAFNNVGEGIPLYESAVTRPHTDTSEVDLFVI NAPYTPVPDPTPMMPPVGVQA 893 

Qy 1051 SETCTKTTSIS - ADKDSGTHSPYSDA FAGQVPAVPWKS NYLQYPVEP 1097 

I I I': II I :h : :|l I: :|| ::l 
Db 894 SILSHDTIRrrWADNSLPKHQKITDSRYYTVRWRTNIPANTKYKNANATTLSYLVTGLKP 953 

Qy 1098 INWSEF — • LPPPPEHPPPSSTYGYAQGSPES SR 112?: 

II " I II I :| I : I 

Db 954 NTLYEFSVMVTKGRRSSTWSMTAHGATFELVPTSPPKDVTWSKEGKPRTIIVNWQPPSE 1013 

Qy 1129 KSSKSAGSGI-STNQSILNASIHSSSSGGFSAWGVSP QYAVA 1169 

I I I' I: :|| II I : I I 

Db 1014 ANGKITGYIIYYSTD — VNAEIHD WVIEPWGNRLTHQIQELTLDTPYYFK 1062 



Mon Jan 22 13:04:21 2001 



us-09-540-245a-15.rspt 



Page 1 



Qy 1170 CPPENVYSNPLSAVAGGTQNRYQITPTNQHPPQLP--AYFATTGPGGAVP 1217 

I I : f- I I II ::| : I I :| 
Db 1063 IQARN—SKGMGPMSEAVQFR — TPKADSSDKMPNDQALGSAGKGSRLPDLGSDYKPPM 1117 

Qy 1218 PNHLP FATQRHAASEYQAGLNAARCA 1243 

1:1 I |:| : : : 
Db 1118 SGSNSPHGSPTSPLDSNMLLVIIVSVGVITIVVVVVIAVFCTRRTTSHQKK 1168 

Qy 1244 QSRACNS CDALATPS PMQP — PPPVPVPEGWYQPVHPNSH 1281 

: II I 1:1 ::l I I II hi 

Db 1169 RRMCKSVNGSHKYRGNCRDVKPPDLWIHHERLELRPIDRSPDPNPVMTD--TPIPRNSQ 1226 

Qy 1282 PMHPTSSNHQIYQCSSECSDHSRSSQSHKRQLQLEEHGS • -SAKQRGGHHRRRAPWQP 1338 

: I 1:1 I 1:1: III I I :: I 

Db 1227 DITPV DNSMDSNIHQRRNSYRGHESEDSMSTLAGRRGMRPRMMMP 1271 



RESULT 15
O00340
ID O00340 PRELIMINARY; PRT; 1461 AA.
AC O00340;
DT 01-JUL-1997 (TrEMBLrel. 04, Created)
DT 01-JUL-1997 (TrEMBLrel. 04, Last sequence update)
DT 01-OCT-2000 (TrEMBLrel. 15, Last annotation update)
DE NEOGENIN.
OS Homo sapiens (Human).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
OX NCBI_TaxID=9606;
RN [1]
RP SEQUENCE FROM N.A.
RC TISSUE=BRAIN;
RX MEDLINE=97312699; PubMed=9169140;
RA Vielmetter J., Cheng X.N., Miskevich F., Lane R.P., Yamakawa K.,
RA Korenberg J.R., Dreyer W.J.;
RT "Molecular characterization of human neogenin, a DCC-related protein,
RT and the mapping of its gene (NEO1) to chromosomal position 15q22.3-
RT q23.";
RL Genomics 41:414-421(1997).
DR EMBL; U72391; AAC51287.1; -.
DR HSSP; P02751; 1TTF.
DR INTERPRO; IPR000531; -.
DR INTERPRO; IPR001777; -.
DR INTERPRO; IPR003006; -.
DR PFAM; PF00041; fn3; 6.
DR PFAM; PF00047; ig; 4.
DR PRINTS; PR00014; FNTYPEIII.
DR PROSITE; PS00430; TONB_DEPENDENT_REC_1; UNKNOWN_1.
SQ SEQUENCE 1461 AA; 160015 MW; 4AADF1EEBCAFD82C CRC64;

Query Match 9.6%; Score 712.5; DB 4; Length 1461;

Best Local Similarity 22.3%; Pred. No. 2.8e-39;

Matches 313; Conservative 184; Mismatches 564; Indels 341; Gaps 48;

Qy 157 FRVEPRDTRVARGETALLECGPPKGIPEPTLIWIRDGVPLDDLRAMSFGASSRVRIVDGG 216 

I III II :| : :| I I I : I III :: : I ::: I 

Db 55 FLVEPVDTLSVRGSSVILNCS-AYSEPSPKIEWKKDGT FLNLVSDDRRQLLPDG 107 

Qy 217 NLLISNV EPIDEGNYKCIAQ-NLVGTRESSYAKLIVQVKPYFMKEPKDQVMLYG 269 

:| llll :l III hhl :ll I UNI I I :|: : I 
Db 108 SLFISNWHSRHNKP-DEGYYQCVATVESLGTIISRTAKLIVAGLPRFTSQPEPSSVYAG 166 

Qy 270 QTATFHCSVGGDPPPKVLWKKEEGNIPVSRARILHDEK SLEISNITPTDEGTY 322 

I :l I I I I I:: H :l h: I III I III 

Db 167 NNAILNCEVNADLVPFVRWEQ NRQPLLLDDRVIRLPSGMLVISNATEGDGGLY 219 

Qy 323 VCEAHN- NVGQ ISARASLIVHAPPN FTKRPSNKKVGLNGWQLPCMASGNPPPS 375 

I : : I I I I I 1:11 : I 1 1 1 : 1 1 1 I |: 
Db 220 RCWESGGPPKYSDEVELKVLPDPEVISDLVFLKQPSPLVRVIGQDWLPCVASGLPTPT 279 

Qy 376 VFWTK--EGVSTLMFPNSSHGRQYVAADGTLQITDVRQEDEGYYVCSAFSWDSSTVRVF 433 



: I I I :,■ I I I : I hhhll ::| I I I : I: :
Db 280 IKWMKNEEALDT-- ■ - ESSERLVLLAGGSLEISDVTEDDAGT YFC - - "IADNGNETIE 330



Qy 434 LQVSSVDERPPPIIQIGPANQTLPKGSYATLPCRATGNPSPRIKWFHDGHAVQAGNRYSI 493 

I : I :: I I : I II I : I : 1 1 : 1 I : : I 

Db 331 AQAELTVQAQPEFLK-QPTNIYAHESMDIVFECEVTGKPTPTVKWVKNGDMVIPSDYFKI 389 

Qy 494 IQGSSLRVDDLQLSDSGTYTCTASGERGETSWAATLTVEKPGSTSLHRAADPSTYPAPPG 553 

:: :|:| HUM: | : : | | |: | 

Db 390 VKEHNLQVLGLVKSDEGFYQCIAENDVGNAQAGAQLIILE HAPATTGPLPSAPR 443 

Qy 554 TPKVLNVSRTSISLRWMSQERPGAVGPIIGYTVEYFSPDLQTGWIVAAHRVGDTQVTIS 613 

ll'lll I I : |:| I : : |: llll 
Db 444 DWASLVSTRFIKLTWRTPASDPH-GDNLTYSVFYTKEGIARERVENTSHPGEMQVTIQ 501 



Qy 614 GLTPGTSYVFLVRAENTQGISVPSGLSNVIKTIEADFDAASANDLSAARTLLTGKSVEL- 672

I I I M i hi I II I: :l I hi
Db 502 NLMPAT VY I FRVMAQNKHG — SGESSAPLRVE TQPEVQLP 539



Qy 673 IDASAI NASAVRLEWMLHVSADEKYVEGLR I HYKDASVPSAQYHS ITVKDAS AES 727 

: I I'-: ::: : I II : : :: :::| : I :| |: I 
Db 540 GPAPNLRAYAASPTSITVTWETPVSGNGE-IQNYRLYYMEKGTDREQ DVDVSSHS 593 

Qy 728 FWGNLKKYTKYEFFLTPFFETIEGQPSNSKTALTYEDVPSAPPDNIQIGMYNQTAGWVR 787 

: : INI: I : : : I : I HIM I |: : : I : : 
Db 594 YTINGLRKYTEYSFRWAYNKHGPGVSTPDVAVRTLSDVPSAAPQNLSLEVRNSKSIMIH 653 

Qy 788 WTPPPSQHHNGNLYGYKIEVSAGNTMKVLANMTLNATTTSVLLNNLTTGAVYSVRLNSFT 847 

III II : llll : : : : I I I : I I I : I : : I 
Db 654 WQPPAPATQNGQITGYKIRYRKASRKSDVTETLVSGTQLSQLIEGLDRGTEYNFRVAALT 713 

Qy 848 KAGDGPYSKPIS-LFMDPTHHVHPPRAHPSGTHDGRHEGQDLTYHNNGNIPPGDINPTT 905 

I II : ;| I I II I : I 

Db 714 INGTGPATDWLSAETFESDLDETRVPEV-PSSLH VRP" 749 

Qy 906 HKKTTDYLSGPWLMVLVCIVLLVLVISAAISMVYFRRRHQMTR--ELGHLSVVSDNEITA 963 

II I :| : :: : : :|: 
Db 750 LVTSIWSWTPPENQNIWRGYAIGY G 776 

Qy 964 LNINSKESLWIDHHRGWRTADTDRDSGLSESRLLSHVNSSQSNYNNSDGG 1013 

: ::: :|: : : I : | || : :|| I 

Db 777 IGSPHAQTIKVDYKQRYYTIENLDPS SHYVITLKAFNNVGEGIPLYESAVTR 828 

Qy 1014 -TDYAEVDTRNLTTFYNCRRSPDNPTPYATTMI-IGTSSSETCTRTTSIS-ADRDSGT 1068 

II :IH . : M I I I I: :l :l I |: II 

Db 829 PHTDTSEVDLFVI NAPYTPVPDPTPMMPPVGVQASILSHDTIRITWADNSLPK 881 

Qy 1069 HSPYSDA- - - ; ■ -FAGQVPAVPWKS NYLQYPVEPINWSEF 1103 

I :|: : :|l |: :|l ::| II 
Db 882 HQKITDSRYYTVRWKTNIPANTKYKNANATTLSYLVTGLKPNTLYEFSVMVTKGRRSSTW 941 

Qy 1104 LPPPPEHPPPSSTYGYAQGSPES SRKSSKSAGSGI-STNQSIL 1145 

-Mil :| I:: hill lh : 

Db 942 SMTAHGTTFELVPTSPPKDVTWSKEGKPRTIIVNWQPPSEANGKITGYIIYYSTD---V 998 

Qy 1146 NASIH : SSSSGGFSAWGVSPQYAV-ACPPEN 1174 

llll', : :| | : |: : 

Db 999 NAEIHDWVIEPWGNRLTHQIQELTLDTPYYFKIQARNSRGMGPMSEAVQFRTPRADSSD 1058 

Qy 1175 VYSNPLSAVAGGTQNRYQITPTNQHPPQLPAYFATTGPGGAVPPNHL 1221 

I ::■:!! :| :: II : I : I I 
Db 1059 RMPNDQASGSGGKGSRLPDLGSDYKPPMSGSNSPHGSPTSPLDSNMLLVIIVSVGVITIV 1118 

Qy 1222 PFATQRHAASEYQAGLNAARCAQSRACNSCDA — LATPSPMQPP 1253 

I hi : : : : II I : ::|| 

Db 1119 WVIIAVFCTRRTTSHQKK RRAACKSVNGSHKYKGNSKDVKPPDLWIHHER 1169 

Qy 1264 PPVPVPEGWYQPVHPNSHPMHPTSSNHQIYQCSSECSDHSRSSQSHRRQLQL 1315 

III I: II : I hi I hh 

Db 1170 LELRPIDRSPDPNPIMTDTPIPRNSQDITPV DNSMDSNIHQRRNSY 1215 

Qy 1316 EEHGSS -ARQRGGHHRRRAPV VQPCMESENENMLAEYEQRQYTSDCCNSSR 1365 

II : I HI : I II : : : I ::| : :| 
Db 1216 RGHESEDSMSTLAGRRGMRPKMMMPFDSQPPQPVISAHPIHSLDNPHHHFHSSSLASPM 1275

Qy 1366 EGDTCSCSEGSCLYAEAGEPAP 1387 

Ml I I I 
Db 1276 SHLY-HPGSPWP 1286 



Search completed: January 22, 2001, 12:51:13 
Job time: 1874 sec 



Mon Jan 22 13:04:22 2001
us-09-540-245a-16.rag



GenCore version 4.5
Copyright (c) 1993 - 2000 Compugen Ltd.

OM protein - protein search, using sw model

Run on: January 22, 2001, 12:17:03 ; Search time 233.01 Seconds
(without alignments)
202.659 Million cell updates/sec

Title: US-09-540-245A-16
Perfect score: 7272
Sequence: 1 GENPRIIEHPMDTTVPKNDP....RSLLSNSGSGTSSQPAGHNV 1381

Scoring table: BLOSUM62
Gapop 10.0 , Gapext 0.5

Searched: 268485 seqs, 34193795 residues

Total number of hits satisfying chosen parameters: 268485

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
Maximum Match 100%
Listing first 45 summaries
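
The run parameters above (sw model, gap open 10.0, gap extension 0.5) describe a Smith-Waterman local alignment with affine gap penalties. The sketch below is only an illustration of that kind of scoring, not GenCore's implementation. Two things in it are assumptions: a simple match/mismatch score stands in for the BLOSUM62 table, and the first residue of a gap is charged gap_open + gap_ext (the output does not say how GenCore charges it). The function name sw_affine and its defaults are hypothetical.

```python
def sw_affine(a, b, match=5.0, mismatch=-4.0, gap_open=10.0, gap_ext=0.5):
    """Best local alignment score of a and b with affine gaps (Gotoh recurrences).

    Assumption: match/mismatch scores replace the BLOSUM62 table, and a gap
    of length k costs gap_open + k * gap_ext.
    """
    neg = float("-inf")
    m = len(b)
    h_prev = [0.0] * (m + 1)   # H for the previous row
    f_prev = [neg] * (m + 1)   # F (alignment ends with a residue of a against a gap)
    best = 0.0
    for i in range(1, len(a) + 1):
        h_cur = [0.0] * (m + 1)
        f_cur = [neg] * (m + 1)
        e = neg                # E (alignment ends with a residue of b against a gap)
        for j in range(1, m + 1):
            # Open a new gap or extend the one already running.
            e = max(h_cur[j - 1] - gap_open - gap_ext, e - gap_ext)
            f_cur[j] = max(h_prev[j] - gap_open - gap_ext, f_prev[j] - gap_ext)
            s = match if a[i - 1] == b[j - 1] else mismatch
            # Local alignment: a cell never scores below zero.
            h_cur[j] = max(0.0, h_prev[j - 1] + s, e, f_cur[j])
            best = max(best, h_cur[j])
        h_prev, f_prev = h_cur, f_cur
    return best


# One deleted residue: 13 matches minus one gap of length 1.
print(sw_affine("ACDEFGHIKLMNPQ", "ACDEFGHKLMNPQ"))
```

Only two rows of the matrix are kept, which is enough for the score; recovering the alignment itself would need a traceback over the full matrices.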



Database :
.DAT:*
/SIDS1/gcgdata/geneseq/geneseqp/AA1981.DAT:*
/SIDS1/gcgdata/geneseq/geneseqp/AA1982.DAT:*
/SIDS1/gcgdata/geneseq/geneseqp/AA1983.DAT:*
/SIDS1/gcgdata/geneseq/geneseqp/AA1984.DAT:*
/SIDS1/gcgdata/geneseq/geneseqp/AA1985.DAT:*
/SIDS1/gcgdata/geneseq/geneseqp/AA1986.DAT:*
/SIDS1/gcgdata/geneseq/geneseqp/AA1987.DAT:*
/SIDS1/gcgdata/geneseq/geneseqp/AA1988.DAT:*
/SIDS1/gcgdata/geneseq/geneseqp/AA1989.DAT:*
/SIDS1/gcgdata/geneseq/geneseqp/AA1990.DAT:*
/SIDS1/gcgdata/geneseq/geneseqp/AA1991.DAT:*
/SIDS1/gcgdata/geneseq/geneseqp/AA1992.DAT:*
/SIDS1/gcgdata/geneseq/geneseqp/AA1993.DAT:*
/SIDS1/gcgdata/geneseq/geneseqp/AA1994.DAT:*
/SIDS1/gcgdata/geneseq/geneseqp/AA1995.DAT:*
/SIDS1/gcgdata/geneseq/geneseqp/AA1996.DAT:*
/SIDS1/gcgdata/geneseq/geneseqp/AA1997.DAT:*
/SIDS1/gcgdata/geneseq/geneseqp/AA1998.DAT:*
/SIDS1/gcgdata/geneseq/geneseqp/AA1999.DAT:*
.DAT:*

Pred. No. is the number of results predicted by chance to have a
score greater than or equal to the score of the result being printed,
and is derived by analysis of the total score distribution.

                %
              Query



No.    Score  Match  Length  DB  ID      Description
  1     7272  100.0    1381  20  Y13564  Drosophila Robo 2
  2   7256.5   99.8    1380  20  Y08402  Drosophila sp. ROB
  3   1786.5   24.6    1395  20  Y13563  Drosophila Robo 1
  4   1786.5   24.6    1395  20  Y08401  Drosophila sp. ROB
  5     1498   20.6    1651  20  Y13566  Human Robo 1 polyp
  6     1491   20.5    1649  20  Y08404  Human ROBO1 protei
  7   1344.5   18.5    1297  20  Y13565  C. elegans Robo po
  8   1344.5   18.5    1297  20  Y08403  C. elegans ROBO pr
  9     1261   17.3     753  20  W83927  Human T85 protein.
 10    557.5    7.7    1728  12  R13144  Deleted in Colorec
 11    551.5    7.6    1447  16  R68553  Deleted in colorec
 12    551.5    7.6    1447  20  Y33498  Human DCC protein.
 13    548.5    7.5    1192  19  W57900  Protein of clone C
 14    543.5    7.5    1028  19  W29667  Homo sapiens DL1S5
 15      538    7.4    1571  19  W42087  Human Down syndrom
 16      538    7.4    1910  19  W42086  Human Down syndrom
 17    536.5    7.4    1018  15  R63759  Human contactin (E
 18    536.5    7.4    1018  17  R87028  Human contactin.
 19    531.5    7.3    1496  20  W81030  Melanoma associate
 20    531.5    7.3    1496  21  Y70469  Human p53 target s
 21      524    7.2    1018  18  W06485  Rat contactin liga
 22    509.5    7.0    1299  21  Y40439  Human Nr-CAM prote
 23      509    7.0    1257  20  W74152  Human L1 cell adhe
 24    495.5    6.8    1897  21  Y81785  Human protein tyro
 25    495.5    6.8    1897  21  Y56100  LAR tyrosine phosp
 26      495    6.8    1304  19  W59994  Human neural cell
 27    493.5    6.8    1911  16  R71726  Human PTP-OB. Horn
 28    493.5    6.8    1911  18  W27225  Human protein tyro
 29    493.5    6.8    1911  20  W94027  Human protein tyro
 30      484    6.7    3117  21  Y53667  Sequence gi/332818
 31      475    6.5    1242  19  W52287  Rattus norvegicus
 32    473.5    6.5     434  20  Y13567  Human Robo 2 polyp
 33    473.5    6.5     434  20  Y08405  Human partial ROBO
 34    448.5    6.2    4412  21  Y53666  Sequence gi/101742
 35      438    6.0    1225  19  K52289  Homo sapiens cdo t
 36      431    5.9    1501  16  R72858  Rat receptor type-
 37    420.5    5.8     761  17  R92255  Neural cell adhesi
 38    415.5    5.7    1251  19  W37778  Rattus norvegicus
 39    412.5    5.7    1070  18  W08747  Human colon carcin
 40    388.5    5.3     848  21  Y88565  Human NCAM 140kD i
 41      382    5.3    1853  21  Y53668  Protein 608 sequen
 42      382    5.3    2387  21  Y53665  Mechanical stress
 43      382    5.3    2597  21  Y53664  Mechanical stress
 44    370.5    5.1    1125  19  W52288  Rattus norvegicus
 45    370.5    5.1    1139  19  W37779  Rattus norvegicus


RESULT 1
Y13564
ID Y13564 standard; Protein; 1381 AA.
XX
AC Y13564;
XX
DT 30-JUL-1999 (first entry)
XX
DE Drosophila Robo 2 polypeptide.
XX
KW Comm polypeptide; Robo polypeptide; commissureless; roundabout;
KW modulation; nerve cell function.
XX
OS Drosophila sp.
XX
PN WO9925833-A1.
XX
PD 27-MAY-1999.
XX
PF 13-NOV-1998; 98WO-US24327.
XX
PR 14-NOV-1997; 97US-0065543.
XX
PA (REGC ) UNIV CALIFORNIA.
XX
PI Goodman C, Kidd T, Mitchell KJ, Russell C, Tear G;
XX
DR WPI; 1999-338003/28.
DR N-PSDB; X55768.
XX
PT Modulation of Robo-Comm polypeptide interactions
XX
PS Disclosure; Page 34-38; 56pp; English.
XX
CC The invention relates to a method for modulating the amount of Comm
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CC (commissureless) polypeptide in contact with a cell expressing active 

CC Robo (roundabout) on its surface. The method comprises modulating the 

CC effective amount of Comm polypeptide in contact with the cell, where the 

CC amount of expressed active Robo is specifically modulated inversely with 

CC the modulation of the effective amount of Comm in contact with the cell.

CC The method is used to modulate the amount of active Robo expressed on a 

CC cell. The method can be used to screen for agents that modulate Robo:Comm 

CC interactions. This is particularly useful for modulating nerve cell 

CC function.
XX

SQ Sequence 1381 AA; 



Query Match 100.0%; Score 7272; DB 20; Length 1381; 

Best Local Similarity 100.0%; Pred. No. 0;



Matches

[Alignment rows Qy/Db 1 to 840: sequence text not recoverable]

Qy 841 TIDAASPTLVLANLTEGVMYTVGVAAGNNAGVGPYCVPATLRLDPITKRLDPFINQRDHV 900
Db 841 tidaasptlvlanltegvmytvgvaagnnagvgpycvpatlrldpitkrldpfinqrdhv 900

Qy 901 NDVLTQPWFIILLGAILAVLMLSFGAMVFVKRKHMMMKQSALNTMRGNHTSDVLKMPSLS 960
Db 901 ndvltqpwfiillgailavlmlsfgamvfvkrkhmmmkqsalntmrgnhtsdvlkmpsls 960

Qy 961 ARNGNGYWLDSSTGGMVWRPSPGGDSLEMQKDHIADYAPVCGAPGSPAGGGTSSGGSGGA 1020
Db 961 arngngywldsstggmvwrpspggdslemqkdhiadyapvcgapgspagggtssggsgga 1020

Qy 1021 GSGASGGDDIHGGHGSERNQQRYVGEYSNIPTDYAEVSSFGKAPSEYGRHGNASPAPYAT 1080
Db 1021 gsgasggddihgghgsernqqryvgeysniptdyaevssfgkapseygrhgnaspapyat 1080

Qy 1081 SSILSPHQQQQQQQPRYQQRPVPGYGLQRPMHPHYQQQQHQQQQAQQTHQQHQALQQHQQ 1140
Db 1081 ssilsphqqqqqqqpryqqrpvpgyglqrpmhphyqqqqhqqqqaqqthqqhqalqqhqq 1140

Qy 1141 LPPSNIYQQMSTTSEIYPTNTGPSRSVYSEQYYYPKDKQRHIHITENKLSNCHTYEAAPG 1200
Db 1141 lppsniyqqmsttseiyptntgpsrsvyseqyyypkdkqrhihitenklsnchtyeaapg 1200

Qy 1201 AKQSSPISSQFASVRRQQLPPNCSIGRESARFKVLNTDQGKNQQNLLDLDGSSMCYNGLA 1260
Db 1201 akqsspissqfasvrrqqlppncsigresarfkvlntdqgknqqnlldldgssmcyngla 1260

Qy 1261 DSGCGGSPSPMAMLMSHEDEHALYHTADGDLDDMERLYVKVDEQQPPQQQQQLIPLVPQH 1320
Db 1261 dsgcggspspmamlmshedehalyhtadgdlddmerlyvkvdeqqppqqqqqliplvpqh 1320

Qy 1321 PAEGHLQSWRNQSTRSSRKNGQECIKEPSELIYAPGSVASERSLLSNSGSGTSSQPAGHN 1380
Db 1321 paeghlqswrnqstrssrkngqecikepseliyapgsvasersllsnsgsgtssqpaghn 1380

Qy 1381 V 1381
        |
Db 1381 v 1381



RESULT 2 
Y08402 



ID Y08402 standard; Protein; 1380 AA.
XX
AC Y08402;
XX
DT 24-JUL-1999 (first entry)
XX
DE Drosophila sp. ROBO2 extracellular domain protein.
XX
KW ROBO1; ROBO2; roundabout; nerve guidance; human; murine; cell function;
KW cell morphology; screening assay.
XX
OS Drosophila sp.
XX
PN WO9920764-A1.
XX
PD 29-APR-1999.
XX
PF 20-OCT-1998; 98WO-US22164.
XX 

PR 14-NOV-1997; 97US-0971172. 

PR 20-OCT-1997; 97US-0062921. 
XX 

PA (REGC ) UNIV CALIFORNIA. 
XX 

PI Goodman CS, Kidd T, Mitchell KJ, Tear G; 
XX 

DR WPI; 1999-312615/26.

DR N-PSDB; X57251.
XX 

PT Robo polypeptides, a new immunoglobulin superfamily member 



PS Claim 1; Page 52-56; 80pp; English.
XX 

CC This invention describes novel Robo (roundabout) polypeptides, involved

CC in nerve guidance which have been isolated from Drosophila sp.,

CC C. elegans, human and murine samples. The products of the invention can

CC be used to raise anti-Robo antibodies, which can be used to modulate cell

CC function or morphology. The Robo polynucleotides and fragments are useful

CC as probes and primers and for production of the Robo polypeptides. The

CC probes and primers are also useful in screening assays.
XX

SQ Sequence 1380 AA; 



Query Match 99.8%; Score 7256.5; DB 20; Length 1380;

Best Local Similarity 99.9%; Pred. No. 0; 

Matches 1380; Conservative 0; Mismatches 0; Indels 1; Gaps 

Qy 1 GENPRIIEHPMDTTVPKNDPFTFNCQAEGNPTPTIQWFKDGRELKTDTGSHRIMLPAGGL 60
Db 1 genpriiehpmdttvpkndpftfncqaegnptptiqwfkdgrelktdtgshrimlpaggl 60

Qy 61 FFLKVIHSRRESDAGTYWCEAKNEFGVARSRNATLQVAVLRDEFRLEPANTRVAQGEVAL 120
Db 61 fflkvihsrresdagtywceaknefgvarsrnatlqvavlrdefrlepantrvaqgeval 120

Qy 121 MECGAPRGSPEPQISWRKNGQTLNLVGNKRIRIVDGGNLAIQEARQSDDGRYQCVVKNVV 180
Db 121 mecgaprgspepqiswrkngqtlnlvgnkririvdggnlaiqearqsddgryqcvvknvv 180

Qy 181 GTRESATAFLKVHVRPFLIRGPQNQTAVVGSSVVFQCRIGGDPLPDVLWRRTASGGNMPL 240
Db 181 gtresataflkvhvrpflirgpqnqtavvgssvvfqcriggdplpdvlwrrtasggnmpl 240

Qy 241 RKFSWLHSASGRVHVLEDRSLKLDDVTLEDMGEYTCEADNAVGGITATGILTVHAPPKFV 300
Db 241 rkfswlhsasgrvhvledrslklddvtledmgeytceadnavggitatgiltvhappkfv 300

Qy 301 IRPKNQLVEIGDEVLFECQANGHPRPTLYWSVEGNSSLLLPGYRDGRMEVTLTPEGRSVL 360
Db 301 irpknqlveigdevlfecqanghprptlywsvegnsslllpgyrdgrmevtltpegrsvl 360

Qy 361 SIARFAREDSGKVVTCNALNAVGSVSSRTVVSVDTQFELPPPIIEQGPVNQTLPVKSIVV 420
Db 361 siarfaredsgkvvtcnalnavgsvssrtvvsvdtqfelpppiieqgpvnqtlpvksivv 420

Qy 421 LPCRTLGTPVPQVSWYLDGIPIDVQEHERRNLSDAGALTISDLQRHEDEGLYTCVASNRN 480
Db 421 lpcrtlgtpvpqvswyldgipidvqeherrnlsdagaltisdlqrhedeglytcvasnrn 480

Qy 481 GKSSWSGYLRLDTPTNPNIKFFRAPELSTYPGPPGKPQMVEKGENSVTLSWTRSNKVGGS 540
Db 481 gksswsgylrldtptnpnikffrapelstypgppgkpqmvekgensvtlswtrsnkvggs 540

Qy 541 SLVGYVIEMFGKNETDGWVAVGTRVQNTTFTQTGLLPGVNYFFLIRAENSHGLSLPSPMS 600
Db 541 slvgyviemfgknetdgwvavgtrvqnttftqtgllpgvnyffliraenshglslpspms 600

Qy 601 EPITVGTRYFNSGLDLSEARASLLSGDVVELSNASVVDSTSMKLTWQIINGKYVEGFYVY 660
Db 601 epitvgtryfnsgldlsearasllsgdvvelsnasvvdstsmkltwqiingkyvegfyvy 660

Qy 661 ARQLPNPIVNNPAPVTSNTNPLLGSTSTSASASASASALISTKPNIAAAGKRDGETNQSG 720
Db 661 arqlpnpivnnpapvtsntnpllgststsasasasasalistkpniaaagkrdgetnqsg 720

Qy 721 GGAPTPLNTKYRMLTILNGGGASSCTITGLVQYTLYEFFIVPFYKSVEGKPSNSRIARTL 780
Db 721 ggaptplntkyrmltilngggassctitglvqytlyeffivpfyksvegkpsnsriartl 780

Qy 781 EDVPSEAPYGMEALLLNSSAVFLKWKAPELKDRHGVLLNYHVIVRGIDTAHNFSRILTNV 840
Db 781 edvpseapygmealllnssavflkwkapelkdrhgvllnyhvivrgidtahnfsriltnv 840

Qy 841 TIDAASPTLVLANLTEGVMYTVGVAAGNNAGVGPYCVPATLRLDPITKRLDPFINQRDHV 900
Db 841 tidaasptlvlanltegvmytvgvaagnnagvgpycvpatlrldpitkrldpfinqrdhv 900

Qy 901 NDVLTQPWFIILLGAILAVLMLSFGAMVFVKRKHMMMKQSALNTMRGNHTSDVLKMPSLS 960
Db 901 ndvltqpwfiillgailavlmlsfgamvfvkrkhmmmkqsalntmrgnhtsdvlkmpsls 960

Qy 961 ARNGNGYWLDSSTGGMVWRPSPGGDSLEMQKDHIADYAPVCGAPGSPAGGGTSSGGSGGA 1020
Db 961 arngngywldsstggmvwrpspggdslemqkdhiadyapvcgapgspagggtssggsgga 1020

Qy 1021 GSGASGGDDIHGGHGSERNQQRYVGEYSNIPTDYAEVSSFGKAPSEYGRHGNASPAPYAT 1080
Db 1021 gsgasggddihgghgsernqqryvgeysniptdyaevssfgkapseygrhgnaspapyat 1080

Qy 1081 SSILSPHQQQQQQQPRYQQRPVPGYGLQRPMHPHYQQQQHQQQQAQQTHQQHQALQQHQQ 1140
Db 1081 ssilsphqqqqqqqpryqqrpvpgyglqrpmhphyqqqqhqqqqaqqthqqhqalqqhqq 1140

Qy 1141 LPPSNIYQQMSTTSEIYPTNTGPSRSVYSEQYYYPKDKQRHIHITENKLSNCHTYEAAPG 1200
Db 1141 lppsniyqqmsttseiyptntgpsrsvyseqyyypkdkqrhihi-enklsnchtyeaapg 1199

Qy 1201 AKQSSPISSQFASVRRQQLPPNCSIGRESARFKVLNTDQGKNQQNLLDLDGSSMCYNGLA 1260
Db 1200 akqsspissqfasvrrqqlppncsigresarfkvlntdqgknqqnlldldgssmcyngla 1259

Qy 1261 DSGCGGSPSPMAMLMSHEDEHALYHTADGDLDDMERLYVKVDEQQPPQQQQQLIPLVPQH 1320
Db 1260 dsgcggspspmamlmshedehalyhtadgdlddmerlyvkvdeqqppqqqqqliplvpqh 1319

Qy 1321 PAEGHLQSWRNQSTRSSRKNGQECIKEPSELIYAPGSVASERSLLSNSGSGTSSQPAGHN 1380
Db 1320 paeghlqswrnqstrssrkngqecikepseliyapgsvasersllsnsgsgtssqpaghn 1379

Qy 1381 V 1381
        |
Db 1380 v 1380



RESULT 3 
Y13563 

ID Y13563 standard; Protein; 1395 AA.
XX
AC Y13563;
XX
DT 30-JUL-1999 (first entry)
XX
DE Drosophila Robo 1 polypeptide.
XX
KW Comm polypeptide; Robo polypeptide; commissureless; roundabout;
KW modulation; nerve cell function.
XX
OS Drosophila sp.
XX
PN WO9925833-A1.
XX
PD 27-MAY-1999.
XX
PF 13-NOV-1998; 98WO-US24327.
XX
PR 14-NOV-1997; 97US-0065543.
XX
PA (REGC ) UNIV CALIFORNIA.
XX
PI Goodman C, Kid T, Mitchell KJ,
XX
DR WPI; 1999-338008/28.
DR N-PSDB; X55767.



PT Modulation of Robo-Comm polypeptide interactions 



PS Disclosure; Page 30-33; 56pp; English.

XX

CC The invention relates to a method for modulating the amount of Comm

CC (commissureless) polypeptide in contact with a cell expressing active 

CC Robo (roundabout) on its surface. The method comprises modulating the 

CC effective amount of Comm polypeptide in contact with the cell, where the 

CC amount of expressed active Robo is specifically modulated inversely with 

CC the modulation of the effective amount of Comm in contact with the cell. 

CC The method is used to modulate the amount of active Robo expressed on a 

CC cell. The method can be used to screen for agents that modulate Robo:Comm 

CC interactions. This is particularly useful for modulating nerve cell 

CC function. 
XX 

SQ Sequence 1395 AA; 



Query Match 24.6%; Score 1786.5; DB 20; Length 1395;

Best Local Similarity 30.1%; Pred. No. 1.1e-101;

Matches 441; Conservative 242; Mismatches 507; Indels 275; Gaps 39; 
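
The three statistics lines that precede each alignment follow a regular "name value;" pattern, so they can be read mechanically when comparing hits. The sketch below is a minimal illustration only: the function name `parse_stats` and the regular expression are assumptions made for this example and are not part of the search software that produced this report.

```python
import re

def parse_stats(lines):
    """Parse 'Name value;' fields from the per-hit statistics lines.

    Hypothetical helper: returns a dict mapping each field name to a float.
    Percent signs are dropped, so 24.6% is stored as 24.6.
    """
    stats = {}
    for line in lines:
        for field in line.split(";"):
            field = field.strip()
            if not field:
                continue
            # Shortest possible name, then whitespace, then a numeric token.
            m = re.match(r"^(.*?)\s+([-+0-9.eE%]+)$", field)
            if m:
                stats[m.group(1)] = float(m.group(2).rstrip("%"))
    return stats

# Statistics lines as they appear for the hit above.
example = [
    "Query Match 24.6%; Score 1786.5; DB 20; Length 1395;",
    "Best Local Similarity 30.1%; Pred. No. 1.1e-101;",
    "Matches 441; Conservative 242; Mismatches 507; Indels 275; Gaps 39;",
]
stats = parse_stats(example)
```

Lines damaged by scanning (for example a comma in place of a decimal point) would need to be repaired before a parser of this kind can read them.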

Qy 2 ENPRIIEHPMDTTVPKNDPFTFNCQAEGNPTPTIQWFKDGRELKT-DTGSHRIMLPAGGL 60
I I 11:1 I II: II I llhlllll : I : III: I I
Db 54 qspriiehptdlvvkknepatlnckvegkpeptiewfkdgepvstnekkshrvqfkdgal 113

Qy 61 FFLKVIHSRRESDAGTYWCEAKNEFGVARSRNATLQVAVLRDEFRLEPANTRVAQGEVAL 120 

II : : ::| I I III III I I 1 1 : 1 : 1 1 : 1 1 1 1 1 : 1 1 : 1 1;: II 1 1 : 1 1 II 

Db 114 ffyrtmqgkkeqdggeywcvaknrvgqavsrhaslqiavlrddfrvepkdtrvakgetal 173 

Qy 121 MECGAPRGSPEPQISWRKNG------QTLNLVGNKRIRIVDGGNLAIQEARQSDDGRYQC 174

:IM hi III : I 1:1 : :: : hllllllll I £ hi 1:1 
Db 174 lecgppkgipeptliwikdgvplddlkamsfgassrvrivdggnlHsnvepidegnykc 233 

Qy 175 VVKNVVGTRESATAFLKVHVRPFLIRGPQNQTAVVGSSVVFQCRIGGDPLPDVLWRRTAS 234

: :|:IMIII: I I I hh |::| : I : I I :IHt I III:: 
Db 234 iaqnlvgtressyaklivqvkpyfmkepkdqvmlygqtatfhcsvggdpppkvlwkk-e 291 

i 

Qy 235 GGNMPLRKFSWLHSASGRVHVLEDRSLKLDDVTLEDMGEYTCEADNAVGGITATGILTVH 294

] I : I : : II :::||:: ::| III III I I! hi I II 

Db 292 egnipvsrarilh deksleisnitptdegtyvceahnnvgqisaraslivh 342 

Qy 295 APPKFVIRPKNQLVEIGDEVLFECQANGHPRPTLYWSVEGNSSLLLPGYRDGRMEVTLTP 354

III I III: I : | | |:|:| h::h II hh I II I 

Db 343 appnftkrpsnkkvglngwqlpcmasgnpppsvfwtkegvstlmfpnsshgrqyva--- 399 

Qy 355 EGRSVLSIARFAREDSGKVVTCNALNAVGSVSSRTVVSVDTQFELPPPIIEQGPVNQTLP 414

II :ll I I hi : I I : I : I : I llllh II Mill 
Db 400 -adgtlqitdvrqedegyyv-csafswdsstvrvflqvssvderpppiiqigpanqtlp 457 

Qy 415 VKSIVVLPCRTLGTPVPQVSWYLDGIPIDVQEHERRNLSDAGALTISDLQRHEDEGLYTC 474
h MM I I h: h II II I :: :| : III I I III
Db 458 kgsvatlpcratgnpsprikwfhdgha--vqagnrysiiqgsslrvddlql-sdsgtytc 514

Qy 475 VASNRNGKSSWSGYLRLDTPTNPNIKFFRAPELSTYPGPPGKPQMVERGENSVTLSWTRS 534 

II I : : 1 1 = I :: I : :: II : Ml III |::: |::| I :| 
Db 515 tasgergetswaatltvekpgstsl-hraadpstypappgtpkvlnvsrtsislrwaks 572 

Qy 535 NKVGGS--SLVGYVIEMFGKNETDGWVAVGTRVQNTTFTQTGLLPGVNYFFLIRAENSHG 592

: I: ::M I I : lh II M I :ll II :| Ihlllh I 
Db 573 qekpgavgpiigytveyfspdlqtgwivaahrvgdtqvtisgltpgtsyvflvraentqg 632 

Qy 593 LSLPSPMSEPITVGTRYFN--SGLDLSEARASLLSGDVVELSNASVVDSTSMKLTWQI-- 648

: 1 : 1 1 :| I hi III II :lhl III :|l :::::::! I : 
Db 633 isvpsglsnviktieadfdaasandlsaar-tlltgksvelidasainasavrlewmlhv 691 

Qy 649 -INGKYVEGFYVYARQLPNPIVNNPAPVTSNTNPLLGSTSTSASASASASALISTKPNIA 707 

: Mill :: : I 

Db 692 sadekyveglrihykdasvp 711 

Qy 708 AAGKRDGETNQSGGGAPTPLNTKYRMLTILNGGGASSCTITGLVQYTLYEFFIVPFYKSV 767 

: :| :h:: II : 1:1 Hh ||:::: 
Db 712 saqyhsitvmd-asaesfwgnlkkytkyeffltpffeti 750 



Qy 768 EGKPSNSRIARTLEDVPSEAPYGMEALLLNSSAVFLKWKAPELKDRHGVLLNYHVIVRGI 827

Ihlllh l'l HH I :: : I :| :::| I ; ;| I I ; I

Db 751 egqpsnsktaltyedvpsappdniqigiynqtagwvrwtpppsqhhngnlygykiev--- 807

Qy 828 DTAHNFSRILTNVTIDAASPTLVLANLTEGVMYTVGVAAGNNAGVGPYCVPATLRLDPIT 887

:| I ::l hh:| : :::| III I :hl : : II III I :l Ml I

Db 808 -sagntmkvlanmtlnatttsvllnnlttgavysvrlnsftkagdgpyskpislfmdp-t 865



Qy 888 KRLDP-----------------------------FINQRDH--VNDVLTQPWFIILLGAI 916

I II I I h II ::h :

Db 866 hhvhpprahpsgthdgrhegqdltyhnngnippgdinptthkkttdylsgpwlmvlvciv 925

Qy 917 LAVLMLSFG-AMVFVKRKHMMMKQ-SALNTMRGNHTSDVLKMPSLSARNGNGYWLDSSTG 974

I 1 1 : : I MM Mil I |: |: : I :: :|: : hi I

Db 926 llvlvisaaismvyfkrkhqmtkelghlsvvsdn eitalninskeslwidhhrg 979

Qy 975 GMVWRPSPGGDSLEMQKDHIADYAPVCGAPGSPAGGGTSSGGSGGAGSGASGGDDIHGGH 1034

II. :: : II II I : :

Db 980 — wr tadtdkd sglseskllshvn 1001

Qy 1035 GSERNQQRYVGEYSNIP--TDYAEVSSFGKAPSEYGRHGNASPAPYATSSILSPHQQQQQ 1092

I: I < hi llllll : I M III: h :

Db 1002 ssqsn ynnsdggtdyaevdtrnlttfyncrkspdnptpyattmiigtsssetc 1054

Qy 1093 QQPRYQQRPVPGYGLQRPMHPHYQQQQHQQQQAQQTHQQH--QALQQHQQLPPSNIYQQM 1150

: -.11 : I : : |: : : : III
Db 1055 tkttsisadkds-gthspysdafagqvpavpvvksnylqypvepinwseflpp 1106

Qy 1151 STTSEIYPTNTGPSRSVYSEQYYYPKDKQRHIH ITENKLSNCHTYEAAPGAKQS 1204

I . I I I h :: | : | : :: | :

Db 1107 ppehpppsstygyaqgspessrkssksagsgistnqsilnasihssssggfsa 1159

Qy 1205 SPISSQFASVRRQQLPP NCSIGRESARFKVLNTDQGKNQQNLLDLDGSSMCY 1256

:| hi .11 : I I::: hi I
Db 1160 wgvspqyava----cppenvysnplsavaggtqnryqitptnqhppqlpay 1206

Qy 1257 NGLADSGCGGSPSPMAMLMSHEDEHALYHTA DGDLDDMERLYVKVDEQQPP 1307

I :| lh I : : : I : I : : I III

Db 1207 --fattgpggavppnhlpfatqrhaaseyqaglnaarcaqsracnscdalatpspmqppp 1264

Qy 1308 QQQQQLIPLV PQHPAEGHLQSWRNQSTRSSRKNGQECIKEPSELIYAP 1355

:h • 111:1: :| I h ::

Db 1265 p vpvpegwyqpvhpnshpmhptssnhqiy qcssecsd--hsr 1304

Qy 1356 GSVASERSLLSNSGSGTSSQPAGHN 1380

I = :| I " :: I lh
Db 1305 ssqshkrqlqleehgssakqrgghh 1329



RESULT 4
Y08401

ID Y08401 standard; Protein; 1395 AA.
XX

AC Y08401; 
XX 

DT 24-JUL-1999 (first entry) 
XX 

DE Drosophila sp. ROBO1 protein.
XX
KW ROBO1; ROBO2; roundabout; nerve guidance; human; murine; cell function;
KW cell morphology; screening assay.
XX
OS Drosophila sp.
XX

PN WO9920764-A1. 
XX 

PD 29-APR-1999. 

XX

PF 20-OCT-1998; 98WO-US22164.
XX

PR 14-NOV-1997; 97US-0971172.

PR 20-OCT-1997; 97US-0062921. 
XX 




Mon Jan 22 13:04:22 2001 us-09-540-245a-16.rag Page 5



PA (REGC ) UNIV CALIFORNIA.
XX
PI Goodman CS, Kidd T, Mitchell KJ, Tear G;
XX
DR WPI; 1999-312615/26.
DR N-PSDB; X57250.
XX
PT Robo polypeptides, a new immunoglobulin superfamily member
XX
PS Claim 1; Page 45-49; 80pp; English.
XX
CC This invention describes novel Robo (roundabout) polypeptides, involved
CC in nerve guidance which have been isolated from Drosophila sp.,
CC C. elegans, human and murine samples. The products of the invention can
CC be used to raise anti-Robo antibodies, which can be used to modulate cell
CC function or morphology. The Robo polynucleotides and fragments are useful
CC as probes and primers and for production of the Robo polypeptides. The
CC probes and primers are also useful in screening assays.
XX
SQ Sequence 1395 AA;



Query Match 24.6%; Score 1786.5; DB 20; Length 1395; 

Best Local Similarity 30.1%; Pred. No. 1.1e-101;

Matches 441; Conservative 242; Mismatches 507; Indels 275; Gaps 39; 

Qy 2 ENPRIIEHPMDTTVPKNDPFTFNCQAEGNPTPTIQWFKDGRELKT-DTGSHRIMLPAGGL 60

I I II: I II: II I Ihllll : I : Mh I I 
Db 54 qspriiehptdlvvkknepatlnckvegkpeptiewfkdgepvstnekkshrvqfkdgal 113 

Qy 61 FFLKVIHSRRESDAGTYWCEAKNEFGVARSRNATLQVAVLRDEFRLEPANTRVAQGEVAL 120 

I : : ::| I I III III I I 1 1 : 1 : 1 1 : 1 1 1 1 1 : 1 1 : 1 1 :||||;|| || 

Db 114 ffyrtmqgkkeqdggeywcvaknrvgqavsrhaslqiavlrddfrvepkdtrvakgetal 173 

Qy 121 MECGAPRGSPEPQISWRKNG------QTLNLVGNKRIRIVDGGNLAIQEARQSDDGRYQC 174

:HI hi Ml : I |:| j: :: : hjlllllll If |:| |:| 
Db 174 lecgppkgipeptliwikdgvplddlkamsfgassrvrivdggnllisiivepidegnykc 233 

Qy 175 VVKNVVGTRESATAFLKVHVRPFLIRGPQNQTAVVGSSVVFQCRIGGDPLPDVLWRRTAS 234

: :|:|IHIIH I I |:|: :: |::| : I : I I :|||!| I III:: 
Db 234 iaqnlvgtressyaklivqvkpyfmkepkdqvmlygqtatfhcsvggdpppkvlwkk-e 291 

Qy 235 GGNMPLRKFSWLHSASGRVHVLEDRSLKLDDVTLEDMGEYTCEADNAVGGITATGILTVH 294 

l!:|: : II :::||:: ::| II I III I I hi I I 

Db 292 egnipvsrarilh deksleisnitptdegtyvceahnnvgqisaraslivh 342 

Qy 295 APPKFVIRPKNQLVEIGDEVLFECQANGHPRPTLYWSVEGNSSLLLPGYRDGRMEVTLTP 354
III I II I: I : I I 1:1:1 |:::|: II |:|: I I I
Db 343 appnftkrpsnkkvglngwqlpcmasgnpppsvfwtkegvstlmfpnsshgrqyva--- 399

Qy 355 EGRSVLSIARFAREDSGKVVTCNALNAVGSVSSRTVVSVDTQFELPPPIIEQGPVNQTLP 414

II :|| I I hi : I I : I : I : I 1 1 1 1 1 : II Mill 
Db 400 -adgtlqitdvrqedegyyv-csafswdsstvrvflqvssvderpppiiqigpanqtlp 457 

Qy 415 VKSIVVLPCRTLGTPVPQVSWYLDGIPIDVQEHERRNLSDAGALTISDLQRHEDEGLYTC 474

I: llll I I h: I: II II I :: :| : III I I III
Db 458 kgsvatlpcratgnpsprikwfhdgha--vqagnrysiiqgsslrvddlql-sdsgtytc 514

Qy 475 VASNRNGKSSWSGYLRLDTPTNPNIKFFRAPELSTYPGPPGKPQMVEKGENSVTLSWTRS 534 

II h:lh I :: | : :: II : II! Ill I::: |::| I :| 

Db 515 tasgergetswaatltvekpgstsl-hraadpstypappgtpkvlnvsrtsislrwaks 572

Qy 535 NKVGGS--SLVGYVIEMFGKNETDGWVAVGTRVQNTTFTQTGLLPGVNYFFLIRAENSHG 592

: I: :h :| I : lh I :| I :|| || :| hlllh I 
Db 573 qekpgavgpiigytveyfspdlqtgwivaahrvgdtqvtisgltpgtsyvflvraentqg 632 

Qy 593 LSLPSPMSEPITVGTRYFN--SGLDLSEARASLLSGDVVELSNASVVDSTSMKLTWQI-- 648

:|:il h I hi III II :||:| III :|l :::::::| I : 
Db 633 isvpsglsnviktieadfdaasandlsaar-tlltgksvelidasainasavrlewmlhv 691 

Qy 649 -INGKYVEGFYVYARQLPNPIVNNPAPVTSNTNPLLGSTSTSASASASASALISTKPNIA 707
: lllll :: : I 

Db 692 sadekyveglrihykdasvp 711 



Qy 708 AAGKRDGETNQSGGGAPTPLNTKYRMLTILNGGGASSCTITGLVQYTLYEFFIVPFYKSV 767 

: :| :|::: II : I hi MM: lh::: 
Db 712 saqyhsitvmd-asaesfwgnlkkytkyeffltpffeti 750

Qy 768 EGKPSNSRIARTLEDVPSEAPYGMEALLLNSSAVFLKWKAPELKDRHGVLLNYHVIVRGI 827 

Ihllll: I I lllll I :: : I h :::| I : hi I : I 
Db 751 egqpsnsktaltyedvpsappdniqigmynqtagwvrwtpppsqhhngnlygykiev--- 807 

Qy 828 DTAHNFSRILTNVTIDAASPTLVLANLTEGVMYTVGVAAGNNAGVGPYCVPATLRLDPIT 887 

= 1 I =:M:h:l : :::| III I :|:| : : II III I :| :|| I 
Db 808 -sagntmkvlanmtlnatttsvllnnlttgavysvrlnsftkagdgpyskpislfmdp-t 865 

Qy 888 KRLDP-----------------------------FINQRDH--VNDVLTQPWFIILLGAI 916

: I . II I I I: II ::|: : 

Db 866 hhvhpprahpsgthdgrhegqdltyhnngnippgdinptthkkttdylsgpwlmvlvciv 925 

Qy 917 LAVLMLSFG-AMVFVKRKHMMMKQ-SALNTMRGNHTSDVLKMPSLSARNGNGYWLDSSTG 974 

I M::| :lh llll I h h : I :: :|: : hi I 
Db 926 llvlvisaaismvyfkrkhqmtkelghlsvvsdn eitalninskeslwidhhrg 979 

Qy 975 GMVWRPSPGGDSLEMQKDHIADYAPVCGAPGSPAGGGTSSGGSGGAGSGASGGDDIHGGH 1034 

II ::: ill III:: 
Db 980 — wr tadtdkd sglseskllshvn 1001 

Qy 1035 GSERNQQRYVGEYSNIP--TDYAEVSSFGKAPSEYGRHGNASPAPYATSSILSPHQQQQQ 1092 

h I < hi llllll : I :| III: h : 

Db 1002 ssqsn ynnsdggtdyaevdtrnlttfyncrkspdnptpyattmiigtsssetc 1054 

Qy 1093 QQPRYQQRPVPGYGLQRPMHPHYQQQQHQQQQAQQTHQQH--QALQQHQQLPPSNIYQQM 1150 

: II : I : : h : : : III 
Db 1055 tkttsisadkds-gthspysdafagqvpavpvvksnylqypvepinwseflpp 1106 

Qy 1151 STTSEIYPTNTGPSRSVYSEQYYYPKDKQRHIH ITENKLSNCHTYEAAPGAKQS 1204 

| I -I I I h :: | : | : :: | : 

Db 1107 ppehpppsstygyaqgspessrkssksagsgistnqsilnasihssssggfsa 1159

Qy 1205 SPISSQFASVRRQQLPP NCSIGRESARFKVLNTDQGKNQQNLLDLDGSSMCY 1256 

h |:| '|| : | |::: hi I 
Db 1160 wgvspqyava----cppenvysnplsavaggtqnryqitptnqhppqlpay 1206 

Qy 1257 NGLADSGCGGSPSPMAMLMSHEDEHALYHTA DGDLDDMERLYVKVDEQQPP 1307 

I :| II.: I : : : I : I : : I I I 

Db 1207 --fattgpggavppnhlpfatqrhaaseyqaglnaarcaqsracnscdalatpspmqppp 1264 

Qy 1308 QQQQQLIPLV:- PQHPAEGHLQSWRNQSTRS SRKNGQEC IKEPSELIYAP 1355 

:h • 111:1: h I |: :: 

Db 1265 p vpvpegwyqpvhpnshpmhptssnhqiy qcssecsd--hsr 1304 

Qy 1356 GSVASERSLLSNSGSGTSSQPAGHN 1380 

I : :| I » :: I lh 
Db 1305 ssqshkrqlqleehgssakqrgghh 1329 



RESULT 5
Y13566 

ID Y13566 standard; Protein; 1651 AA. 
XX 

AC Y13566; 
XX 

DT 30-JUL-1999 (first entry) 
XX 

DE Human Robo 1 polypeptide.
XX
KW Comm polypeptide; Robo polypeptide; commissureless; roundabout;
KW modulation; nerve cell function.
XX
OS Homo sapiens.
XX
PN WO9925833-A1.



PD 27-MAY-1999. 



PF 13-NOV-1998; 98WO-US24327.
XX

PR 14-NOV-1997; 97US-0065543.
XX 
PA (REGC ) UNIV CALIFORNIA.
XX
PI Goodman C, Kid T, Mitchell KJ,
XX
DR WPI; 1999-338008/28.
DR N-PSDB; X55770.
XX
PT Modulation of Robo-Comm polypeptide interactions
XX
PS Disclosure; Page 44-48; 56pp; English.
XX
CC The invention relates to a method for modulating the amount of Comm
CC (commissureless) polypeptide in contact with a cell expressing active
CC Robo (roundabout) on its surface. The method comprises modulating the
CC effective amount of Comm polypeptide in contact with the cell, where the
CC amount of expressed active Robo is specifically modulated inversely with
CC the modulation of the effective amount of Comm in contact with the cell.
CC The method is used to modulate the amount of active Robo expressed on a
CC cell. The method can be used to screen for agents that modulate Robo:Comm
CC interactions. This is particularly useful for modulating nerve cell
CC function.
XX
SQ Sequence 1651 AA;



Query Match 20.6%; Score 1498; DB 20; Length 1651; 

Best Local Similarity 28.9%; Pred. No. 7.8e-84; 

Matches 410; Conservative 221; Mismatches 526; Indels 260; Gaps 42; 
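
The percentages printed for each hit appear to follow from the other figures in the report. One reading that is consistent with every hit listed here is that Query Match is the hit score divided by the score of the query against itself (7272, the score of RESULT 1 at 100.0%), and that Best Local Similarity is Matches divided by the sum of Matches, Conservative, Mismatches and Indels. This is an inference from the numbers in this report, not a documented definition, and the function names below are hypothetical.

```python
def query_match_pct(score, self_score):
    """Assumed definition: hit score as a percentage of the query self-score."""
    return round(100.0 * score / self_score, 1)

def local_similarity_pct(matches, conservative, mismatches, indels):
    """Assumed definition: identical positions over all aligned columns."""
    columns = matches + conservative + mismatches + indels
    return round(100.0 * matches / columns, 1)

SELF_SCORE = 7272  # score of RESULT 1, reported as a 100.0% query match

# Figures taken from the statistics lines of the hit above.
qm = query_match_pct(1498, SELF_SCORE)
bls = local_similarity_pct(410, 221, 526, 260)
```

A check of this kind is also a convenient way to catch scanning errors in the statistics lines, since a misread digit in a score or a percentage breaks the relationship.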

Qy 4 PRIIEHPMDTTVPKNDPFTFNCQAEGNPTPTIQWFKDGRELKTDTG---SHRIMLPAGGL 60

MI:IM I I I :| I 11:11! |||||:|:| I ::|| |||::||:| I 
Db 68 privehpsdlivskgepatlnckaegrptptiewykggervetdkddprshrmllpsgsl 127 

Qy 61 FFLKVIHSRR-ESDAGTYWCEAKNEFGVARSRNATLQVAVLRDEFRLEPANTRVAQGEVA 119 

h I II I hi I II 1 1 : 1 : 1 1 : 1 1 1 : 1 1 |:: II II I 
Db 128 fflrivhgrksrpdegvyvcvarnylgeavshnaslevailrddfrqnpsdvmvavgepa 187 

Qy 120 LMECGAPRGSPEPQISWRKNGQTLNLVGNKRIRI-VDGGNLAIQEARQSDDGRYQCVVKN 178

:IM III III I M : I : I h :| II : II I HI hi I I
Db 188 vmecqpprghpeptiswkkdgspld---dkderitirggklmitytrksdagkyvcvgtn 244

Qy 179 VVGTRESATAFLKVHVRPFLIRGPQNQTAVVGSSVVFQCRIGGDPLPDVLWRRTASGGNM 238

:H III I I I II :: M I | hi |||:| I!: | :
Db 245 mvgeresevaeltvlerpsfvkrpsnlavtvddsaefkceargdpvptvrwrk--ddgel 302



Qy 239 PLRKFSWLHSASGRVHVLEDRSLKLDDVTLEDMGEYTCEADNAVGGITATGILTVHAPPK 298

I I : : :lh I III III |:| I |: III II

Db 303 p ksryeirddhtlkirkvtagdmgsytcvaenmvgkaeasatltvqepph 352



Qy 299 FVIRPKNQLVEIGDEVLFECQANGHPRPTLYWSVEGNSSLLL---PGYRDGRMEVTLTPE 355

l|::|::|:| :| I |:|:| |:|:| "I II: :|| I I |: I :

Db 353 fvvkprdqvvalgrtvtfqceatgnpqpaifwrregsqnllfsyqppqsssrfsvsqtgd 412

Qy 356 GRSVLSIARFAREDSGKVVTCNALNAVGSVSSRTVVSV-DTQFELPPPIIEQGPVNQTLP 414

hi I I : I II II: :: : I I : MM HUM: 

Db 413 — ltitnvqrsdvgyyi-cqtlnvagsiitkaylevtdviadrpppvirqgpvnqtva 467 

Qy 415 VKSIVVLPCRTLGTPVPQVSWYLDGIPIDVQEHERRNLSDAGALTISDLQRHEDEGLYTC 474

I II I hill : I lh : h : I : I I I : I I III 

Db 468 vdgtfvlscvatgspvptilwrkdgvlvstqdsrikqlen-gvlqir-yaklgdtgrytc 525 

Qy 475 VASNRNGKSSWSGYLRLD TPTNPNIKFFRAPELSTYPGPPGKPQMVEKGEN 525 

:|| :|:::|| h : Ihlh I I lh: : I 

Db 526 iastpsgeatwsayievqefgvpvqpprptdpnl ipsapskpevtdvsrn 575 

Qy 526 SVTLSWTRSNKVGGSSLVGYVIEMFGKNETDGWVAVGTRVQNTTFTQTGLLPGVNYFFLI 585 

MMM : I h: Ml I MM! II I I lh 

Db 576 tvtlsw-qpnlnsgatptsyiieafshasgsswqtvaenvktetsaikglkpnaiylflv 634 



Qy 586 RAENSHGLSLPSPMSEPI-TVGTRYFNSGLDLSEARASLLSGDVVELSNASVVDSTSMKL 644

II I : : I : I II :|:|: I : hi : : I h I I :|: hh:: 
Db 635 raanaygisdpsqisdpvktqdvlptsqgvdhkqvqre-lgnavlhlhnptvlssssiev 693 

Qy 645 TWQI-INGKYVEGFYVYARQLPNPIVNNPAPVTSNTNPLLGSTSTSASASASASALISTK 703 
I : :h:|: : I 

Db 694 hwtvdqqsqyiqgykilyr 712 

Qy 704 PNIAAAGKRDGETNQSGGGAPTPLNTKYRMLTILNGGGASSCTITGLVQYTLYEFFIVPF 763 

M lh: II Mil: || || 

Db 713 ■■■■psganhgesdwlvfevrtp aknsvvipdlrkgvnyeikarpf 754 

Qy 764 YKSVEGKPSNSRIARTLEDVPSEAPYGMEALLL--NSSAVFLKWKAPELKDRHGVLLNYH 821

: :l l ; : MMM II I h I :h : h I ::h: I 
Db 755 fnefqgadseikfaktleeapsappqgvtvskndgngtailvswqpppedtqngmvqeyk 814 

Qy 822 VIVRGIDTAHNFSRILTNVTIDAASPTLVLANLTEGVMYTVGVAAGNNAGVGPYCVPATL 881

I I :l I hi :: ::|: I h hi III III I : 

Db 815 vwclgnetryhi nktvdgstfsvvipflvpgirysvevaastgagsgvksepqfi 869 



Qy 882 RLDPITKRLDP--FINQRDHVNDVLTQPWFIILLGAILAVLMLSFGAMVFVKRKHMMMKQ 939

:H :'i :: ::lh II II :|| :::: I :: :| :

Db 870 qldahgnpvspedqvslaqqisdwkqpafiagigaacwiilmvfsiwly---rhrkkrn 926

Qy 940 SALNTMRGNHTSDVLKMPSLS ARNGNGYWLDSSTGGMVWRPSPGGDSLEMQKD 992

Ml: Mil : I I MM || | :

Db 927 gltstyag- — - - irkvpsf tf tptvtyqrggeav — ssgg- - -rpgllnisepaaqp 974

Qy 993 HIADYAPVCGAPGSPAGGGTSSGGSGGAGSG— ■ - ASGGDDI HGGHGSERNQQRYV - — 1044

Ml I : : 1:1 : I : I I : hi :

Db 975 wladtwpntgnnhndcsiscctagngnsdsnlttysrpadcianynnqldnkqtnlmlpe 1034



Qy 1045 ■— GEYSNIPTDYAEVSSFGKAPSEYGRHGNAS-PAPYATSSILSPHQQQQQQQPRYQ 1098 

h ::' ■ h :| : II I I I HIM :: : 
Db 1035 stvygdv-dlBnkinemktfnspnLkdgrfvnpsgqptpyattqliqsnlsnnmnn — 1089 

Qy 1099 QRPVPGYGLQRPMH-PHYQQQQH— QQQQAQQTHQQHQALQQHQQLPPSNIYQQMSTT 1153 

I I ■ ; I I Ihl I ::::::: :||: I I 
Db 1090 gsgdsgekhwkplgqqkqevapvqyniveqnklnkdyrandtvpptipynqs--- 1141 

Qy 1154 SEIYPTNTGPSRSVYSEQYYYPKDKQRHIHITENKLSNCHTYEAAPGAKQSSPISSQFAS 1213 

I III I II : II II 

Db 1142 — ydqntggs ynssd rgsstsgsqg-- 1164 

Qy 1214 VRRQQLPPNCSIGRESARFKVLNTDQGKNQQNLLDLDGSSMCYNGLADSGCGGSPSPMAM 1273 

' :: || : | | :|| I I I 

Db 1165 ■'•-hkkgartpkvpkqggmnwadll ppppah 1192 

Qy 1274 LMSHEDEHALTHTADGDLDD MERLYVKVDEQQPPQQQQQLIPLVPQHPAEGHL 1326 

I : . : I I hh: || : : :: | | : 

Db 1193 ppphsnseeynisvdesydqempcpvppannylqqdeleeeedergptppvrgaasspaa 1252 

Qy 1327 QSWRNQSTRSSRKNGQE CIKEPSELIYAP 1355 

I: :||| : : II I :| : : I 

Db 1253 vsyshqstatltpspqeelqpmlqdcpeetghmqhqp 1289 



RESULT 6 
Y08404 

ID Y08404 standard; Protein; 1649 AA. 
XX 

AC Y08404; 
XX 

DT 24-JUL-1999 (first entry)
XX
DE Human ROBO1 protein.
XX
KW ROBO1; ROBO2; roundabout; nerve guidance; human; murine; cell function;
KW cell morphology; screening assay.
XX
OS Homo sapiens.

XX 

PN WO9920764-A1. 



XX
PD 29-APR-1999.
XX
PF 20-OCT-1998; 98WO-US22164.
XX
PR 14-NOV-1997; 97US-0971172.
PR 20-OCT-1997; 97US-0062921.
XX
PA (REGC ) UNIV CALIFORNIA.
XX
PI Goodman CS, Kidd T, Mitchell KJ, Tear G;
XX
DR WPI; 1999-312615/26.
DR N-PSDB; X08404.
XX
PT Robo polypeptides, a new immunoglobulin superfamily member
XX
PS Claim 1; Page 65-71; 80pp; English.
XX
CC This invention describes novel Robo (roundabout) polypeptides, involved
CC in nerve guidance which have been isolated from Drosophila sp.,
CC C. elegans, human and murine samples. The products of the invention can
CC be used to raise anti-Robo antibodies, which can be used to modulate cell
CC function or morphology. The Robo polynucleotides and fragments are useful
CC as probes and primers and for production of the Robo polypeptides. The
CC probes and primers are also useful in screening assays.
XX
SQ Sequence 1649 AA;



Query Match 20.5%; Score 1491; DB 20; Length 1649;

Best Local Similarity 29.1%; Pred. No. 2.1e-83; 

Matches 413; Conservative 219; Mismatches 521; Indels 266; Gaps 43; 

Qy 4 PRIIEHPMDTTVPKNDPFTFNCQAEGNPTPTIQWFKDGRELKTDTG---SHRIMLPAGGL 60

111:111 I I I :l I ll:IM 1 1 1 M : I : I I ::|| |||::||:| ! 
Db 68 privehpsdlivskgepatlnckaegrptptiewykggervetdkddprshrmllpsgsl 127 

Qy 61 FFLKVIHSRR-ESDAGTYWCEAKNEFGVARSRNATLQVAVLRDEFRLEPANTRVAQGEVA 119

Ml::: |: I M hi I I 1 1 : 1 : 1 1 : 1 1 1 : 1 1 |:: II II I 
Db 128 fflrivhgrksrpdegvyvcvarnylgeavshnaslevailrddfrqnpsdvmvavgepa 187 

Qy 120 LMECGAPRGSPEPQISWRKNGQTLNLVGNKRIRI-VDGGNLAIQEARQSDDGRYQCVVKN 178
:MI III III lll:|:| h :| II : II I I |:|| |:| II I
Db 188 vmecqpprghpeptiswkkdgspld---dkderitirggklmitytrksdagkyvcvgtn 244

Qy 179 VVGTRESATAFLKVHVRPFLIRGPQNQTAVVGSSVVFQCRIGGDPLPDVLWRRTASGGNM 238

:M III I I I II :: I I II |:| III: I ||: I :

Db 245 mvgeresevaeltvlerpsfvkrpsnlavtvddsaefkceargdpvptvrwrk--ddgel 302

Qy 239 PLRKFSWLHSASGRVHVLEDRSLKLDDVTLEDMGEYTCEADNAVGGITATGILTVHAPPK 298

I I : :| :lh II III III hi II |: III II

Db 303 p ksryeirddhtlkirkvtagdmgsytcvaenmvgkaeasatltvqepph 352

Qy 299 FVIRPKNQLVEIGDEVLFECQANGHPRPTLYWSVEGNSSLLL---PGYRDGRMEVTLTPE 355

ll::h:|:l :| I I : I : I hhl ::| II: :|| I I I: I :

Db 353 fvvkprdqvvalgrtvtfqceatgnpqpaifwrregsqnllfsyqppqsssrfsvsqtgd 412



Qy 356 GRSVLSIARFAREDSGKVVTCNALNAVGSVSSRTVVSV-DTQFELPPPIIEQGPVNQTLP 414

hi I I I : I II lh :: : I I : llhl llllllh 
Db 413 — ltitnvqrsdvgyyi-cqtlnvagsiitkaylevtdviadrpppvirqgpvnqtva 467 

Qy 415 VKSIVVLPCRTLGTPVPQVSWYLDGIPIDVQEHERRNLSDAGALTISDLQRHEDEGLYTC 474

I II I hill : I lh : h : I : I I I : I I III 
Db 468 vdgtfvlscvatgspvptilwrkdgvlvstqdsrikqlen-gvlqir-yaklgdtgrytc 525 

Qy 475 VASNRNGKSSWSGYLRLD TPTNPNIKFFRAPELSTYPGPPGKPQMVEKGEN 525 

:|| : I : : : 1 1 h : 1 1 : 1 1 : I I lh: : I 

Db 526 iastpsgeatwsayievqefgvpvqpprptdpnl ipsapskpevtdvsrn 575 

Qy 526 SVTLSWTRSNKVGGSSLVGYVIEMFGKNETDGWVAVGTRVQNTTFTQTGLLPGVNYFFLI 585 

:|IHI : I h: hll I llhl II I I lh 

Db 576 tvtlsw-qpnlnsgatptsyiieafshasgsswqtvaenvktetsaikglkpnaiylflv 634 



Qy 586 RAENSHGLSLPSPMSEPI-TVGTRYFNSGLDLSEARASLLSGDVVELSNASVVDSTSMKL 644

II h:hl .11 :hh I : |:| : ; I |: I I :|: hi": 
Db 635 raanaygisdpsqisdpvktqdvlptsqgvdhkqvqre-lgnavlhlhnptvlssssiev 693 

Qy 645 TWQI-INGKYVEGFYVYARQLPNPIVNNPAPVTSNTNPLLGSTSTSASASASASALISTK 703

I : :h:h : I 

Db 694 hwtvdqqsqyiqgykilyr 712 

Qy 704 PNIAAAGKRDGETNQSGGGAPTPLNTKYRMLTILNGGGASSCTITGLVQYTLYEFFIVPF 763

: 1.1:: || :| | | : || 

Db 713 ----psganhgesdwlvfevrtp aknswipdlrkgvnyeikarpf 754 

Qy 764 YKSVEGKPSNSRIARTLEDVPSEAPYGMEALLL--NSSAVFLKWKAPELKDRHGVLLNYH 821

: :| I .,: hllh II I h I :|: : h I ::|:: I 
Db 755 fnefqgadseikfaktleeapsappqgvtvskndgngtailvswqpppedtqngmvqeyk 814 

Qy 822 VIVRGIDTAHNFSRILTNVTIDAASPTLVLANLTEGVMYTVGVAAGNNAGVGPYCVPATL 881 

I I :| ::: I hi :: ::|: I h hi III III I : 
Db 815 vwclgnetryhi nktvdgstfsvvipflvpgirysvevaastgagsgvksepqfi 869 

Qy 882 RLDPITKRLDP--FINQRDHVNDVLTQPWFIILLGAILAVLMLSFGAMVFVKRKHMMMKQ 939

:M ■ : J :: ::ll: II II =11 :::: I :: :| : 
Db 870 qldahgnpvspedqvslaqqisdwkqpafiagigaacwiilmvfsiwly-rhrkkrn 926 

Qy 940 SALNTMRGNHTSDVLKMPSLS ARNGNGYWLDSSTGGMVWRPSPGGDSLEMQKD 992 

:| I - : |:|l : I I hll II I : 

Db 927 g ltstyag ----- irkvps f tf tptvtyqrggeav — ssgg - - - rpgl In isepaaqp 974 

Qy 993 HIADYAPVCGAPGSPAGGGTSSGGSGGAGSG — ASGGDDIHGGHGSERNQQRYV — 1044 

:| I I I • : : hi : I : I I : hi : 
Db 975 wladtwpntgnnhndcsiscctagngnsdsnlttysrpadcianynnqldnkqtnlmlpe 1034 

Qy 1045 — GEYSNIPTDYAEVSSFGKAPSEYGRHGNAS- -PAPYATS- -SILSPHQQQQQQQPR 1096 

h :: h :| : II I I I lllh I II : 
Db 1035 stvygdv-dlsnkinemktfnspnlkdgrfvnpsgqptpyattiqsnlsnnmnn 1087 

Qy 1097 YQQRPVPGYGLQRPMH- - PHYQQQQH- - -QQQQAQQTHQQHQALQQHQQLPPSNIYQQMS 1151 

II II 11:1 I ::::::: :||: I I 
Db 1088 gsgdsgekhwkplgqqkqevapvqyniveqnklnkdyrandtvpptipynqs- 1139 

Qy 1152 TTSEIYPTNTGPSRSVYSEQYYYPKDKQRHIHITENKLSNCHTYEAAPGAKQSSPISSQF 1211 

Mill II : II II 

Db 1140 ydqntggs ynssd rgsstsgsqg 1162 

Qy 1212 ASVRRQQLPPNCSIGRESARFKVLNTDQGKNQQNLLDLDGSSMCYNGLADSGCGGSPSPM 1271 

• :: II : I I :|| II 
Db 1163 ;---hkkgartpkvpkqggmnwadll pppp 1186 

Qy 1272 AMLMSHEDEHALYHTADGDLDD MERLYVKVDEQQPPQQQQQLIPLVPQHPAEG 1324 

I h 'i :| I hi:: || : : :: | | ; 

Db 1189 ahppphsnseeynisvdesydqempcpvpparmylqqdeleeeedergptppvrgaassp 1248 



Qy 1325 HLQSWRNQSTRSSRKNGQE CIKEPSELIYAP 1355 

I: :Mli: : II I :| : : I 

Db 1249 aavsyshqstatltpspqeelqpmlqdcpeetghmqhqp 1287 



RESULT 7 
Y13565 

ID Y13565 standard; Protein; 1297 AA. 
XX 

AC Y13565; 
XX 

DT 30-JUL-1999 (first entry)
XX
DE C. elegans Robo polypeptide.
XX
KW Comm polypeptide; Robo polypeptide; commissureless; roundabout;
KW modulation; nerve cell function.
XX
OS Caenorhabditis elegans.
XX
PN WO9925833-A1.
XX 

PD 27-MAY-1999. 
XX 

PF 13-NOV-1998; 98WO-US24327.
XX 

PR 14-NOV-1997; 97US-0065543. 
XX 

PA (REGC ) UNIV CALIFORNIA.
XX

PI Goodman C, Kid T, Mitchell KJ, Russell C, Tear G;
XX

DR WPI; 1999-338008/28.

DR N-PSDB; X55769.

XX

PT Modulation of Robo-Comm polypeptide interactions 
XX 

PS Disclosure; Page 38-39; 56pp; English. 
XX 

CC The invention relates to a method for modulating the amount of Comm 

CC (commissureless) polypeptide in contact with a cell expressing active 

CC Robo (roundabout) on its surface. The method comprises modulating the

CC effective amount of Comm polypeptide in contact with the cell, where the

CC amount of expressed active Robo is specifically modulated inversely with

CC the modulation of the effective amount of Comm in contact with the cell.

CC The method is used to modulate the amount of active Robo expressed on a

CC cell. The method can be used to screen for agents that modulate Robo:Comm

CC interactions. This is particularly useful for modulating nerve cell 

CC function. 

XX 

SQ Sequence 1297 AA; 



Query Match 18.5%; Score 1344.5; DB 20; Length 1297;

Best Local Similarity 26.8%; Pred. No. 1.6e-74;
Matches 386; Conservative 216; Mismatches 486; Indels 353; Gaps 46; 

Qy 4 PRIIEHPMDTTVPKNDPFTFNCQAEGNPTPTIQWFKDGREL---KTDTGSHRIMLPAGGL 60

I INN: I : I I II I: : I I hill: : I . 1 1 1 1 : 1 I I 
Db 30 pviiehpidvvvsrgspatlncgakps-takitwykdgqpvitnkeqvhshrivldtgsl 88 

Qy 61 FFLKVIHSR - - RESDAGTYWCEAKNEFGVARSRNATLQVAVLRDEFRLEPANTRVAQGEV 118 

I III : -Mil hi I II I :| : I : : 1 : 1 1 : : 1 1 : I : II: 
Db 89 fllkvnsgkngkdsdagayycvasnehgevksnegslklamlredfrvrprtvqalggem 148 

Qy 119 ALMECGAPRGSPEPQISWRKNGQTLNLVGNKRIRIVDGGNIAIQEARQSDDGRYQCWKN 178 
Mil III III Mill: : I : I : III I :|| I Ml I 

Db 149 avlecspprgfpepwswrkddkelriqdmprytlhsdgnliidpvdrsdsgtyqcvann 208 




Qy 179 WGTRESATAFLKVHVRPFLIRGPQNQTAWGSSWFQCRIGGDPLPDVLWRRTASGGNM 238

:M I I I I I :| : h: I ll::|:| lh III I : |:| I
Db 209 mvgervsnparlsvfekpkfeqepkdmtvdvgaavlfdcrvtgdpqpqitwkr--knepm 266



Qy 239 PLRKFSWLHSASGRVHVLED-RSLRLDDVTLEDMGEYTCEADNAVGGITATGILTVHAPP 297 

I: I :: :l I M: I I 1 1 1 I I I I : I : I I 1 1 1 

Db 267 pvt rayiakdnrglriervqpsdegeyvcyarnpagtleasahlrvqapp 316 

Qy 298 KFVIRPKNQLVEIGDEVLFECQANGHPRPTLYWSVEGNSSLLLPGy-RDGRMEVTLTPE 355 

I :| :l I I III I I I :!! II II I I III :|: I 

Db 317 sfqtkpadqsvpaggtatfectlvgqpspayfwskegqqdllfpsyvsadgrtkvspt- 374 

Qy 356 G RSVLS I ARFAREDSGRWTCNALNAVGSVSSRT WSVDTQFEL 399 

1:1 : I I I I :!: II I: :: II 

Db 375 - -gtltleevrqvdegayv-cagmnsagsslsk- -aalkatf etkgrvqkkkskmgkqkq 429 

Qy 400 PPPIIEQGPVNQTLPVKSIWLPCRTLGTPVPQVSWYLD 438 

III II I llll I I :IM: I I I :|| I 

Db 430 knvqsiikylisavtgntpakppptiehghqnqtlmvgssailpcqasgkptpgiswlrd 489 

Qy 439 G I P I DVQEHERRNLSDAGRLT I SDLQRHEDEGLYTCVASNRNGKS SWSG YLRLDT PTNPN 498 

Mil: : I : M MM I MIM I MIM I :: I: I 

Db 490 glpiditd-srisqhstgslhiadlkk-pdtgvytciaknedgestwsasltvedhts-n 546 



Qy 499 IRFFRAPELSTYPGPPGRPQMVERGENSVTLSWTRSNRVGGSSLVGYVIEMFGRNETDGW 558 

: I |: Ml MMI : I I I : I : MM : : I 
Db 547 aqfvrmpdpsnfpssptqpiivnvtdtevelhwnapstsgagpitgyiiqyyspdlgqtw 606 

Qy 559 VAVGTRVQNTTFTQTGLLPGVNYFFLIRAENSHGLSLPSPMSEPITVGTRYFNSGL 614 

: I :M II I :l MINI |: II I :| I 
Db 607 fnipdyvasteyrikglkpshsymfviraenekgigtpsvssalvttskpaaqvalsdkn 666 

Qy 615 --DLSEARASLLSGDWELSNASWDSTSMKLTWQIIN-GKYVEGFYVYARQLPNPIVNN 671 

hM II :::| MM:I I: : ::|:|: I 
Db 667 kmdmaiaekrltseqlikleevktinstavrlfwkkrkleelidgyyikwr 717 

Qy 672 PAPVTSNTNPLLGSTSTSASASASASALISTKPNIAAAGKRDGETNQSGGGAPTPLNTRY 731 
I :| I : II I 

Db 718 -gpprtndnqyvnvtsps 734 

Qy 732 RMLT ILNGGGASSCT ITGLVQYTLYEFFI VPFYR - - - SVEGRPSN SRI ARTLEDVPSEAP 788 

• : :: I: :| lll|::|:: |: I Ml I I II I 
Db 735 tenywsnlmpftnyeffvipyhsgvhsihgapsnsmdvltaeappslpp 784 

Qy 789 YGMEALLLNSSAVFLKWKAPELKDRHG VLLNYHVI VRG IDTAHNFSR I LT NVT I DAAS PT 848 

: :ll.': : : lllh :|:| : ::: I \ | :| |:| : : : 
Db 785 edvrirmlnlttlriswkapkadgingilkgfqivivg-qapnnnr---nittneraas 839 

Qy 849 LVLANtTEGVMYTVGVAAGNNAGVGPY--CVPATLRLDPITKRL DPFIN — QR 897 

: I :| I:.'l : III : I III • : M I I ' \- 
Db 840 vtlfhlvtgmtykirvaarsnggvgvshgtsevimnqdtlekhlaaqqenesflyglink 899 

Qy 898 DHVNDVLTQPHFIILLGAILAVLMLSFGAMVFVR RRHMMMRQSALNTMRGN 948 

II ■ :|:: III : :: I : : I : : ::: I I 

Db 900 shvp "vivivailiifwiiiaycywrnsrnsdgkdrsfikindgsvh-masn 950 

Qy 949 HTSDVLKMPS LSARNGNGYWLDSST GGMWRPSPGGD 985 

: IM I:'. :: I II I I I M II 

Db 951 nlwdvaqnpnqnpmyntagrmtmnnrngqalysltpnaqdffnncddysgtmhrpg"" 1006 

Qy 986 SLEKQRDHIADYAPVCGAPGSPAGGGTSSGGSGGAGSGASGGDDIHGGHGSERNQQRYVG 1045 
:| II : I II: 

Db 1007 sehhyliyaqltggpgn 1022 

Qy 1046 EYSNIPTDYAEVSSFGRAPSEYGRHGNASPAPYATSSILSPHQQ QQQQQPRYQQ 1099 

Ml II : 1:1111:::: M : : I 
Db 1023 -.amstf ygnqyhddpspyatttlvlsnqqpawlndkmlrapampt 1066 

Qy 1100 RPVPGYGLQRI'MHP HYQQQQHQQQQAQQ THQQHQALQQHQQLPPSNI 1146 

III I I | :: : :| I : |: I ::: 

Db 1067 npvp -pepparyadhtagrrsrssrasdgrgtlngglhhrtsgsqrsdspphtdv 1120 

Qy 1147 •YQQMSTTSEIYPTNTGPSRSVYSEQYYYPKDRQRHIHITENRL SN 1191 

I I: :: II h h I II II 

Db 1121 syvqlhssd gtgsskertgerrtpp nktlmdfippppsnppppg 1164

Qy 1192 CHTYEAAPGA---KQSSPISSQFAS VRRQQLPPNCSIGRESARFKVLNTD 1238

I I: I . : hi : I I : |::| : I I 

Db 1165 ghvydtatrrqlnrgstpredtydsvsdgafarvdvnarptsrnrnlggrplkgk--rdd 1222 

Qy 1239 QGRNQQNLLDLDGSSMCYNGLADSG CGGSPSPMAMLMSHEDEHALYH 1285 

M Ml 1:1 

Db 1223 dsqrsslmmdddggsseadgensegdvprggvrkavprmgisastla hscyg 1274 

Qy 1286 T 1286 

I 

Db 1275 t 1275 



RESULT 8 
Y08403 

ID Y08403 standard; Protein; 1297 AA. 
XX 

AC Y08403; 
XX 

DT 24-JUL-1999 (first entry) 
XX 
DE C. elegans ROBO protein.
XX
KW ROBO1; ROBO2; roundabout; nerve guidance; human; murine; cell function;
KW cell morphology; screening assay.
XX
OS Caenorhabditis elegans.
XX
PN WO9920764-A1.
XX
PD 29-APR-1999.
XX
PF 20-OCT-1998; 98WO-US22164.
XX
PR 14-NOV-1997; 97US-0971172.
PR 20-OCT-1997; 97US-0062921.
XX
PA (REGC ) UNIV CALIFORNIA.
XX
PI Goodman CS, Kidd T, Mitchell KJ, Tear G;
XX
DR WPI; 1999-312615/26.
DR N-PSDB; X57252.
XX
PT Robo polypeptides, a new immunoglobulin superfamily member
XX
PS Claim 1; Page 59-63; 80pp; English.
XX
CC This invention describes novel Robo (roundabout) polypeptides, involved
CC in nerve guidance which have been isolated from Drosophila sp.,
CC C. elegans, human and murine samples. The products of the invention can
CC be used to raise anti-Robo antibodies, which can be used to modulate cell
CC function or morphology. The Robo polynucleotides and fragments are useful
CC as probes and primers and for production of the Robo polypeptides. The
CC probes and primers are also useful in screening assays.
XX
SQ Sequence 1297 AA;



Query Match 18.5%; Score 1344.5; DB 20; Length 1297;

Best Local Similarity 26.8%; Pred. No. 1.6e-74; 

Matches 386; Conservative 216; Mismatches 486; Indels 353; Gaps 46; 

Qy 4 PRIIEHPMDTTVPKNDPFTFNCQAEGNPTPTIQWFKDGREL---KTDTGSHRIMLPAGGL 60

I llllhl I : II II I: : I I hill: : I HIM I I
Db 30 pviiehpidvvvsrgspatlncgakps-takitwykdgqpvitnkeqvnshrivldtgsl 88

Qy 61 FFLKVIHSR-RESDAGTYWCEAKNEFGVARSRNATLQVAVLRDEFRLEPANTRVAQGEV 118

I III : ::IHI hi I II I :| : I : : 1 : 1 1 : : 1 1 : I : II:
Db 89 fllkvnsgkngkdsdagayycvasnehgevksnegslklamlredfrvrprtvqalggem 148

Qy 119 ALMECGAPRGSPEPQISWRKNGQTLNLVGNKRIRIVDGGNLAIQEARQSDDGRYQCWKN 178

|::M III Ul :MII: : I : I : III I :|| I III I 
Db 149 avlecspprgfp'epvvswrkddkelriqdmprytlhsdgnliidpvdrsdsgtyqcvann 208 

Qy 179 WGTRESATAFLKVHVRPFLIRGPQNQTAWGSSWFQCRIGGDPLPDVLWRRTASGGNM 238 

:|| I I I I I :| : h: I lh:|:l II: III I : hi I 
Db 209 mvgervsnparlsvfekpkfeqepkdmtvdvgaavlfdcrvtgdpqpqitwkr-knepm 266 

Qy 239 PLRKFSWLHSASGRVmED-RSLKLDDVTLEDMGEYTCEADNAVGGITATGILTVHAPP 297 

* h I :: :| I |::: I II 1 1 I I I I : I : II 1 1 1 

Db 267 pvt rayiakdnrglriervqpsdegeyvcyarnpagtleasahlrvqapp 316 

Qy 298 KFVIRPKNQLVEIGDEVLFECQANGHPRPTLYWSVEGNSSLLLPGY--RDGRMEVTLTPE 355 

I :l :| I I III I I I :ll II II II III :|: I 
Db 317 sfqtkpadqsvpaggtatfectlvgqpspayfwskegqqdllfpsyvsadgrtkvspt- 374 

Qy 356 GRSVLSIARFAREDSGKWTCNALNAVGSVSSRTWSVDTQFEL 399 

hi : I I I I :h II h :: II 
Db 375 --gtltieevrqvdegayv-cagmnsagsslsk--aalkat;fetkgrvqkkkskmgkqkq 429 

Qy 400 PPPIIEQGPVNQTLPVKSIWLPCRTLGTPVPQVSWYLD 438 

III II I llll I I :|!h MM I 
Db 430 knvqsiikylisavtgntpakppptiehghqnqtlmvgssailpcqasgkptpgiswlrd 489 



Qy 439 GIPIDVQEHERRNLSDAGALTISDLQRHEDEGLYTCVASNRNGKSSWSGYLRLDTPTNPN 498 

hill: : I : hi hlh: I hllhl I :|:|:|| I :: hi 
Db 490 glpiditd-srisqhstgslhiadlkk-pdtgvytciaknedgestwsasltvedhts-n 546 

Qy 499 IKFFRAPELSTYPGPPGKPQMVEKGENSVTLSWTRSNKVGGSSLVGYVIEMFGKNETDGW 558 

:| I I: I :| I :| :| =111 : I : Ihh : : I 
Db 547 aqfvrmpdpsnfpssptqpiivnvtdtevelhwnapstsgagpitgyiiqyyspdlgqtw 606 

Qy 559 VAVGTRVQNTTFTQTGLLPGVNYFFLIRAENSHGLSLPSPMSEPITVGTRYFNSGL — 614 

: I :l " II I :| MM |: II I :| I 
Db 607 fnipdyvasteyrikglkpshspfviraenekgigtpsvssalvttskpaaqvalsdkn 666 

Qy 615 "DLSEARASLLSGDWELSNASWDSTSMKLTWQIIN-GKYVEGFYVYARQLPNPIVNN 671 

h: I r.l :::| ::||:::| h : ::|:|: I 
Db 667 kmdmaiaekrltseqlikleevktinstavrlfwkkrkleelidgyyikwr 717 

Qy 672 PAPVTSNTNPLLGSTSTSASASASASALISTKPNIAAAGKRDGETNQSGGGAPTPLNTKY 731 
I :| I •: II I 

Db 718 -gpprtndnqyvnvtsps 734 

Qy 732 RMLTILNGGGASSCTITGLVQYTLYEFFIVPFYK- - -SVEGKPSNSRIARTLEDVPSEAP 788 

. : :: h :| llll::|:: h I llll II II I 
Db 735 tenyvvsnlmpftnyeffvipyhsgvhsihgapsnsmdvltaeappslpp 784 

Qy 789 YGMEALLLNSSAVFLKWKAPELKDRHGVLLNYHVIVRGIDTAHNFSRILTNVTIDAASPT 848 

: :|| : : : lllh :|:| : ::: I I I :| hi : : : 
Db 785 edvrirmlnlttlriswkapkadgingilkgfqivivg--qapnnnr---nittneraas 839 

Qy 849 LVLANLTEGVMYTVGVAAGNNAGVGPY--CVPATLRLDPITKRL DPFIN — QR 897 

: I :l \'.A : III :l III : I : I I : |: : 

Db 840 vtlfhlvtgmtykirvaarsnggvgvshgtsevimnqdtlekhlaaqqenesflyglink 899 

Qy 898 DHVNDVLTQPWFI ILLGAI LAVLMLSFGAMVFVK RKHMMMKQSALNTMRGH 948 

II : :h: III : :: I : : I II 

Db 900 shvp ■;-vivivailiifvviiiaycywrnsrnsdgkdrsfikindgsvh-masn 950 

Qy 949 HTSDVLKMPS LSARNGNGYWLDSST GGMVWRPSPGGD 985 

: II : h- :: I II I I I I : II 

Db 951 nlwdvaqnpnqnpmyntagrmtmnnrngqalysltpnaqdffnncddysgtmhrpg — 1006 

Qy 986 SLEMQKDHIADYAPVCGAPGSPAGGGTSSGGSGGAGSGASGGDDIHGGHGSERNQQRYVG 1045 
:| 'II : I lh 

Db 1007 sehhyhyaqltggpgn 1022 

Qy 1046 EYSNIPTDYAEVSSFGKAPSEYGRHGNASPAPYATSSILSPHQQ QQQQQPRYQQ 1099 

•:|:l II : hlllh::: :|| : : I 
Db 1023 amstf ygnqyhddpspyatttlvlsnqqpawlndkmlrapampt 1066 

Qy 1100 RPVPGYGLQRPMHP HYQQQQHQQQQAQQ THQQHQALQQHQQLPPSNI 1146 

III M I :: : =1 I : h I ::: 

Db 1067 npvp pepparyadhtagrrsrssrasdgrgtlngglhhrtsgsqrsdspphtdv 1120 

Qy 1147 •YQQMSTTSEIYPTNTGPSRSVYSEQYYYPKDKQRHIHITENKL SN 1191 

II::: ' II h h I II II 

Db 1121 syvqlhssd----gtgsskertgerrtpp nktlmdfippppsnppppg 1164 

Qy 1192 CHTYEAAPGA- --KQSSPISSQFAS VRRQQLPPNCSIGRESARFKVLNTD 1238 

I h I : : hi : I I : I "I M I 

Db 1165 ghvydtatrrqlnrgstpredtydsvsdgafarvdvnarptsrnrnlggrplkgk--rdd 1222 

Qy 1239 QGKNQQNLLDLDGSSMCYNGLADSG CGGSPSPMAMLMSHEDEHALYH 1285 

: ::l:ll I :| I I I I :| hi 
Db 1223 dsqrsslmmdddggsseadgensegdvprggvrkavprmgisastla hscyg 1274 

Qy 1286 T 1286

Db 1275 t 1275



RESULT 9
W83927

ID W83927 standard; Protein; 753 AA. 
XX

AC W83927; 
XX 

DT 01-MAR-1999 (first entry) 
XX 

DE Human T85 protein.

XX 

KW T85; FHMB-6D4; FMHV-SD4; human; neurological disorder; therapy; 

KW diagnosis. 

XX 

OS Homo sapiens.



XX 

FH Key Location/Qualifiers 

FT Peptide 1..20
FT /label= Sig_peptide
FT Protein 21..753
FT /label= Mat_protein
FT Region 525..610
FT /note= "has homology to a fibronectin type III
FT domain"
FT Region 638..727
FT /note= "has homology to a fibronectin type III
FT domain"
FT Region 43..101
FT /note= "has homology to a Ig superfamily domain"
FT Region 145..203
FT /note= "has homology to a Ig superfamily domain"
FT Region 237..298
FT /note= "has homology to a Ig superfamily domain"
FT Region 329..394
FT /note= "has homology to a Ig superfamily domain"
FT Region 433..491
FT /note= "has homology to a Ig superfamily domain"
FT Peptide 247..249
FT /note= "RGD motif"
FT Domain 516..600
FT /note= "cytokine receptor homology N-terminal
FT domain"

XX 



PN WO9848051-A2. 
XX 

PD 29-OCT-1998.
XX 

PF 17-APR-1998; 98WO-US07714. 
XX 

PR 10-OCT-1997; 97US-0062017.

PR 18-APR-1997; 97US-0044746.

XX

PA (MILL-) MILLENNIUM BIOTHERAPEUTICS INC.
XX
PI Holtzman D, McCarthy SA;
XX
DR WPI; 1999-024021/02.

DR N-PSDB; V69278.
XX

PT New isolated human FTHMA-070 and T85 proteins - used to develop

PT products for the diagnosis and therapy of disorders involving 

PT cellular processes, e.g. neuronal development. 

XX 

PS Claim 31; Fig 3; 127pp; English. 
XX 

CC This is the amino acid sequence of a novel human protein designated 

CC T85, and also referred to as FMHB-6D4 and FMHB-SD4. T85 cDNA (see

CC V69278) was identified in a human foetal brain cDNA library using a 

CC screen designed to identify genes encoding proteins having a 

CC functional signal sequence. T85 nucleic acids and polypeptides of 

CC the invention are useful as modulating agents in regulating a 

CC variety of cellular processes. They can be used for identifying 

CC compounds which bind to or modulate the activity of the polypeptides 

CC (claimed). They can also be used in screening assays, detection 

CC assays (e.g. chromosomal mapping, tissue typing, forensic biology), 

CC predictive medicine (e.g. diagnostic assays, prognostic assays, 

CC monitoring clinical trials, and pharmacogenomics), and methods of

CC treatment (e.g. therapeutic and prophylactic) e.g. for neurological

CC disorders.

XX 

SQ Sequence 753 AA; 



Query Match 17.3%; Score 1261; DB 20; Length 753;
Best Local Similarity 35.0%; Pred. No. 1e-69;

Matches 285; Conservative 129; Mismatches 284; Indels 116; Gaps 21; 

Qy 4 PRIIEHPMDTTVPKNDPFTFNCQAEGNPTPTIQWFKDGRELKTDTG---SHRIMLPAGGL 60

111:111 I I I :| I Ihlll illlhl: I I 
Db 29 privehpsdlivskgepatlnckaegrptptiewykggervetdkddprshrmllpsgsl 88 

Qy 61 FFLKVIHSRR-ESDAGTYWCEAKNEFGVARSRNATLQVAVLRDEFRLEPANTRVAQGEVA 119

II::: I:. INI: Ml IMIMIMI M II II I 
Db 89 fflrivhgrksrpdegvyvcvarnylgeavshnaslevailrddfrqnpsdvmvavgepa 148 

Qy 120 LMECGAPRGSPEPQISWRKNGQTLNLVGNKRIRI-VDGGNLAIQEARQSDDGRYQCWKN 178 

M III l|| |||:|:| |: :| II MM I Ml |:| II I 
Db 149 vmecqpprghpeptiswkkdgspld— dkderitirggklmitytrksdagkyvcvgtn 205 

Qy 179 WGTRESATAFLKVHVRPFLIRGPQNQTAWGSSWFQCRIGGDPLPDVLWRRTASGGNM 238 

ill III I II II :M I II |:| III: II: I : 
Db 206 mvgeresevaeltvlerpsfvkrpsnlavtvddsaefkceargdpvptvrwrk-ddgel 263 

Qy 239 PLRKFSWLHSASGRVHVLEDRSLKLDDVTLEDMGEYTCEADNAVGGITATGILTVHA- - - 295 

I I : :l :ll: II III III hi II I: III 

Db 264 p :ksryeirddhtlkirkvtagdmgsytcvaenmvgkaeasatltvqvgse 313 

Qy 296 PPKFVIRPKNQLVEIGDEVLFECQANGHPRPTLYWSVEGNSStLL- ■ -PGYRDGRMEVTL 352 

II IIMM: :| | |:|:| |:|:| ::| ||; :|| | | |: 

Db 314 pphfwkprdqvvalgrtvtfqceatgnpqpaifwrregsqnllfsyqppqsssrfsvsq 373 

Qy 353 TPEGRSVLSIARFAREDSGKWTCNALNAVGSVSSRTWSV-DTQFELPPPIIEQGPVNQ 411 

I : hi' II I M II II: =: M I : ll|:| MM 
Db 374 tgd----ltitnvqrsdvgyyi*cqtlnvagsiitkaylevtdviadrpppvirqgpvnq 428 

Qy 412 TLPVKSIWLPCRTLGTPVPQVSWYLDGIPIDVQEHERRNLSDAGALTISDLQRHEDEGL 471 

I: I IM 1:111 : I ||: : |: M M II : II 
Db 429 tvavdgtfvlscvatgspvptilwrkdgvlvstqdsrikqlen-gvlqir-yaklgdtgr 486 

Qy 472 YTCVASNRNGKSSWSGYLRLD TPTNPNIKFFRAPELSTYPGPPGKPQMVEK 522 

111:11 :|:::H |: : ll:||: I I ||:: : 

Db 487 ytciastpsgeatwsayievqefgvpvqpprptdpnl ipsapskpevtdv 536 

Qy 523 GENSVTLSWTKSNKVGGSSLVGYVIEMFGKNETDGWVAVGTRVQNTTFTQTGLLPGVNYF 582 

IMIII ; I I:: hll I I h I III 

Db 537 srntvtlsw-qpnlnsgatptsyiieafshasgsswqtvaenvktetsaikglkpnaiyl 595 

Qy 583 FLIRAENSHGLSLPSPMSEPI-TVGTRYFNSGLDLSEARASLLSGDWELSNASWDSTS 641 

11:11 l::h'l II :|:|: I : |:| : : I I : I I : I : |:| 
Db 596 flvraanaygisdpsqisdpvktqdvlptsqgvdhkqvqre-lgnavlhlhnptvlssss 654 

Qy 642 MKLTKQI-INGKYVEGFYVYARQLPNPIVNNPAPVTSNTNPLLGSTSTSASASASASALI 700 

::: I : ,:|::|: : I 
Db 655 ievhwtvdqqsqyiqgykilyr 676 

Qy 701 STKPNIAAAGKRDGETNQSGGGAPTPLNTKYRMLTILNGGGASSCTITGLVQYTLYEFFI 760 

M ||:: || :| I I : II 

Db 677 psg&nhgesdwlvfevrtp aknswipdlrkgvnyeika 715 

Qy 761 VPFYKSVEGKPSNSRIARTLEDVPSEAPYGMEAL 794 

II: :l I : MM: : : I 
Db 716 rpffnefqgadseikfaktleegkydeafdfhal 749 



RESULT 10 
R13144 

ID R13144 standard; Protein; 1728 AA. 
XX 

AC R13144; 
XX 
DT 04-OCT-1991 (first entry)

XX 

DE Deleted in Colorectal Carcinomas. 

XX 

KW DCC gene; cancer; diagnosis; antibodies; tumorigenesis; neoplasm.

XX 

OS Homo sapiens. 
XX 

FH Key Location/Qualifiers
FT Protein 202..1648
FT /label= DCC
FT Peptide 202..227
FT /label= sig_peptide
FT Protein 228..1648
FT /label= mat_protein

XX 

PN WO9109964-A. 



PD 11-JUL-1991.



PF 19-DEC-1990; 90WO-US07314. 
XX 

PR 04-JAN-1990; 90US-0460981.
XX 

PA (UYJO ) JOHNS HOPKINS UNIV.
XX 

PI Vogelstein B; 
XX 

DR WPI; 1991-222913/30. 

DR N-PSDB; Q12752. 
XX 

PT Human DCC gene, deleted in colorectal carcinoma - and diagnosis

PT or prognosis of neoplasms by detecting loss of gene function or 

PT expression prods., or mutation(s) 
XX 

PS Claim 44; Page 31; 51pp; English. 
XX 

CC Cells transformed with the wild-type DCC gene can be used as model 

CC systems to study cancer remission and drug therapy. DCC polypeptide 

CC expression prods, may be used to reverse the neoplastic state. 

CC X1615 represent an amino acid illegible in the specification, all 

CC other Xs are encoded by stop codons.

CC See also Q12752-55. 
XX 

SQ Sequence 1728 AA; 



Query Match 7.7%; Score 557.5; DB 12; Length 1728;

Best Local Similarity 21.9%; Pred. No. 6.8e-26;
Matches 314; Conservative 176; Mismatches 537; Indels 405; Gaps 65; 

Qy 5 RIIEHPMDTTVPKNDPFTFNCQAEGN-PTPTIQWFKDGRELKTDTGSHRIMLPAGGLFFL 63

I : II : :l II : I 1:1 III I : I I I 
Db 242 rflsepsdavtmrggnvlldcsaesdrgvpvikwkkdgihlalgmderkqqlsngslliq 301 

Qy 64 KVIHSR - RESDAGTYWCEAK -NEFGVARSRNATLQVA- VLRDEFRLEPANTRVAQGEVAL 120 

::IN : I I I III : I II I : II II I : : |: I 
Db 302 nilhsrhhkpdeglyqceaslgdsgsiisrtakvavagplr-flsqtesvtafmgdtvl 359 

Qy 121 MECGAPRGSPEPQISWRKNGQTLN-LVGNKRIRIVDGGNLAIQEARQSDDGRYQCWKNV 179 

::| II I I Ml I I : h h :: I I I ' I I hi :| 
Db 360 lkcev-igepmptihwqknqqdltpipgdsrvwlpsgalqisrlqpgdigiyrcsarnp 418 

Qy 180 VGTRESATAFLKV HVRPFLIRGPQNQTAWGSSWFQCRIGGDPLPDVLWRRTA 233 

:| I ::: I : : :: I I I: I I :| : I I I II 
Db 419 assrtgneaevrilsdpglhrqlyflqrpsnwaiegkdavleccvsgypppsftwlrge 478 

Qy 234 SGGNMPLRKFSWLHSASGRVHVLEDRSLKLDDVTLEDMGEYTCEADNAVGGITATGILTV 293 

: :hl :l :l : Ml :| I III l:|: III 

Db 479 eviqlrskkys llggsnllisnvtdddsgmytcwtyknenisasaeltv 528 

Qy 294 HAPPKFVIRPKNQLVEIGDEVLFECQANGHPRPTLYWSVEGNSSLLLPGYRDGRMEVTLT 353 
III: II :: III :| I II: I I: :::| :: 



Db 529 lvppwf lnhpsnlyayesmdief ectvsgkpvptvnwmkngd- -vvip- - -sdyf qiv - - 581 

Qy 354 PEGRSVLSIARFAREDSGKWTCNALNAVGSVSSRTWSVDTQFELPPPIIEQGPVNQTL 413 

I I II' : I I Mil:: | :| M I : 

Db 582 --ggsnlrilgwksdeg-fyqcvaeneagnaqt saqlivpkpaipsssvlpsa 632 



Qy 414 PVKSIWLPCRTLGTPVPQVSWYLDGIPIDV QEHERR NLSDAGA 457 

I Ml . : : ::ll I : :| M Ml: 

Db 633 prdvvpvl- -;- -vssrf vrlsw- - -rppaeakgniqtf tvf f sregdnreralnttqpgs 685 

Qy 458 "LTISDLQRHEDEGLYT-CVASNRNGKSSWSGYLRLDTPTNPNIKFFRAPELSTYPGP 513 

II: :| ; I : 1 1 III) :: II III III 

Db 686 lqltvgnl--'-kpeamytfrwaynewgpge ssqpikvatqpelqvpgp 731 

Qy 514 PGKPQMVEKGENSVTLSWTRSNKVGGSSLVGYVIEMF GKNETDGWVAVGTRVQN 567 

Ml::: I Ml :! ||: I 
Db 732 venlqavstsptsilitweppayang-pvqgy-rlfctevstgkeq nievdg 781 

Qy 568 TTFTQTGLLPGWFFLIRAENSHGLSLPSPMSEPITVGTRYFNSGLDLSEARASLLSGD 627 

:: II i I I I :| I III I l|: :: 

Db 782 lsykleglkk.f teyslrf laynryg- - -pgvstdditvvt lsdvpsappqnv 830 



Qy 628 WELSNASWDSTSMKLTW QIINGKYVEGFYV 659 

: II: |:|::| II :: |: : 

Db 831 sle wnsrs ikvswlpppsg tqng - f i tgyk irhr k ttr r gemetlepnn 1 wy 1 f 884 

Qy 660' YARQLPNPIVNNPAP VTSNT'-NPL LGST ST S ASASASASAL I S 701 

I: .1:111 IM I I : : : :| 

Db 885 tglekgsqysfqvsamtvngtgppsnwytaetpendldesqvpdqpsslhvrpqtnciim 944 

Qy 702 T KPNIAAAGKRDG ET NQSG 720 

: III. I I II Ml 

Db 945 swtpplnpniwrgyiigygvgspyaetvrvdskqryysierlessshyvislkafnnag 1004 

Qy 721 GGAP- •— TPL 727 

II. II: 
Db 1005 egvplyesattrsitdptdpvdyypllddfptwpdlstpmlppvgvqavalthdavrvs 1064 

Qy 728 '-NTKYRMLTI--LNGGGAS SCTITGLVQYTLYEFFIVPF 763 

:: I: I: II II III Ml :: 

Db 1065 wadnsvpknqktsevrlytvrwrtsfsasakyksedttslsytatglkpntmyefsvmvt 1124 

Qy 764 YKSVEGKPSNSRIARTLEDVPSEAPYGMEALLLNSS ■ -AVFLKWKAPELKDRHGVLLNYH 821 

II I I I I: II : || MM : :M I 
Db 1125 knrrsstwsmsahattyeaaptsapkdftvitregkpravivswqpp-leangkitayi 1182 

Qy 822 VIVRGIDTAHNFSRILTNV 1IDAASPTLVLANLTEGVMYTVGVAAGNNAGVG 873 

: I : I: II I : :| II M I: III 

Db 1183 1 ( -fytldknipiddwimetisgdrlthqimdlnldtmyyfriqarnskgvg 1232 

Qy 874 PYCVP— ATLRLD PITKRL DPFINQRDHVNDVLT 905 

I I II::: I: I :| I I : :l 

Db 1233 plsdpilfrt.lkvehpdkmandqgrhgdggywpvdtnlidrstlneppigqmhpphgsvt 1292 

Qy 906 QP WIILLGAILAVLMLSFGAMVMRKHMMMKQSALNTMRGNH--TSDVLKMP 957 

I:: :: ||:: |:: :| :: I : IM 

Db 1293 pqknsnllvi'ivvtvgvitvlvvvivavictrrssaqqrkkrathsagkrkgsqkdlrpp 1352 

Qy 958 SLSARNGNGYWLDSSTGGM- -VWRPS- - - PGGDSLEMQKDHIADYAPVCGAPGSPAGGGT 1012 

I 1: I : :|l I I :| III: I 
Db 1353 dl wihheememkniekpsgtdpagrdspiqs--cqdltpvshsqsetqlgsk 1402

Qy 1013 SSGGSG — GAGSGASGGDDIHGGHGSERNQQRYVGEYSNIPTDY AEVSSFGKA 1063 

I: II ' III I : M :l : III I lh 
Db 1403 stshsgqdteeagssms---tlerslaarraprrkl — mipmdaqsnnpavvsaipvp 1455 

Qy 1064 PSEYGRHGNA3PAPYATSSILSPHQQQQQQQPRYQQRPVP GYGLQRPMHPHY 1115 

I :: 1:1 : II I:: llll hi I 
Db 1456 tlesaqypgilpsp----tcgyph pqftlrpvpfptlsvdrgfgagr 1498 

Qy 1116 QQQQHQQQQAQQTHQQHQALQQHQQLPPSNIYQQMSTTSEIYPTNTGPSRSV 1167 

' I: : II llll I Ml Ml: I 
Db 1499 3qsvsegpttqqppmlpps---qpehssseeapsrtiptacv 1537 



RESULT 11 
R68553 

ID R68553 standard; Protein; 1447 AA.
XX

AC R68553; 
XX 

DT 05-JUL-1995 (first entry)

XX

DE Deleted in colorectal carcinoma (DCC).
XX

KW Tumour suppressor; deleted in colorectal carcinoma; antibody;

KW cancer diagnosis. 

XX 

OS Homo sapiens. 
XX 

FH Key Location/Qualifiers 

FT Misc-difference 1..1063
FT /note= "DCC epitope"
FT Misc-difference 3369..4341
FT /note= "DCC epitope"
FT Misc-difference 26..1126
FT /note= "DCC epitope on extracellular domain"
FT Misc-difference 1123..1447
FT /note= "DCC epitope on intracellular domain"

XX 

PN WO9428161-A.
XX 

PD 08-DEC-1994. 
XX 

PF 18-MAY-1994; 94WO-US05277.
XX 

PR 26-MAY-1993; 93US-0068950.
XX 

PA (UYJO ) UNIV JOHNS HOPKINS.
XX 

PI Bruskin A, Jarosz DE, Johnson K, Kinzler KW, Vogelstein B; 

PI Zabrecky JR; 

XX 

DR WPI; 1995-022830/03. 
DR N-PSDB; Q80196.
XX

PT Antibodies specific for tumour suppressor gene product, DCC -
PT useful for detecting expression of DCC gene, for cancer diagnosis.
XX
PS Claim 4; Page 24-28; 39pp; English.
XX 

CC The protein represents the DCC tumor suppressor, and epitopes are
CC identified which are used in the generation of polyclonal or
CC monoclonal antibodies against DCC. The antibodies can detect
CC DCC protein in biological samples (including tumour tissue,
CC peripheral blood mononuclear cells or a tumour biopsy lysate)
CC despite low levels of DCC expression, and are therefore useful in
CC cancer, especially colorectal carcinoma, diagnosis.

XX 

SQ Sequence 1447 AA; 



Query Match 7.6%; Score 551.5; DB 16; Length 1447;

Best Local Similarity 21.8%; Pred. No. 1.3e-25;

Matches 312; Conservative 177; Mismatches 542; Indels 397; Gaps 64; 

Qy 5 RIIEHPMDTTVPKNDPFTFNCQAEGN-PTPTIQWFKDGRELKTDTGSHRIMLPAGGLFFL 63

I : I I : :| II : I 1:1 III I : I I I 

Db 41 rflsepsdavtmrggnvlldcsaesdrgvpvikwkkdgihlalgmderkqqlsngslliq 100 

Qy 64 KVIHSR-RESDAGTYWCEAK-NEFGVARSRNATLQVA-VLRDEFRLEPANTRVAQGEVAL 120 

-III : I I I III : I II I : II II I : : |: I 
Db 101 nilhsrhhkpdeglyqceaslgdsgsiisrtakvavagplr--flsqtesvtafmgdtvl 158 

Qy 121 MECGAPRGSPEPQISWRKNGQTLN-LVGNKRIRIVDGGNLAIQEARQSDDGRYQCWKNV 179

::| I I I I 1:11 II : I: I: :: I I I I I I : I : I 



Db 159 lkcev-igeprnptihwqknqqdltpipgdsrwvlpsgalqisrlqpgdigiyrcsarnp 217 

Qy 180 VGTRESATAFLKV HVRPFLIRGPQNQTAWGSSWFQCRIGGDPLPDVLWRRTA 233 

:| I I:: I : : : : I I I : I I : I : I I I I I 
Db 218 assrtgneaevrilsdpglhrqlyflqrpsnwaiegkdavleccvsgypppsftwlrge 277 

Qy 234 SGGNMPLRKFSWLHSASGRVHVLEDRSLKLDDVTLEDMGEYTCEADNAVGGITATGILTV 293 

: =1:1 : :| : :|| :| I III hi: III 

Db 278 eviqlrskkys llggsnllisnvtdddsgmytcvvtyknenisasaeltv 327 

Qy 294 HAPPKFVIRPKNQLVEIGDEVLFECQANGHPRPTLYWSVEGNSSLLLPGYRDGRMEVTLT 353 

II I: II :: III :| I ||: I |: :::| :: 
Db 328 lvppwflnhpsnlyayesmdiefectvsgkpvptvnwmkngd--vvip---sdyfqiv-- 380 

Qy 354 PEGRSVLSIARFAREDSGKWTCNALNAVGSVSSRTWSVDTQFELPPPIIEQGPVNQTL 413 

I I I I i : I I MM:: I :| I I : 

Db 381 --ggsnlrilgwksdeg-fyqcvaeneagnaqt saqlivpkpaipsssvlpsa 431 

Qy 414 PVKSIWLPCRTLGTPVPQVSWYLDGIPIDV QEHERR-— NLSDAGA 457 

I : II . : : : ::|| I : :| : I I : |: 

Db 432 prdwpvl----vssrfvrlsw---rppaeakgniqtftvffsregdnreralnttqpgs 484 

Qy 458 - -LTISDLQRHEDEGLYT - -CVASNRNGKSSWSGYLRLDTPTNPNIKFFRAPELSTYPGP 513 

II: :l : I HI II I I :: II III III 

Db 485 lqltvgnl- — kpeamytf rwaynewgpge ssqpikvatqpelqv-pgp 530 

Qy 514 PGKPQMVEKGENSVTLSWTRSNKVGGSSLVGYVIEMF GKNETDGWVAVGTRVQN 567 

II ; I: ::| I : II :| II : I 
Db 531 ven lqavs ts pts i 1 i tweppayang - pvqgy - - r 1 f c tevs tgkeq nievdg 580 

Qy 568 TTFTQTGLLPGVNYFFLIRAENSHGLSLPSPMSEPITVGTRYFNSGLDLSEARASLLSGD 627 

:: II '. I 11:1 I :: III I lh :: 

Db 581 lsykleglkkfteyslrflaynryg— pgvstdditwt lsdvpsappqnv 629 

Qy 628 WELSNASWDSTSMKLTW QIINGKYVEGFYV 659 

:| lh |:|::| I :: |: : 

Db 630 sle wrisrsikvswlpppsgtqng-fitgykirhrkttrrgemetlepnnlwylf 683 

Qy 660 YARQLPNPIVNNPAP— -VTSNT-NPL— -LGSTSTSASASASASALIS 701 

1:1: III 1:||| : :| : ;| 
Db 684 tglekgsqysfqvsamtvngtgppsnwytaetperldldesqvpdqpsslhvrpqtnciim 743 

Qy 702 T KPNIAAAGKRDG EI NQSG 720 

: III I I II I :| 

Db 744 swtpplnpnivvrgyiigygvgspyaetvrvdskqryysierlessshyvislkafnnag 803 

Qy 721 GGAP - TPL 727 

I I lh 
Db 804 egvplyesattrsitdptdpvdyypllddfptsvpdlstpmlppvgvqavalthdavrvs 863 

Qy 728 -NTKyRMLTI - -LNGGGAS SCT ITGLVQYTLYEFF IVPF 763 

: :: I: I: II II III MM :: 

Db 864 wadnsvpknqktsevrlytvrwrtsfsasakyksedttslsytatglkpntmyefsvmvt 923 

Qy 764 YKSVEGKPSNSRIARTLEDVPSEAPYGMEALLLNSS • - AVFLKWKAPELKDRHGVLLNYH 821 

I : I I I I: II : II : h I : :| : I 

Db 924 knrrsstwsm'tahattyeaaptsapkdftvitregkpravivswqpp--leangkitayi 981 

Qy 822 VIVRGIDTAHNFSRILTNV TIDAASPTLVLANLTEGVMYTVGVAAGNNAGVG 873 

: I : I: II I : :| II : I I: III 

Db 982 1 •fytldknipiddwimetisgdrlthqimdlnldtmyyfriqarnskgvg 1031 

Qy 874 PYCVP--ATLRLD PITKRL DPFINQRDHVNDVLT 905 

I I II::: I: I :| I I : :| 

Db 1032 plsdpilfrtlkvehpdkmandqgrhgdggywpvdtnlidrstlneppigqnthpphgsvt 1091 

Qy 906 QP WFIILLGAILAVLMLSFGAMVFVKRKHMMMKQSALNTMRGNH--TSDVLKMP 957

■I:: :: II:: |:: :| :: I : I: I 
Db 1092 pqknsnllviiwtvgvitvlvvvivavictrrssaqqrkkrathsagkrkgsqkdlrpp 1151 

Qy 958 SLSARNGNGYWLDSSTGGM-VWRPS— PGGDSLEMQKDHIADYAPVCGAPGSPAGGGT 1012 

I I: I : HI I I :| III: I 

Db 1152 dl wihheememkniekpsgtdpagrdspiqs--cqdltpvshsqsetqlgsk 1201 
Qy 1013 SSGGSG----GAGSGASGGDDIHGGHGSERNQQRY-VGEYSNIPTDYAEVSSFGKAPSEY 1067

I: II III I : : I : : II I I II: I 
Db 1202 stshsgqdteeagssmstlerslaarrapraklmipmdaqsnnp---avvsaipvptles 1258 

Qy 1068 GRHGNASPAPYATSSILSPHQQQQQQQPRYQQRPVP GYGLQRPMHPHYQQQQ 1119 

:: IM : II I:: MM hi I 
Db 1259 aqypg i lpsp — tcgyph pqftlrpvpfptlsvdrgfgagr 1297 

Qy 1120 HQQQQAQQTHQQHQALQQHQQLPPSNIYQQMSTTSEIYPTNTGPSRSV 1167 

I: : II llll I ::M Ml: I 
Db 1298 sqsvsegpttqqppmlpps---qpehssseeapsrtiptacv 1336 



RESULT 12 
Y33498 



ID Y33498 standard; Protein; 1447 AA.
XX
AC Y33498;
XX
DT 19-JAN-2000 (first entry)
XX
DE Human DCC protein.
XX
KW Proapoptotic; dependence domain; p75NTR; androgen receptor; DCC;
KW huntingtin polypeptide; Machado-Joseph disease; SCA1; SCA2; SCA6;
KW atrophin-1; cell death; apoptosis; Huntington's disease; head trauma;
KW Alzheimer's disease; Kennedy's disease; spinocerebellar ataxia; stroke;
KW dentatorubropallidoluysian atrophy; cell proliferation; cell survival;
KW neoplastic; malignant; autoimmune; fibrotic.
XX
OS Homo sapiens.
XX
PN WO9945944-A1.
XX
PD 16-SEP-1999.
XX
PF 11-MAR-1999; 99WO-US05250.
XX
PR 12-MAR-1998; 98US-0041886.
XX
PA (BURN-) BURNHAM INST.
XX
PI Bredesen DE, Rabizadeh S;
XX
DR WPI; 1999-561617/47.
DR N-PSDB; 223431.
XX
PT New proapoptotic dependence peptides, used to develop products for
PT treating, e.g. Alzheimer's
XX
PS Disclosure; Page 164-168; 199pp; English.
XX
CC This invention describes novel pure proapoptotic dependence peptides
CC which comprise a sequence of an active dependence domain selected from
CC dependence polypeptides consisting of p75NTR, androgen receptor, DCC,
CC huntingtin polypeptide, Machado-Joseph disease gene product, SCA1, SCA2,
CC SCA6 and atrophin-1 polypeptide. The proapoptotic peptides are capable
CC of inducing cell death and can be used to develop products to mediate or
CC inhibit apoptosis. The methods can be used for reducing the severity of
CC a proapoptotic dependence domain mediated pathological conditions e.g.
CC Huntington's disease, Alzheimer's disease, Kennedy's disease,
CC Spinocerebellar ataxias, dentatorubropallidoluysian atrophy,
CC Machado-Joseph disease, stroke or head trauma. They can also be used for
CC reducing the severity of a pathological condition mediated by upregulated
CC cell proliferation or cell survival e.g. neoplastic, malignant,
CC autoimmune or fibrotic conditions. This sequence represents the human
CC DCC (deleted in colonic cancer) polypeptide described in the method of
CC the invention.
XX
SQ Sequence 1447 AA;



Query Match 7.6%; Score 551.5; DB 20; Length 1447; 

Best Local Similarity 21.8%; Pred. No. 1.3e-25;

Matches 312; Conservative 177; Mismatches 542; Indels 397; Gaps 64; 

5 RIIEHPMDTTVPKNDPFTFNCQAEGN-PTPTIQWFKDGRELKTDTGSHRIMLPAGGLFFL 63 

1:11: :l II : I |:| III I : I I I 
41 rflsepsdavtmrggnvlldcsaesdrgvpvikwkkdgihlalgmderkqqlsngslliq 100 

64 KVIHSR-RESDAGTYWCEAK-NEFGVARSRNATLQVA-VLRDEFRLEPANTRVAQGEVAL 120 

-III : M I III : I II I : II II I : : hi 
101 nilhsrhhkpdeglyqceaslgdsgsiisrtakvavagplr--flsqtesvtafmgdtvl 158 

121 MECGAPRGSPEPQISWRKNGQTLN-LVGNKRIRIVDGGNLAIQEARQSDDGRYQCWKNV 179 

:: MM 1:11 I I : h h :: I II : II hi :l 
159 lkcevigepmptihwqknqqdltpipgdsrvwlpsgalqisrlqpgdigiyrcsarnp 217 

180 VGTRESATAFLKV HVRPFLIRGPQNQTAWGSSWFQCRIGGDPLPDVLWRRTA 233 

:I I :::: I : : :: I I h I I :| M II II 
218 assrtgneaevrilsdpglhrqlyflqrpsnvvaiegkdavleccvsgypppsftwlrge 277 

234 SGGNMPLRKFSWLHSASGRVHVLEDRSLKLDDVTLEDMGEYTCEADNAVGGITATGILTV 293 

: :hl :l :l : :ll :l I III hh III 

278 eviqlrskkys llggsnllisnvtdddsgmytcvvtyknenisasaeltv 327 

294 HAPPKFVIRPKNQLVEIGDEVLFECQANGHPRPTLYWSVEGNSSLLLPGYRDGRMEVTLT 353 

III: II :: III :| I lh I h :::| :: 

328 lvppwflnhpsnlyayesmdiefectvsgkpvptvnwmkngd--wip---sdyfqiv" 380 

354 PEGRSVLSIARFAREDSGKWTCNALNAVGSVSSRTWSVDTQFELPPPIIEQGPVNQTL 413 

II I I ■ : I I I I I I: : Mill M 

381 --ggsnlrilgvvksdeg-fyqcvaeneagnaqt saqlivpkpaipsssvlpsa 431 



414 PVKSIWLPCRTLGTPVPQVSWYLDGIPIDV QEHERR — NLSDAGA 457 

I : II ' : : ::|| I : :| : I I : h 

432 prdvvpvl----vssrfvrlsw---rppaeakgniqtftvffsregdnreralnttqpgs 484 

458 "LTISDLQRHEDEGLYT- -CVASNRNGKSSWSGYLRLDTPTNPNIKFFRAPELSTYPGP 513 

II: :l : M llll :: II III III 

485 lqltvgnl---kpeamytfrvvaynewgpge ssqpikvatqpelqv-pgp 530 

514 PGKPQMVEKGENSVTLSWTRSNKVGGSSLVGYVIEMF GKNETDGWVAVGTRVQN 567 

II ' I: ::| M II :| II : I 
531 venlqavstsptsilitweppayang-pvqgy-rlfctevstgkeq nievdg 580 

568 TTFTQTGLLPGVNYFFLIRAENSHGLSLPSPMSEPITVGTRYFNSGLDLSEARASLLSGD 627 

:: II I Ml:: III I ||: :: 

581 lsykleglkkfteyslrflaynryg---pgvstdditvvt lsdvpsappqnv 629 

628 WELSNASWDSTSMKLTW QIINGKYVEGFYV 659 

:| Mil hh:| II :: h : 

630 sle wnsrsikvswlpppsgtqng-fitgykirhrkttrrgemetlepnnlwylf 683 

660 YARQLPNPIVNNPAP VTSNT--NPL LGSTSTSASASASASALIS 701 

h.h II I h I II : :| : :| 

684 tglekgsqysfqvsamtvngtgppsnwytaetpendldesqvpdqpsslhvrpqtnciim 743 

702 T KPNIAAAGKRDG ET NQSG 720 

: III; I I II M 

744 swtpplnpniwrgyiigygvgspyaetvrvdskqryysierlessshyvislkafnnag 803 

721 GGAP : TPL 727 

II' lh 
804 egvplyesattrsitdptdpvdyypllddfptsvpdlstpmlppvgvqavalthdavrvs 863 



Qy 728 - NTKYRMLT I - - LNGGGAS SCTITGLVQYTLYEFFIVPF 763 

; :: h h II II III hill :: 

Db 864 wadnsvpknqktsevrlytvrwrtsfsasakyksedttslsytatglkpntmyefsvmvt 923

Qy 764 YKSVEGKPSNSRIARTLEDVPSEAPYGMEALLLNSS - - AVFLKWKAPELKDRHGVLLNYH 821 

MIIIM : || : |: | : :| ; 

Db 924 knrrsstwsmtahattyeaaptsapkdf tvitregkpravivswqpp- - leangkitayi 981 



Qy 822 VIVRGIDTAHNFSRILTNV T IDAASPTLVLANLTEGVMYTVGVAAGNNAGVG 873
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: 1:1: II I : :| II : I |: III 

Db 982 1 fytldknipiddwimetisgdrlthqimdlnldtmyyfriqarnskgvg 1031 

Qy 874 PYCVP— MLRLD PITKRL DPFINQRDHVNDVLT 905 

I I II::: |: I :| I I : :| 

Db 1032 plsdpilfrtlkvehpdkraandqgrhgdggywpvdtnlidrstlneppigqmhpphgsvt 1091 

Qy 906 QP WFIILLGAILAVLMLSFGAMVFVKRKHMMMKQSALNTMRGNH--TSDVLKMP 957

I:: :: I!:: |:: :l :: I : |: I 
Db 1092 pqknsnllviiMvgvitvlmivavictrrssaqqrkkxathsagkrkgsqkdlrpp 1151 

Qy 958 SLSARNGNGYWLDSSTGGM- -VWRPS-- - PGGDSLEMQKDH IADYAPVCGAPGSPAGGGT 1012 

I h I : :ll I I :l I II : I 

Db 1152 dl wihheememkniekpsgtdpagrdspiqs--cqdltpvshsqsetqlgsk 1201 

Qy 1013 SSGGSG----GAGSGASGGDDIHGGHGSERNQQRY-VGEYSNIPTDYAEVSSFGKAPSEY 1067

I: II llll: : I : : III rlh I 
Db 1202 stshsgqdteeagssmstlerslaarrapraklmipmdaqsnnp---avvsaipvptles 1258 

Qy 1068 GRHGNASPAPYATSSILSPHQQQQQQQPRYQQRPVP GYG LQRPMHPH YQQQQ 1119 

A :: 1:1 : II |:: llll |:| I 

Db 1259 aqy pg i 1 ps p — tc gyph pqftlrpvpfptlsvdfgfgagr 1297

Qy 1120 HQQQQAQQTHQQHQALQQHQQLPPSNIYQQMSTTSEIYPTNTGPSRSV 1167
I: : II llll I ::H |: I |: I 
Db 1298 sqsvsegpttqqppmlpps— qpehssseeapsrtiptacv 1336 



RESULT 13 
W57900 

ID W57900 standard; Protein; 1192 AA. 
XX 

AC W57900; 
XX 

DT 25-SEP-1998 (first entry) 
XX 

DE Protein of clone C0722J. 

XX

KW Human; nutritional supplement; cell proliferation/differentiation;

KW cytokine; immunostimulant; immunosuppressant; haematopoiesis regulator;

KW receptor/ligand activity; cadherin/tumour invasion suppressor;

KW anti-inflammatory; tumour inhibitor; clone C0722.1.
XX 

OS Homo sapiens. 
XX 



PN WO9824905-A2. 



PD 11-JUN-1998.
XX

PF 05-DEC-1997; 97WO-US22211.
XX

PR 03-DEC-1997; 97US-0984516.
PR 05-DEC-1996; 96US-0762216.
XX

PA (GEMY ) GENETICS INST INC.



PI Agostino MJ, Jacobs K, Lavallie ER, McCoy JM, Merberg D; 

PI Racie LA, Spaulding V, Treacy M;

XX 

DR WPI; 1998-333324/29. 

DR N-PSDB; V40887. 
XX 

PT New isolated polynucleotides encoding secreted polypeptides - 

PT isolated from a human foetal kidney cDNA library, a human adult 

PT blood cDNA library or a human adult brain cDNA library 

XX 

PS Claim 36; Page 81-85; 109pp; English. 

XX

CC This sequence represents the protein of clone C0722J, of the 

CC invention. DNA encoding this sequence was isolated from a human adult 

CC brain cDNA library. The DNAs and proteins can be used as nutritional 

CC sources or supplements, or may exhibit cytokine and cell 

CC proliferation/differentiation activity, immune stimulating or suppressing 



CC activity, haematopoiesis regulating activity, receptor/ligand activity, 

CC anti-inflammatory activity, cadherin/tumour invasion suppressor activity,

CC tumour inhibition activity or other activities. 

XX 

SQ Sequence 1192 AA; 



Query Match 7.5%; Score 548.5; DB 19; Length 1192;

Best Local Similarity 21.2%; Pred. No. 1.5e-25;

Matches 273; Conservative 157; Mismatches 510; Indels 345; Gaps 

Qy 3 NPRIIE HPMDTTVPKNDPFTFNCQAEGNPTPTIQWFKDGRELKTDTGSH 51 

:|:::| ' II:: |:|:| I I: I "I I 
Db 35 dpklledlvqpptitqqspkdyiidpreniviqceakgkpppsfswtrngthfdidkdpl 94 



52 RIMLPAGGLFFLKVI -HSRRESDAGTYWCEAKNEFGVARSRNATLQV- -AVLRDEFRLEP 108 

III-::: : |: I I I hi! Ill):: : I : :lll 
95 vtmkpgtgtliinimsegkaetyegvyqctarnergaavsnnivvrpsrsplwtkeklep 154 

109 ANTRVAQGEVALMECGAPRGSPEPQISWRKNG 140 

: I: :: I I I I I I I I 
155 it- -lqsgqslvlpcrppiglpppiifwmdnsfqrlpqservsqglngdlyf snvlpedt 212 



141 



■-NLVGNKRIR" 



■ 152 



■ QTL 

' II: I = :l 

213 redyicyarfrihtqtiqqkqpisvkvisakssrerpptfltpegnasnkeelrgnvlsle 272 

153 -IVDG * GNLAIQEARQSDDGRYQCWKNWGTRES 185 

I :| I I ::| I III: II :l 

273 ciaeglptpiiyvakedgmlpknrtvyknfektlqiihvseadsgnyqciaknalgaihh 332 

186 ATAFLKVHVRPFLIRGPQNQTAWGSSWFQCRIGGDPLPDVLWRRTASGGNMPLRKFSW 245 

I :: J: I III I II 1:1 I : II 

333 -tisvrvkaapywitapqnlvlspgedgtlicrangnpkp risv 375 

246 LHSASGRVHVLEDRSLKLDDVTL EDMGEYTCEADNAVGGITATGILTVHAPPK 298 

I : -:| I |:| |: I I I I I : I : I I I 
376 ltngvpieiapddpsrkidgdtiifsnvqerssavyqcnasneygyllanafvnvlaepp 435 

299 FVIRPRNQLVEI-GDEVLFECQANGHPRPTLYWSVEGNSSLLLPG-— YRDGRMEVTL 352 

:: I I I i: MINI: II : :| :|: : 

436 riltpantlyqvianrpalldcaffgsplptievfkgakgsalhediyvlhengtleipv 495 

353 TPEGRSVLSIARFAREDSGKWTCNALNAVGSVSSRTWSV-DTQFELPPP--IIEQGP 408 

|::|| MM: : : : I : : I ::::l 
496 aqkdstgtytcvarnklgmaknevhleikdatwivkqpeyavvqrg- 541 

409 VNQTLPVKSIWLPCR TLGTPVPQVSWYLDG--IPIDVQEHERRNLSDAGALTIS 461 

1:1 I: II I I I :l I III :: 
542 smvsfeckvkhdhtlsl---tvlwlkdnrelpsd erftvdkdhlwa 585 

462 DLQRHEDEGLyTCVASNRNGKSSWSGYLRLDTPT-NPNIKFFRAPELSTYPGPPGKPQMV 520 

|: :l I Mlh I I I : II I II : I II :: 
586 dvs-dddsgtytcvanttldsvsasavlsvvaptptp ap-vydvpnppfdlelt 637 

521 EKGENSVTLSWTRSNKVGGSSLVGYVIEMFGKNETDGWVAVGTRVQNT-TFTQTGLLPGV 579

:: : II llll : I : ::|| I I I II I I II 

638 dqldksvqlsvftpgdd-nnspitkfiieyedamhkpglwhhqtevsgtqttaqlklspyv 696 

580 NYFFLIRAENSHGLSLPSPMSEPITVGTRYFNSGLDLSEARASLLSGDWELSNASWDS 639 

II I : I III llll II :l : : I : I: 

Db 697 nysfrvmavnsigkslpsease qyltkasepdk nptaveg 736 

Qy 640 TSMKLTWQIINGKYVEG FYVYARQLPNPIVNNPAPVTSNTNPLLGSTSTSA 690 

:: :||: :|| I : I II I : : :: I I 

Db 737 lgsepdnlvitwkplngfesngpglqykvswrqkdgddewtsvwanvskyivsgtptfv 796 



691 SASASASAL- - ISTKPNIAAAGKRDGE TNQSGGGAPTPLNT- 729 

II : I I II I I II : 

797 pylikvqalndmgfapepavvmghsgedlpmvapgnvrvnwnstlaevhwdpvplksir 856 



730 - 



■ -KYRMLTILNGGGASSCTITGLVQYTLYEFFIVPFYK 765 

: ::|| I : : II :: I : 
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Db 857 ghlqgyriyytfktqssskrnrrhiekkiltf--qgskthgmlpglepfshytlnvrwng 914 

Qy 766 SVEGKPSNSRIARTLEDVPSEAPYGMEALLLNSSAVFLKWKAPELKDRHGVLLNYHVIVR 825

II I I: I I III II : hi I I : : 

Db 915 kgegpaspdrvfntpegvps-apsslkivnptldsltlewdpp-shpngilteytlkyq 971 

Qy 826 GIDTAHNFSRILTNVTIDAASPTLVLANLTEGVMYTVGVAAGNNAGVGPYCVPATLRLDP 885

I:: I I :: I I III I I :|l I : 
Db 972 pinsthelgp-lvdlkipanktrwtlknlnfstrykfyfyaqtsagsgsqiteeavttvd 1030 

Qy 886 ITKRLDPFIN — QRDHVNDVLTQPWFIILLGAILAVLMLSFGAMVFVKRKH MM 936 

II: I: II III I: |: hhl : |::| : 

Db 1031 eagilppdvgagkamasrqvdiatqgwfiglmcavallilillivcflrmkggkypvk 1089 

Qy 937 MKQSALNTMRGNHTSDVLRMPSLSARNGNGYWLDSSTGGMVTO-— PSPGGDSLEMQKD 992 

hi I " I I : I: : : II II 

Db 1090 ekeda hadpeiq-pmkeddgtfgeysdaedhkplkkgsrtpsdrtvkkedsdd 1141 

•993 HIADYAPVCGAPGSPAGGGTSSGGSGGAGSGASGGDDIHGGHGSERNQQRYVGEYSNIPT 1052 
: II I I :| : I ::|:|| 
Db 1142 slvdy | gegvngqfnedgs figqys— - 1165 

Qy 1053 DYAEVSSFGKAPSEYGRHGNASPAP 1077 

I f?l I : :hl 
Db 1166 gkkekepae-gnesseapsp 1184 



RESULT 14 
W29667

ID W29667 standard; Protein; 1028 AA.

AC W29667; 
XX 

DT 18-FEB-1999 (first entry) 

XX 

DE Homo sapiens DL185_1 clone secreted protein.
XX 

KW secreted protein; DL185.1. 
XX 

OS Homo sapiens. 

XX 

PN WO9830695-A2. 
XX 

PD 16-JUL-1998. 
XX 



PF 09-JAN-1998; 98WO-US00543.
XX

PR 08-JAN-1998; 98US-0004684.
PR 09-JAN-1997; 97US-0780814.
XX

PA (GEMY ) GENETICS INST INC.
XX 

PI Agostino MJ, Jacobs K, Lavallie ER, McCoy JM, Merberg D; 

PI Racie LA, Spaulding V, Treacy M; 

XX 

DR WPI; 1998-413686/35. 
DR N-PSDB; V40529. 
XX 

PT New isolated nucleic acids and secreted proteins - obtained from
PT human adult ovary, human foetal kidney, human foetal brain and human
PT adult brain cDNA libraries
XX 

PS Claim 36; Page 91-94; 113pp; English. 
XX 

CC The sequence is that of a novel, isolated secreted protein.
XX 

SQ Sequence 1028 AA; 



Query Match 7.5%; Score 543.5; DB 19; Length 1028; 

Best Local Similarity 23.0%; Pred. No. 2.4e-25; 

Matches 263; Conservative 124; Mismatches 425; Indels 333; Gaps 44; 
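
The "Best Local Similarity" figure in these statistics blocks appears to equal the number of exact matches divided by the total number of alignment columns (matches, conservative substitutions, mismatches, and indels). This relationship is not stated anywhere in the report; it is an inference that happens to reproduce the printed percentages for the results shown here. A minimal sketch, with a hypothetical function name:

```python
def best_local_similarity(matches, conservative, mismatches, indels):
    """Percent of alignment columns that are exact matches.

    Assumption: every column of the alignment is counted exactly once
    as a match, a conservative substitution, a mismatch, or an indel.
    """
    columns = matches + conservative + mismatches + indels
    return round(100.0 * matches / columns, 1)


# Counts copied from the RESULT 14 (W29667) statistics lines.
print(best_local_similarity(263, 124, 425, 333))  # prints 23.0
```

The same function gives 21.2 for the RESULT 13 counts (273, 157, 510, 345) and 21.7 for the RESULT 15 counts (286, 166, 458, 410), which agrees with the printed values.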



Qy 4 PRIIEHPMDTTVP— KNDPFTFNCQAEGNPTPTIQWFKDGRELKTDTGSHRIMLPAGGL 60 

I : I I ' I || | | |:| :| ::| :: | |: I 

Db 26 piftqephdvifpldlsksevilncaangypsphyrwkqngtdidf-tmsyhyrldggsl 84 

Qy 61 FFLKVIHS-RRESDAGTYWCEAKNEFGVARSRNATLQVAVLRDEFRLEPANTRVAQGEVA 119 

1:1 ' : I I I I I I I II I II I : I : I :|: 
Db 85 ■---ainsphtdqdigmyqclatnllgtilsrkaklqfayiedfetktrstvsvregqgv 140 

Qy 120 LMECGAPRGSPEPQISWRKNGQTLNL-VGNKRIRIVDGGNLAIQEARQSDDGRYQCWKN 178 

:: II I : :l I I : 1:1 : 1 1 1 I : 1 1 I I I : I 
Db 141 vllcgppphfgdlsyawtfndnplyvqednrrfvsqetgnlyiakvepsdvgnytcfitn 200 

Qy 179 WGTR ' ESATAFLKVHVRPFLIRGPQNQTAWGSSWFQCRIGGDPLPDVL 228 

I ' : : : :| |: I III :| |:|:||: 
Db 201 keaqrsvqgpptplvqrtdgvmgeyepkievrfpetiqaakdssvklecfalgnpvpdis 260 

Qy 229 WRRTASGGNMPLRKFSWLHSASGRVHVLEDRS-LKLDDVTLEDMGEYTCEADNAVGGITA 287

III I :1 hi : :: h: : II I I II I I I 

Db 261 wrr-ldgsplp gkvkysksqaileipnfqqedegfyeciasnlrgrnla 308 

Qy 288 TGILTVHAPPKFVIRPKNQLVEIGDEVLFECQANGHPRPTLYWSVEG N 335 

I I =111:: : :| : I I :hlhhl I I I I I 
Db 309 kgqlifyappeweqkiqnthlsiydnllweckasgkpnpwytwlkngerlnpeeriqien 368 

Qy 336 SSLLLPG YRDGRMEV 350 

:h: I : : I 

Db 369 gtliitmlnvsdsgvyqcaaenkyqiiyanaelrvlasapdfskspvkkksfvqvggdiv 428 

Qy 351 ; TLTPEGR SVLSIARFAREDSGKWTCNALNAVGS 384 

II I II I hi II I I h 
Db 429 igckpnafpraaiswkrgtetlrqskriflledgslkiynitrsdags-ytciatnqfgt 487 

Qy 385 VSSRTWSVDTQFELPPPIIEQGPVNQTLPVKSIWLPCRTLGTPVPQV-S 434 

I llh:| I : I :|llh I :| 

Db 488 akntgslivkertvitvp pskmdvtvgesivlpcqvshdpsievvfv 534 

Qy 435 WYLDGIPID-'-;-VQEHERRNLSDAGALTISDLQRHEDEGLYTCVASNRNGKSSWSGYLR 490 

I: :h 11,1 II II I ::l I III 
Db 535 wffngdvidlkkgvahferiggesvgdlmirniqlhh-sgkylctvqt 581 

Qy 491 LDTPTNPNIKFFRAPELSTYPGPPGKPQMVEKGENSVT— LSWTRSNKVGGSSLVGYVI 547 

■ I INI h h : I I III h I : : I 
Db 582 tleslsavadiivrgppgppedvqvedissttsqlsw-ragpdnnspiqifti 633 

Qy 548 EMFGKNETD---GWVAVGT RVQNTTFTQTGLLPGVNYFFLIRAENSHGLSLPS 597 

: I II II I : I I II I I I I : I II h II 
Db 634 q----trtpfsvgwqavatvpeilngktynatv--vglspwveyefrvvagnsigigeps 687 

Qy 598 PMSE -PITVGTRYFNSGLDLSEARASLLSGDWELSNASWDSTSMKLTWQII 649 

II ,.h I II : :|h I 

Db 688 epsellrtkasvpvvapvnihggggsrse lvitwesi 724 

Qy 650 NGKYVEGF-YVYARQLPNPIVNNPAPVTSNTNPLLGSTSTSAS--ASASASALIS 701 

II III h h I : 1 1 1 : I :| :| : 

Db 725 peelqng---egfgyi imfrp vgsttwskekvssvessrfvy 763 

Qy 702 TRPNIAAAGKRDGETNQSGGGAPTPLNTKYRMLTILNGGGASSCTITGLVQYTLYEFFIV 761 

:| ■ ||: : : I I 
Db 764 rnesi -■ iplspfevkvgvynneg 785 

Qy 762 PFYKSVEGKPSNSRIARTLEDVPSEAPYGMEALLLNSSAVFLKWKAPELKDRHGVLLNYH 821 

1111:111111 ::| : : I I I :| I 
Db 786 egslstvtivysgedepqlaprgtslqsfsasemevswnaiawnrntgrvlgye 839

Qy 822 VIVRGIDTAHNF- - -SRILTNVTIDAASPTLVLANLTEGVMYTVGVAAGNNAGVGPYCVP 878 

h I: : hill I : I :| I I I II II I 
Db 840 vlywtdds kes'migk irvsgnvt tknitglkantiyfasvrayntagtgpsspp 893 

Qy 879 ATLRLDPITKRLDPFINQRDHVNDVLTQP WFI ILLGAILAVLMLSFGAMVF 929 

: II:. I :H I : : I I : :: 

Db 894 vnv — ttkk'spp sqppaniawkltnsklclnwehvktmenesevlg 937 

Qy 930 VKRKHMMMKQSALNTMRGNHTSDVLKMPSLSARNGNGYWLDSSTGGMVWRPSPGGDSLEM 989 
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I : =11 : : 1 : 1 1 I :| 
938 ykilyrqnrqskthiletnntsaellvpf-- 



Qy 990 QKDHI 994 
:: I 

Db 987 eeiri 991 



RESULT 15 
W42087 

ID W42087 standard; Protein; 1571 AA. 
XX 

AC W42087; 



I I I III 
-eedylieirt— 



XX
DT 28-SEP-1998 (first entry)
XX
DE Human Down syndrome-cell adhesion molecule DS-CAM2.
XX
KW DS-CAM2; Down syndrome-cell adhesion molecule; neural cell;
KW signal transduction; trisomy 21; mental retardation;
KW holoprosencephaly; corpus callosum agenesis;
KW schizencephaly; diagnosis; assay; human.
XX
OS Homo sapiens.
XX
PN WO9817795-A1.
XX
PF 23-OCT-1997; 97WO-US19547.
XX
PR 25-OCT-1996;
XX
PA (CEDA-) CEDARS SINAI MEDICAL CENT.
XX
PI Korenberg JR;
XX
DR WPI; 1998-271791/24.
DR N-PSDB; V31988.
XX
PT New isolated Down's Syndrome-cell adhesion molecule - used to
PT develop products for detection, diagnosis and therapy of
PT developmental and neurological abnormalities
XX
PS Claim 2; Page 90-95; 109pp; English.
XX
CC This polypeptide comprises Down syndrome-cell adhesion molecule
CC DS-CAM2, an extracellular soluble protein belonging to a novel
CC subclass of the Ig superfamily with highest homology to neural cell
CC adhesion molecules. Its amino acid sequence was deduced from cDNA
CC clones (see V31982) isolated from a trisomy 21 foetal brain library.
CC It is a splice variant of membrane-bound DS-CAM1 (see W42086), and
CC lacks the entire transmembrane domain of DS-CAM1. The invention
CC provides human and murine DS-CAM nucleic acid sequences (see also 
CC V31981, V31985-87), expression vectors and host cells, transgenic 
CC animals, antibodies, antisense oligonucleotides, and primers 
CC derived from DS-CAM nucleic acids. DS-CAM polypeptides are associated 
CC with developmental and neurological processes. They can be used in
CC e.g. neural prosthetic devices used in entubulation methods of 
CC repairing (regenerating) damaged or severed peripheral nerves, and 
CC also in bioassays to identify agonists and antagonists. The products 
CC can also be used in detection, diagnosis and therapy of developmental 
CC and neurological abnormalities such as Down syndrome, mental 
CC retardation, holoprosencephaly, agenesis of the corpus callosum, 
CC or schizencephaly. 
XX 

SQ Sequence 1571 AA; 



Query Match 7.4%; Score 538; DB 19; Length 1571;

Best Local Similarity 21.7%; Pred. No. 9.5e-25;

Matches 286; Conservative 166; Mismatches 458; Indels 410; Gaps 55; 



25 CQAEGNPTPTIQWFKD GRELKTDTGSHRIMLPAGGLFFLKVIHSRRESDAGTY 77 

hi hi I :! II II II II :l : I Ihhl 

246 ckalghpepdyrwlkdnmplelsgrfqktvtg llienirpsdsgsy 291 

78 WCEAKNEFGVARSRNATLQVAVLRDEFRLEPANTRVAQGEVALMECGAPRGSPEPQISWR 137 

II I :| h I: : I : : I : I |: : ::|| 

292 vcevsnrygtakvigrlyvkqplk--atisprkvkssvgsqvslscsv-tgtedqelswy 348 

138 KNGQTLNLVGNKRIRIVDGGNLAIQEARQSDDGRYQCWK 177 

:||: II Ml II : =11 I III h 

349 rngeilnpgknvritginhenlimdhmvksdggayqcfvrkdklsaqdyvqwledgtpk 408 



Qy 178 - NWGT 182 

II II 



Db 409 iisafsekwspaepvslmcnvkgtplptitwtldddpilkggshrisqmitsegnvlsy 468 

Qy 183 RESATAFL— KVHVR-PFLIRGPQNQTAWGSSWFQCRIG 220 

? II I :::ll I II :l lh I lh 

Db 469 lnisssqvrdggvyrctannsagvvlyqarinvrgpasirpmknitaiagrdtyihcrvi 528 

Qy 221 GDPLPDVLWRRTAS 234 

II : I ; :: 

Db 529 gypyysikwyknsnllpfnhrqvafenngtlklsdvqkevdegeytcnvlvqpqlstsqs 588 

Qy 235 - GGNMPLRKFSWLHSA- - - SGRVHVLEDR 259 

I::|: :| I : I I 

Db 589 vhvtvkvppfiqpfefprfsigqrvfipcvvvsgdlpi-titwqkdgrpipgslgvtidn 647 

Qy 260 SLKLDDVTLEDMGEYTCEADNAVGGITATGILTVHAPPKFVIRPKNQLVEIGDEV 314 

lh: i::| I III I I : II llllh:h:l I I 
Db 648 idftsslrisnlslmhngnytciarneaaavehqsqlivrvppkfwqprdqdgiygkav 707 

Qy 315 LFECQANGHPRPTLYWSVEGNSSL--LLPGYRDGRMEVTLTPEGRSVLSIARFAREDSGK 372 

: I I hi lh I : : I :||::| I I III 

Db 708 ilncsaegypvptivwkfskgagvpqfqpialngriqvl — sngsllikhweedsgy 763 

Qy 373 WTCNALNAVGS-VSSRTWSVDTQFELPPPIIEQGPVNQTLPVK-SIWLPCRTLGTPV 430 

: I I II: I ::| I :| I I II : : I I 

Db 764 yl*ckvsndvgadvsksmyltvki pamitsyp-nttlatqgqkkemsctahgekp 816 

Qy 431 PQVSWYLDGIPIDVQEHERR NLSDAGALT ISDLQ — RHEDEGLYTCV 475 

II III : : I II II III ::| 

Db 817 iivrw * — ekedriinpemarylvstkevgeevistlqilptvredsgffsch 866 

Qy 476 ASNRNGKSSWSGYLRLDTPTNPNIKFFRAPELSTYPGPPGKPQMVEKG--ENSVTLSWTR 533

I I h ! ::l I II h: I ::ll II 

Db 867 ainsyged-rgiiql tvqeppdppeieikdvkartitlrwtm 907 

Qy 534 SNKVGGSSLVGYVIEMFGKNETDGWVA VGTRVQNTTFTQTGLLPGVNYFFLIRA 587 

II : 1.1 II lh:| I : I :: : I : I I : I 
Db 908 gfd-gnspitgydiec--knksdswdsaqrtkdvspqlnsatiid--ihpsstysirmya 962 

Qy 588 ENSHGLSLPSPMSEPITVGTRYFNSGLDLSEARASLLSGDWELSNASVVDSTSMKLTWQ 647 

:| Mil :|: I : I h : M |:::||: 

Db 963 knrigksep---snelti tadeaapdgppqev-hlepissqsirvtwk 1006 

Qy 648 IINGKYVEGFYVYARQLP— NPIVNNPAPVTS NTNP LLGS 685 

: I! : I: : I: I I : II II |: 

Db 1007 apkkhlqng-iirgyqigyreystggnfqfniisvdtsgdsevytldnlnkftqyglvvq 1065 

Qy 686 TSTSASASASASALISTKPNIAAAGKRDGETNQSGGGAPTPLNTKYRMLT - - ILNG — 739 

I h. :hl : I h :l :: : h III 

Db 1066 acnragtgpssqeiitt-tledvpsyppenvqaiatspesisiswstlskealngilqg 1123 

Qy 740 - — GGASSCTIT GLVQYTLYEFFIVPFYKSVEGKPSNSRIAR 778 

I : I I 11:111 : : I : : : I I I 

Db 1124 frviywanlmdgelgeiknitttqpsleldglekytnysiqvlaftragdgvrseqiftr 1183 

Qy 779 TLEDVPSEAPYGMEALLLNSSAVFLKWKAPELKDRHGVLLNYHVIVRGIDTAHNFSRILT 838 

I llll M::| ::| lh I I II :|:: II :M ::: 

Db 1184 tkedvpg-ppe.gvkaaaasasmvfvsw-lpplk-lngiirkytvf cshpyptvis 1235 

Qy 839 NVTIDAASPTLVLANLTEGVMYTVGVAAGNNAGVGPYCVPATLRLDPITK- - -RLDPFIN 895 
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I : : ||: |:| II :||| : ::|: | |: I 
Db 1236 efeaspdsfsyripnlsrnrqysvwwavtsagrgn-sseiitveplakaparlltf-- 1291 

Qy 896 QRDHVNDVLTQPWFIILLGAILAVLMLSFGAMVFVKRKHMMMKQSALNTMRGNHTSDVLK 955

: :| II : ::| I: III I I :: 

Db 1292 sgtvttpw mkdivlpckavgdpspavkwmkds ngtpslvt 1331 

Qy 956 MPSLSARNGNGYWL DSSTGGMVWRPSPGGDS-LEMQKDHIADYAPVCGAPGS 1006 

: : II :: II : : I I I :| I I 
Db 1332 idgrrsifsngsfiirtvkaedsgyysciannnwgsdeiilnlqvqvppd qprl 1385 

Qy 1007 PAGGGTSSGGSGGAGSGASGGDDIHGGHGSERNQQRYVGEYSNIPTDYAEVSSFGKAPSE 1066 

III : Mill |: :|| : || :||| 

Db 1386 tvskttsssitlswlpgdaggssirg yilqysednse- -qwgsfpispse 1433 




Search completed: January 22, 2001, 12:17:58
d time: 1635 sec



GenCore version 4.5
Copyright (c) 1993 - 2000 Compugen Ltd.

OM protein - protein search, using sw model

Run on:          January 22, 2001, 12:24:13 ; Search time 325.28 Seconds
                 (without alignments)
                 288.277 Million cell updates/sec

Title:           US-09-540-245A-16
Perfect score:   7272
Sequence:        1 GENPRIIEHPMDTTVPKNDP RSLLSNSGSGTSSQPAGHNV 1381

Scoring table:   BLOSUM62
                 Gapop 10.0 , Gapext 0.5

Searched:        195891 seqs, 67900655 residues

Total number of hits satisfying chosen parameters: 195891

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries

Database :       PIR_66:*
                 pir1:*
                 pir2:*
                 pir3:*
                 pir4:*

Pred. No. is the number of results predicted by chance to have a
score greater than or equal to the score of the result being printed,
and is derived by analysis of the total score distribution.

SUMMARIES

                   %
                 Query
  No.    Score   Match  Length  DB  ID      Description

    1   1489.5    20.5    1612   2  T30805  dutt1 protein - mo
    2   1455.5    20.0    1651   2  T14160  transmembrane rece
    3   1361.5    18.7    1273   2  T42405  sax-3 protein - Ca
    4   1256      17.3    1344   2  T14316  rig-1 protein - mo
    5    717       9.9     423   2  T29549  hypothetical prote
    6    658       9.0     874   2  T29548  hypothetical prote
    7    602       8.3    1028   2  I58164  BIG-1 protein - ra
    8    595.5     8.2    1427   2  I51669  tumor suppressor -
    9    593       8.2    1232   2  T43027  neural cell adhesi
   10    581       8.0    1028   2  A53449  plasmacytoma-assoc
   11    579.5     8.0    1277   2  T30532  neural cell adhesi
   12    568       7.8    1036   2  S22383  axonin 1 precursor
   13    566       7.8    1259   2  A43425  Bravo/Nr-CAM cell
   14    565.5     7.8    1272   2  S26180  neurofascin - chic
   15    560       7.7    1443   2  I50600  neogenin - chicken
   16    558.5     7.7    1268   1  A39640  neural cell adhesi
   17    551.5     7.6    1447   2  A54100  tumor suppressor p
   18    544.5     7.5    1040   2  A49356  transient axonal g
   19    543.5     7.5    1018   2  JC4211  neural adhesion pr
   20    543       7.5    1197   2  T30581  neural cell adhesi
   21    540.5     7.4    1040   2  A34695  axonal glycoprotei
   22    538       7.4    1896   2  T08851  Down syndrome cell
   23    536.5     7.4    1018   2  A54744  contactin 1 precur
   24    531.5     7.3    1020   2  S05944  neuronal cell surf
   25    529       7.3    1239   1  A32579  neuroglian - fruit
   26    525.5     7.2    6642   2  T29757  protein UNC-89 - C
   27    524.5     7.2    1021   2  A57112  contactin precurso
   28    521       7.2    2029   1  TDFFLK  protein-tyrosine-p
   29    517.5     7.1    1010   2  JU0094  F11 protein precur
   30    517       7.1    1907      S50893  protein-tyrosine-p
   31    516.5     7.1    1091      S01998  contactin precurso
   32    511.5     7.0    1260      S05479  neural cell adhesi
   33    509       7.0    1257      A41060  neural cell adhesi
   34    497.5     6.8    1259      S36126  neural cell adhesi
   35    495.5     6.8    1897      TDHULK  leukocyte antigen-
   36    487       6.7    1894      C54689  protein-tyrosine-p
   37    486       6.7    1898      S46216  leukocyte antigen-
   38    486       6.7    5175      T20992  hypothetical prote
   39    485.5     6.7    1912      A56178  protein-tyrosine-p
   40    484       6.7    5198      T43290  hemicentin precurs
   41    478       6.6    2783      T34416  hypothetical prote
   42    476.5     6.6    1265      A37967  neural cell adhesi
   43    475.5     6.5    2222      T13924  sdk protein - frui
   44    475       6.5    1256      T03096  CDO protein - rat
   45    461.5     6.3    1863      S46217  protein-tyrosine-p



ALIGNMENTS



RESULT 1
T30805

dutt1 protein - mouse
N;Alternate names: transmembrane receptor protein Robo1 homolog
C;Species: Mus musculus (house mouse)
C;Date: 22-Oct-1999 #sequence_revision 22-Oct-1999 #text_change 22-Oct-1999
C;Accession: T30805
R;Wu, M.C.; Lowe, N.; Fordham, R.; Rabbitts, P.
submitted to the EMBL Data Library, July 1998
A;Description: The mouse homologue of human DUTT1/H-robo1 gene: protein sequence
A;Reference number: Z20879
A;Accession: T30805
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: mRNA
A;Residues: 1-1612 <WUM>
A;Cross-references: EMBL:Y17793; NID:e1329712; PID:e1329713; PIDN:CAA76850.1
A;Experimental source: brain
C;Genetics:
A;Gene: dutt1
A;Map position: 16 (

Query Match 20.5%; Score 1489.5; DB 2; Length 1612;
Best Local Similarity 28.9%; Pred. No. 2.8e-72;
Matches 402; Conservative 231; Mismatches 516; Indels 243; Gaps

Qy 4 PRIIEHPMDTTVPKNDPFTFNCQAEGNPTPTIQWFKDGRELKTDTG---SHRIMLPAGGL 60

111:111 1 I 1 :l 1 11:111 11111:1:1 1 ::|| l!l::||:| 1

Db 29 PRIVEHPSDLIVSKGEPATLNCKAEGRPTPTIEWYKGGERVETDKDDPRSHRMLLPSGSL 88

Qy 61 FFLKVIHSRR-ESDAGTYWCEAKNEFGVARSRNATLQVAVLRDEFRLEPANTRVAQGEVA 119

ll|:::| h 1 1 1 1 Ul 1 1 1 1 1 : 1 : 1 1 : 1 1 1 : 1 1 1" II II 1

Db 89 FFLRIVHGRKSRPDEGVYICVARNYLGEAVSHNASLEVAILRDDFRQNPSDVMVAVGEPA 148

Qy 120 LMECGAPRGSPEPQISWRKNGQTLNLVGNKRIRI-VDGGNLAIQEARQSDDGRYQCWKN 178

:!!! Ill III MM: |: :| II : II M Mil |:| 1 I

Db 149 VMECQPPRGHPEPTISWKKDGSPLD---DKDERITIRGGKLMITYTRKSDAGKYVCVGTN 205

Qy 179 WGTRESATAFLKVHVRPFLIRGPQNQTAWGSSWFQCRIGGDPLPDVLWRRTASGGNM 238

:M III 1 1 1 II :: 1 1 II |:| |||:| III: 1 :

Db 206 MVGERESEVAELTVLERPSFVKRPSNLAVTVDDSAEFRCEARGDPVPTVRWRK--DDGEL 263

Qy 239 PLRKFSWLHSASGRVHVLEDRSLKLDDVTLEDMGEYTCEADNAVGGITATGILTVHAPPK 298

1 1 : : :l|: II III III |:| II |: III II

Db 264 P KSRYEIRDDHTLKIRKVTAGDMGSYTCVAENMVGKAEASATLTVQEPPH 313

Qy 299 FVIRPKNQLVEIGDEVLFECQANGHPRPTLYWSVEGNSSLLL---PGYRDGRMEVTLTPE 355

ll::|::|:| :| 1 |:|:| |:|:| ::| II: :|l 1 1 I: 1 :

Db 314 FWKPRDQWALGRTVTFQCEATGNPQPAIFWRREGSQNLLFSYQPPQSSSRFSVSQTGD 373

Qy 356 GRSVLSIARFAREDSGKWTCNALNAVGSVSSRTWSV-DTQFELPPPIIEQGPVNQTLP 414

hi '111:1 II II: :: : 1 1 : MM lllllll:

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Db 374 — LT ITNVQRSDVGYY I -CQTLNVA6S I ITKAYLEVTDVIADRPPPVIRQGPVNQTVA 428 

Qy 415 VKSIWLPCRTLGTPVPQVSWYLDGIPIDVQEHERRNLSDAGALTISDLQRHEDEGLYTC 474 

I ::M Ml : I II: : I: : I ::| M : I I III 

Db 429 VDGTLILSCVATGSPAPTILWRKDGVLVSTQDSRIKQL-ESGVLQIR-YAKLGDTGRYTC 486 

Qy 475 VASNRNGKSSWSGYLRLD TPTNPNIKFFRAPELSTYPGPPGKPQMVEKGEN 525 

II :|:::|| I: : ||:||: I I ||:: : :| 

Db 487 T AST PSGEATWSAY IEVQEFG VPVQPPRPTDPNL IPSAPSKPEVTDVSKN 536 

Qy 526 SVTLSWTRSNKVGGSSLVGYVIEMFGKNETDGWVAVGTRVQNTTFTQTGLLPGVNYFFLI 585

Mllll : I h: hll I I |: II II I I II; 
Db 537 TVTLSW-QPNLNSGATPTSYIIEAFSHASGSSWQTAAENVKTETFAIKGLKPNAIYLFLV 595 

Qy 586 RAENSHGLSLPSPMSEPI-TVGTRYFNSGLDLSEARASLLSGDW-ELSNASWDSTSMK 643 

II I::|:| II :|:|: I : |:| : : I hll I I ::: hh: 

Db 596 RAANAYGISDPSQISDPVKTQDVPPTSQGVDHKQVQREL--GNWLHLHNPTILSSSSVE 653 

Qy 644 LTWQI - INGKYVEGFYVYARQLPNPIVNNPAPVTSNTNPLLGSTSTSASASASASALIST 702 
: I : :|::|: : I 

Db 654 VHWTVDQQSQYIQGYKILYR 673
Qy 703 KPNIAAAGKRDGETNQSGGGAPTPLNTKYRMLTILNGGGASSCTITGLVQYTLYEFFIVP 762

; II: || || :| I I: II I 

Db 674 PSGASHGESEWLVFEVRTP--TK NSWIPDLRKGVNYEIKARP 714 

Qy 763 FYKSVEGKPSNSRIARTLEDVPSEAPYGMEALLL--NSSAVFLKWKAPELKDRHGVLLNY 820

I: :l I : hill: III: I :|: : h I ::|:: I 

Db 715 FFNEFQGADSEIKFAKTLEEAPSAPPRSVTVSKNDGNGTAILVTWQPPPEDTQNGMVQEY 774 

Qy 821 HVIVRGIDTAHNFSRILTNVTIDAASPTLVLANLTEGVMYTVGVAAGNNAGVGPYCVPAT 880 

I I ;l ;; I hi :: ::h ;l h hi III II I I 

Db 775 KVWCLGNETKYHI NKTVDGSTFSWIPSLVPGIRYSVEVAASTGAGPGVKSEPQF 829 

Qy 881 LRLDPITKRLDP--FINQRDHVNDVLTQPWFIILLGAILAVLMLSFGAMVFVKRKHMMMK 938

::|| : I :: ::||: II II :|| :::: I :: :| : 
Db 830 IQLDSHG NPVSPEDQVSLAQQ I SDWRQP AF I AG I G AACWI ILMVFS I WLY — RHRKKR 886 

Qy 939 QSALNTMRGNHTSDVLKMPSLS ARNGNGYWLDSSTGGMVWRPSPGGDSLEMQK 991

:| I : hll : I I 1:11 II I : 

Db 887 NGLTSTYAG IRKVPSFTFTPTVTYQRGGEAV— -SSGG— RPGLLNISEPATQ 934 

Qy 992 DHIADYAPVCGAPGSPAGGGTSSGGSGGAGSG-- ASGGDDIHGGHGSERNQQRYV— 1044 

; hi ; I; : I I : hi : 

Db 935 PWLADTWPNTGNNHNDCSINCCTAGNGNSDSNLTTYSRPADCIANYNNQLDNKQTNLMLP 994 

Qy 1045 GEYSNIPTDYAEVSSFGKAPSEYGRHGNAS - -PAPYATSSILSPH 1087 

h :: h :| : II .11 I lllh :: : 
Db 995 ESTVYGDV-DLSNKINEMKTFNSPNLKDGRFVNPSGQPTPYATTQLIQANLSNNMNNGAG 1053 

Qy 1088 -QQQQQQQPRYQQRPVPGYGLQRPMHPHYQQQQHQQQQAQQTHQQHQALQQHQQLPPSNI 1146
W :: :l lh I 'h : : :: ::| : :lh 

Db 1054 DSSEKHWKPPGQQK PEV AP I Q YNIMEQNKLNKDYRA — NDT I PPT I P 1098 

Qy 1147 YQQMSTTSEIYPTNTGPSRSVYSEQYYYPKDKQRHIHITENKLSNCHTYEAAPGAKQSSP 1206 

II I III I I h :: I M II 

Db 1099 YNQS YDQNTGGS YNSSDRGSSTSGSQGHKKGARTPKA—PKQGGM 1141 

Qy 1207 ISSQFASVRRQQLPPNCSIGRESARFKVLNTDQGKNQQNLLDLDGSSMCY 1256 

: ||: : | : :: |: :|: : : I 

Db 1142 NWADLLPPPPAHPPPHSN — SEEY N - MSVDESYDQEMPCPVPPAPMYLQQDELQEEED 1196 

Qy 1257 - NGLADSGCGGSPSPMAMLMSHEDEHALY HTADGDLDDMERLYVKVDEQQPPQQQQQLI P 1315 

I hill: II: I I h:| I 

Db 1197 ERGPTPPVRGAASSPAAVSYSHQSTATL TPSPQEELQP 1234 

Qy 1316 LVPQHPAE-GHL 1326 

:: | : ||: 
Db 1235 MLQDCPEDLGHM 1246



RESULT 2
T14160

transmembrane receptor protein Robo1 - rat
C;Species: Rattus norvegicus (Norway rat)
C;Date: 20-Sep-1999 #sequence_revision 20-Sep-1999 #text_change 20-Sep-1999
C;Accession: T14160
R;Kidd, T.; Brose, K.; Mitchell, K.J.; Fetter, R.D.; Tessier-Lavigne, M.; Goodman, C.
Cell 92, 205-215, 1998
A;Title: Roundabout controls axon crossing of the CNS midline and defines a novel sub
A;Reference number: Z17897; MUID:98117249
A;Accession: T14160
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: mRNA
A;Residues: 1-1651 <KID>
A;Cross-references: EMBL:AF041082; NID:g2811215; PID:g2811216; PIDN:AAC39960.1
C;Function:
A;Description: appears to function as the gatekeeper controlling midline crossing
C;Keywords: transmembrane protein



Query Match 20.0%; Score 1455.5; DB 2; Length 1651;
Best Local Similarity 28.8%; Pred. No. 2e-70; 

Matches 406; Conservative 229; Mismatches 525; Indels 251; Gaps 45; 
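
The "Query Match" percentage reported for each hit appears to be the hit's score expressed as a fraction of the "Perfect score" given in the search header (7272 for this query). The report does not define the column, so the formula below is an assumption checked only against the printed values; the function name is hypothetical.

```python
def query_match_percent(score, perfect_score):
    """Score of a hit as a percentage of the query's perfect score.

    Assumption: 'Perfect score' is the score the query would obtain
    when aligned against itself under the same scoring table.
    """
    return round(100.0 * score / perfect_score, 1)


PERFECT_SCORE = 7272  # "Perfect score" from the header of this PIR search

# Scores copied from the statistics lines of the first three PIR results.
for entry_id, score in [("T30805", 1489.5), ("T14160", 1455.5), ("T42405", 1361.5)]:
    print(entry_id, query_match_percent(score, PERFECT_SCORE))
```

Run as written, this prints 20.5, 20.0 and 18.7 for the three entries, matching the "Query Match" values in the report.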

Qy 4 PRIIEHPMDTTVPKNDPFTFNCQAEGNPTPTIQWFKDGRELKTDTG---SHRIMLPAGGL 60

llhlll I I I :l I Ihlll llllhhl I ::ll llh:lhl I 
Db 68 PRIVEHPSDLIVSKGEPATLNCKAEGRPTPTIEWYKGGERVETDKDDPRSHRMLLPSGSL 127 

Qy 61 FFLKVIHSRR-ESDAGTYWCEAKNEFGVARSRNATLQVAVLRDEFRLEPANTRVAQGEVA 119

llh:;| h I I I I hi I I 1 1 : ] : 1 1 : 1 1 1 : 1 1 h: II II I 
Db 128 FFLRIVHGRKSRPDEGVYICVARNYLGEAVSHNASLEVAILRDDFRQNPSDVMVAVGEPA 187 

Qy 120 LMECGAPRGSPEPQISWRKNGQTLNLVGNKRIRI-VDGGNLAIQEARQSDDGRYQCWKN 178

;lll III ill llhhl h ;l 11:1111 hll hi II I 
Db 188 VMECQPPRGHPEPTISWKRDGSPLD— DKDERITIRGGKLMITYTRKSDAGKYVCVGTN 244 

Qy 179 WGTRESATAFLKVHVRPFLIRGPQNQTAWGSSWFQCRIGGDPLPDVLWRRTASGGNM 238

;M III I : I II :: I I II hi llhl lh I : 
Db 245 MVGERESKVADVTVLERPSFVRRPSNLAVTVDDSAEFKCEARGDPVPTFGWRR--DDGEL 3C2 

Qy 239 PLRKFSWLHSASGRVHVLEDRSLKLDDVTLEDMGEYTCEADNAVGGITATGILTVHAPPK 298

I ' : I : : I : 1 1 : 1 1 1 1 1 1 1 1 I : I 1 1 h 1 1 II I 

Db 303 P KSRYEIRDDHTLKIRKVTAGDMGSYTCVAENMVGKAEASATLTVQEPPH 352 

Qy 299 FVIRPKNQLVEIGDEVLFECQANGHPRPTLYWSVEGNSSLLL---PGYRDGRMEVTLTPE 355

!;;|::|;| ; I |:|:| |:hl ::| lh ;l I I h I : 
Db 353 FWKPRDQWALGRTVTFQCEATGNPQPAIFWRREGSQNLLFSYQPPQSSSRFSVSQTGD 412 

Qy 356 GRSVLSIARFAREDSGKWTCNALNAVGSVSSRTWSV-DTQFELPPPIIEQGPVNQTLP 414

:: M I : I II II: :; ; I I : |||:| |||||||: 
Db 413 LTVTNVQRSDVGYYI-CQTLNVAGSIITKAYLEVTDVIADRPPPVIRQGPVNQTVA 467 

Qy 415 VKSIWLPCRTLGTPVPQVSWYLDGIPIDVQEHERRNLSDAGALTISDLQRHEDEGLYTC 474 

I : M h||| : | ||: : |: : I ::| I I : I I III 

Db 468 VDGTLTLSCVATGSPVPTILWRKDGVLVSTQDSRIRQL-ESGVLQIR-YAKLGDTGRYTC 525 

Qy 475 VASNRNGKSSWSGYLRLD TPTNPNIRFFRAPELSTYPGPPGRPQMVERGEN 525 

II : I : : : r I h : 1 1 : 1 1 : I I lh: : :| 

Db 526 TASTPSGEATWSAYIEVQEFGVPVQPPRPTDPNL IPSAPSKPEVTDVSKN 575 

Qy 526 SVTLSWTRSNKVGGSSLVGYVIEMFGKNETDGWVAVGTRVQNTTFTQTGLLPGVNYFFLI 585

;|l I : I h: hll I I I I; II II I I lh 
Db 576 TVTLLW-QPNLNSGATPTSYIIEAFSHASGSSWQTVAENVKTETFAIKGLKPNAIYLFLV 634 

Qy 586 RAENSHGLSLPSPMSEPI-TVGTRYFNSGLDLSEARASLLSGDW-ELSNASWDSTSMK 643 

II I : : I : I II :hh I hi : : I hll I I ::: |:|:: 

Db 635 RAANAYGISDPSQISDPVRTQDVPPTTQGVDHKQVQREL--GNWLHLHNPTILSSSSVE 692 

Qy 644 LTWQI-INGKYVEGFYVYARQLPNPIVNNPAPVTSNTNPLLGSTSTSASASASASALIST 702 

: I : :h:h : I 
Db 693 VHWTVDQQSQYIQGYKILYR 712 

Qy 703 KPNIAAAGKRDGETNQSGGGAPTPLNTKYRMLTILNGGGASSCTITGLVQYTLYEFFIVP 762

; I: IMI :| I I : II I 

Db 713 PSGASHGESEWLVFEVRTP--TR NSWIPDLRKGVNYEIKARP 753 
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Qy 763 FYKSVEGKPSNSRIARTLEDVPSEAPYGMEALLL--NSSAVFLKWKAPELKDRHGVLLNY 820 

I: :! I : hill: III: I :|: : |: I ::|:: I 
Db 754 FFNEFQGADSEIKFAKTLEERPSAPPRSVTVSKNDGNGTAILVTWQPPPEDTQNGMVQEY 813 

Qy 821 HVIVRGIDTAHNFSRILTNVTIDAASPTLVLANLTEGVMYTVGVAAGNNAGVGPYCVPAT 880 

I I :| :: I |:| :: ::|: I h |:| III II I I 
Db 814 KVWCLGNETRYHI NKTVDGSTFSVVIPFLVPGIRYSVEVAASTGAGPGVKSEPQF 868 

Qy 881 LRLDPITKRLDP--FINQRDHVNDVLTQPWFIILLGAILAVLMLSFGAMVFVKRKHMMMK 938

::|| : I :: ::||: II II :|| :::: I :: II I 
Db 869 IQLDSHGNPVSPEDQVSLAQQISDWKQPAFIAGIGAACWIILMVFSIWLYRHRK K 924 

Qy 939 QSALNTMRGNHTSDVLKMPSLS ARNGNGYWLDSSTGGMVWRPSPGGDSLEMQK 991 

:: I:: : : 1 = 11 : I I IMI II I : 

Db 925 RNGLSST — YAGIRKVPSFTFTPTVTYQRGGEAV — SSGG — RPGLLNISEPATQ 973 



Qy 992 DHIADYAPVCGAPGSPAGGGTSSGGSGGAGSG----ASGGDDIHGGHGSERNQQRYV--- 1044

:|| I I : • : :l : I : I I : hi : 
Db 974 PWLADTWPNTGNSHNDCSINCCTASNGNSDSNLTTYSRPADCIANYNNQLDNKQTNLMLP 1033

Qy 1045 GEYSNIPTDYAEVSSFGKAPSEYGRHGNAS-PAPYATSSILSPHQQQQQQQPRY 1097

I: :: I: :| : II I I I Mil: :: : 
Db 1034 ESTVYGDV-DLSNKINEMKTFNSPNLKDGRFVNPSGQPTPYATTQLIQANLINNMNN — 1089



Qy 1098 QQRPVPGYGLQRPMHPHYQQQQHQQQQAQQTHQQHQALQQHQQLPPSNIYQQMSTTSEIY 1157

II- ::| : || :| | I: : : : : I 
Db 1090 GGG DSSEKHWKPPGQQ--KQEVAPIQYNIMEQNKLNKDYRANDTIL 1133 

Qy 1158 PTNTGPSRSVYSEQYYYPKDKQRHIHITENKLSNCHTYEAAPGAKQSSPISSQFASVRRQ 1217 

• II II hi: I 

Db 1134 PT IPYN HSYDQNTGGSYNS- -SDRGSSTSGS 1162 

Qy 1218 QLPPNCSIG-RESARFKVLNTDQGKNQQNLLDLDGSSMCYNGLADSGCGGSPSPMAMLMS 1276 

I I :: II I I :|| III 
Db 1163 Q GHKKGARTPKAPKQGGMNWADLL PPPPAHPPP 1195 

Qy 1277 HEDEHALYHTADGDLDD MERLYVKVDEQQPPQQQQQLIPLVPQHPAEGHLQSW 1329 

I : : I I hi:: I : : :: | ! : |: 

Db 1196 HSNSEEYSMSVDESYDQEMPCPVPPARMYLQQDELEEEEAERGPTPPVRGAASSPAAVSY 1255 

Qy 1330 RNQSTRSSRKNGQECIKE PSELIYAP 1355 

:l!l : : II :: I :| : I 
Db 1256 SHQSTATLTPSPQEELQPMLQDCPEDLGHMP 1286 

RESULT 3
T42405

sax-3 protein - Caenorhabditis elegans
C;Species: Caenorhabditis elegans
C;Date: 03-Dec-1999 #sequence_revision 03-Dec-1999 #text_change 21-M-2Q0O
C;Accession: T42405
R;Zallen, J.A.; Yi, B.A.; Bargmann, C.I.
Cell 92, 217-227, 1998
A;Title: The conserved immunoglobulin superfamily member SAX-3/Robo directs multiple asf
A;Reference number: Z22160; MUID:98117250
A;Accession: T42405
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: mRNA
A;Residues: 1-1273 <ZAL>
A;Cross-references: EMBL:AF041053; NID:g2804779; PIDN:AAC38848.1; PID:
C;Genetics:
A;Note: sax-3
C;Function:
A;Description: sax-3 function is required at the time of axon guidance



Query Match 18.7%; Score 1361.5; DB 2; Length 1273; 

Best Local Similarity 28.4%; Pred. No. 1.6e-65; 

Matches 357; Conservative 202; Mismatches 437; Indels 259; Gaps 37; 



Qy 4 PRIIEHPMDTTVPKNDPFTFNCQAEGNPTPTIQWFKDGREL---KTDTGSHRIMLPAGGL 60

I llllhl I : I I III: : I I hllh : I lllhl I I 



Db 31 PVIIEHPIDVWSRGSPATLNCGAKPS-TAKITWYKDGQPVITNKEQVNSHRIVLDTGSL 89 

Qy 61 FFLKVIHSR-RESDAGTYWCEAKNEFGVARSRNATLQVAVLRDEFRLEPANTRVAQGEV 118 

I III : :':IHI hi I II I :l :|::hlh:lh I : lh 
Db 90 FLLKVNSGKNGKDSDAGAYYCVASNEHGEVKSNEGSLKLAMLREDFRVRPRTVQALGGEM 149 



Qy 119 ALMECGAPRGSPEPQISWRKNGQTLNLVGNKRIRIVDGGNLAIQEARQSDDGRYQCWRN 178

|::|| 111:111 Mill: : I : I : III I :|| I llll I 
Db 150 AVLECSPPRGFPEPWSWRKDDKELRIQDMPRYTLHSDGNLIIDPVDRSDSGTYQCVANN 209

Qy 179 WGTRESATAFLKVHVRPFLIRGPQNQTAWGSSWFQCRIGGDPLPDVLWRRTASGGNM 238

Ml I I I I M : h: I lh:|:| lh III I : hi I 
Db 210 MVGERVSNPARLSVFEKPKFEQEPKDMTVDVGAAVLFDCRVTGDPQPQITWKR--KNEPM 267

Qy 239 PLRKFSWLHSASGRVHVLED-RSLKLDDVTLEDMGEYTCEADNAVGGITATGILTVHAPP 297

h :: :| I I::: I 1 1 1 I I I I : I : I I 1 1 1 

Db 268 PVT RAYIAKDNRGLRIERVQPSDEGEYVCYARNPAGTLEASAHLRVQAPP 317



Qy 298 KFVIRPKNQLVEIGDEVLFECQANGHPRPTLYWSVEGNSSLLLPGY--RDGRMEVTLTPE 355 

I :l :l J" I III I I I Ml II llll III :h I 
Db 318 SFQTKPADQSVPAGGTATFECTLVGQPSPAYFWSKEGQQDLLFPSYVSADGRTKVSPI- 375 



Qy 356 GRSVLSIARFAREDSGKWTCNALNAVGSVSSRTWSVDTQF ELPPPIIEQGPV 409 

hi ': I I I I :h II h : I h III II I 

Db 376 "GTLTIEEVRQVDEGAYV-CAGMNSAGSSLSKAALKVTTKAVTGNTPAKPPPTIERGHQ 432 



Qy 410 NQTLPVKSIWLPCRTLGTPVPQVSWYLDGIPIDVQEHERRNLSDAGALTISDLQRHEDE 469
llll I I :llh I II Ml Ihllh : I : hi hlh: I 
Db 433 NQTLMVGSSAILPCQASGKPTPGISWLRDGLPIDITD-SRISQHSTGSLHIADLKK-PDT 490 



Qy 470 GLYTCVASNRNGKSSWSGYLRLDTPTNPNIKFFRAPELSTYPGPPGKPQMVEKGENSVTL 529 

1:111:1 I 'hhll I : : h I : I I h I : I I : I : I : I I 
Db 491 GVYTCIAKNEDGESTWSASLTVEDHTS-NAQFVRMPDPSNFPSSPTQPIIVNVTDTEVEL 549 



Qy 530 SWTRSNKVGGSSLVGYVIEMFGKNETDGWVAVGTRVQNTTFTQTGLLPGVNYFFLIRAEN 589 

I : I . : Ihh : : I : I :| : II I :| hlllll 
Db 550 HWNAPSTSGAGPITGYIIQYYSPDLGQTWFNIPDYVASTEYRIKGLKPSHSYMFVIRAEN 609 



Qy 590 SHGLSLPSPMSEPITVGTRYFNSGL DLSEARASLLSGDWELSNASWDST SMK 643 

I: II I M I h: I I I :::| ::||::: 

Db 610 EKGIGTPSVSSALVTTSKPAAQVALSDKNKMDMAIAEKRLTSEQLIKLEEVKTINSTAVR 669 



Qy 644 LTWQIIN-GRYVEGFYVYARQLPNPIVNNPAPVTSNTNPLLGSTSTSASASASASALIST 702 

I I: : ::hh I I :| I : II I 
Db 670 LFWKKRKLEELIDGYY IKWR GPPRTNDNQYVNVTSPS 706 



Qy 703 KPNIAAAGKRDGETNQSGGGAPTPLNTKYRMLTILNGGGASSCTITGLVQYTLYEFFIVP 762 

/ • : :: h :l IHhM 

Db 707 " TENYWSNLMPFTNYEFFVIP 727 



Qy 763 FYK---SVEGKPSNSRIARTLEDVPSEAPYGMEALLLNSSAVFLKWKAPELKDRHGVLLN 819
:: h UNI Mill: :|| : : : Mil: MM 
Db 728 YHSGVHSIHGAPSNSMDVLTAEAPPSLPPEDVRIRMLNLTILRISWKAPKADGINGILKG 787 



Qy 820 YHVIVRG IDT AHNFSRILT NVT IDAASPTLVIANLTEGVMYTVGVAAGNNAGVG PY - -CV 877 

: ::: I l' I M hi : : :: I M h I : III M III 
Db 788 FQIVIVG-QAPNNNR— NITTNERAASVTLFHLVTGMTYKIRVAARSNGGVGVSHGTS 842 

Qy 878 PATLRLDP I T KRL DPFIN- - -QRDHVNDVLIQPWFIILLGAILAVLMLSFGAMV 928 

: I : II : h : II M:: III : :: I 

Db 843 EVIMNQDTLEKHLAAQQENESFLYGLINKSHVP YIVIVAILIIFWIIIAYC 894 

Qy 929 FVK -,-RKHMMMKQSALNTMRGNHTSDVLKMPS LSARNGNGYWL 969 

: : ' ; | : : ::: I h II : h :: I II I 

Db 895 YWRNSRNSDGSDRSFIKINDGSVH-MASNNLWDVAQNPNQNPMYNTAGRMTMNNRNGQAL 953 

Qy 970 DSST -' GGMVWRPSPGGDSLEMQKDHIADYAPVCGAPGSPAGGGTSSGG 1016 

II ? ■ I : II M II : I lh 
Db 954 YSLTPNAQDFENNCDDYSGTMHRPG SEHHYHYAQLTGGPGN 994 

Qy 1017 SGGAGSGASGGDDIHGGHGSERNQQRYVGEYSNIPTDYAEVSSFGKAPSEYGROASPA 1076 

MM II : I: 
Db 995 '- AMSTF YGNQYHDDPS 1009 
Qy 1077 PYATSSILSPHQQ QQQQQPRYQQRPVPGYGLQRPMHP HYQQQQHQQQQA 1125

llll:::: ;|| : :| ||| | | | ;: ; :| 
Db 1010 PYA.TTTLVLSNQQPAWLNDKMLRAPAMPTNPVP PEPPARYADHTAGRRSRSSRA 1063 

Qy 1126 QQ THQQHQALQQHQQLPPSNI-YQQMSTTSEIYPTNTGPSRSVYSEQ 1171

I : I: I ::: I I: :: II I: h 
Db 1064 SDGRGTLNGGLHHRTSGSQRSDSPPHTDVSYVQLHSSD GTGSSKERIGER 1113 



RESULT 4 
T14316 

rig-1 protein - mouse 

C;Species: Mus musculus (house mouse)

C;Date: 20-Sep-1999 #sequence_revision 20-Sep-1999 #text_change 20-Sep-1999
C;Accession: T14316

R;Yuan, S.S.F.; Cox, L.A.; Dasika, G.K.; Lee, E.Y.H.P.

submitted to the EMBL Data Library, April 1998

A;Reference number: Z17975

A;Accession: T14316
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: mRNA
A;Residues: 1-1344 <YUA>

A;Cross-references: EMBL:AF060570; NID:g4206385; PID:g4206386; PIDN:AAD11628.1



Query Match 17.3%; Score 1256; DB 2; Length 1344; 

Best Local Similarity 27.2%; Pred. No. 8.3e-60; 

Matches 378; Conservative 194; Mismatches 462; Indels 356; Gaps 48; 

Qy 4 PRIIEHPMDTTVPKNDPFTFNCQAEGNPTPTIQWFKDGRELKT---DTGSHRIMLPAGGL 60

III: II I : :| I hill I I |:|:|:| : I I :||::||:| I 
Db 42 PRIVEQPPDLWSRGEPATLPCRAEGRPRPNIEWYKNGARVATAREDPRAHRLLLPSGAL 101 

Qy 61 FFLKVIHSRR-ESDAGTYWCEAKNEFGVARSRNATLQVAVLRDEFRLEPANTRVAQGEVA 119

II :::l II 4 I I I |:| II 1 1 1 1 : 1 : 1 1 1 1 1 1 : 1 1 II II II I 
Db 102 FFPRIVHGRRSRPDEGVYTCVARNYLGAAASRNASLEVAVLRDDFRQSPGNVWAVGEPA 161 

Qy 120 LMECGAPRGSPEPQISWRKNGQTLNLVGNKRIRIVDGGNLAIQEARQSDDGRYQCWKNV 179

:IM 1:1 III ::h : I : : II I : :|| I Ml I: 
Db 162 VMECVPPKGHPEPLVTWKKG--KIKLKEEEGRITIRGGKLKMSHTFKSDAGMYMCVASNM 219 

Qy 180 VGTRESATAFLKVHVRPFLIRGPQNQTAWGSSWFQCRIGGDPLPDVLWRRTASGGNMP 239

I III III II :| I II : : I I I : III I:: II: I :l 
Db 220 AGERESGAAELWLERPSFLRRPINQWLADAPVHFLCEVQGDPQPNLHWRK-DDGELP 277 

Qy 240 LRKFSWLHSASGRVHVLEDRSLKLDDVTLEDMGEYTCEADNAVGGITATGILTVHAPPKF 299

:|| : 111:1 |: I I III |:|:|| |:| hi! Ihl 
Db 278 AGRYEIRSDHSLWIDQVSSEDEGTYTCVAENSVGRAEASGSLSVHVPPQF 327 

Qy 300 VIRPKNQLVEIGDEVLFECQANGHPRPTLYWSVEGNSSLLLPGYRDGRMEVTLTPEGRSV 359
W I :|:ll I I I 1:1: |:| I ::| II: III :| I II : 

Db 328 VTKPQNQTVAPGANVSFQCETKGNPPPAIFWQKEGSQVLLFPSQ SLQPMGRLL 380 

Qy 360 LSIARFAREDSGKWTCNALNAVGSVSSRTWSV-DTQFELPPPIIEQGPVNQT 412 

h | | | | |:: ||: :: :: : : III III III 

Db 381 VSPRGQLNITEVKIGDGGYYV-CQAVSVAGSILAKALLEIKGASIDGLPPIILQGPANQT 439 

Qy 413 LPVKSIWLPCRTLGTPVPQVSW YLDG I P IDVQEHERRNLSDAGALT I SDLQRHE 467 

I : I I llll :| I I : I :| I : : II I I I h : 
Db 440 LVLGSSVWLPCRVIGNPQPNIQWKRDERWLQG DDSQFNLMDNGTLHIASIQ-EM 492 

Qy 468 DEGLYTCVASNRNGKSSWSGYLRLDT PTNPNIKFFRAPELSTYPGPPGK 516 

I I 1:111 : l:::|: Ml Ml Mil : 

Db 493 DMGFYSCVAKSSIGEATWNSWLRKQEDWGASPGPATGPSNP PGPPSQ 539 

Qy 517 PQMVEKGENSVTLSWTRSNKVGGSSLVGYVIEMFGKNETDGWVAVGTRVQNTTFTQTGLL 576 

I : I Ihlh : I I:: llll I : Ml I hi M 
Db 540 PIVTEVTANSITLTW-KPNPQSGATATSYVIEAFSQAAGNTWRTVADGVQLETYTISGLQ 598 

Qy 577 PGVNYFFLIRAENSHGLSLPSPMSEPITVGTRYFNSGLDLSEARASL LSGDV 628 

I I IMI : III MMIh I MM |: 

Db 599 PNTIYLFLVRAVGAWGLSEPSPVSEPVQT QDSSLSRPAEDPWKGQRGLAEVA 650 



Qy 629 VELSNASWDSTSMKLTWQI ING ■ -KYVEGFYVYAR QLPNPIVNNPAP 674 

I : :|: ' :::::| ::| : |:|| II Ml : : 

Db 651 VRMQEPTVLGPRTLQVSW- T VDGPVQLVQGFRVSWRIAGLDQGSWTMLDLQS P- -HKQST 707 

Qy 675 VTSNTNPLLGSTSTSASASASASALISTKPNIAAAGKRDGETNQSGGGAPTPLNTKYRML 734 

I I ' III: I I IMI I: 

Db 708 VLRGLPP GAQIQIKVQV QGQEGLGAESPFVTR — 739 

Qy 735 TILNGGGASSCTITGLVQYTLYEFFIVPFYKSVEGRPSNSRIARTLEDVPSEAPYGMEAL 794 

; :| I: II I I: 

Db 740 SIP EEAPSGPPQGVAVA 756 

Qy 795 L--LNSSAVFLKWKAPELKDRHGVLLNYHVIVRGIDIAHNFSRILTNVTIDAASPTLVLA 852 

I Ml : I: I IMI: Ml I II M : :: : 
Db 757 LGGDRNSSVTVSWEPPLPSQRNGVITEYQIWCLG NESRFHLNRSAAGWARSVTFS 811 

Qy 853 NLTEGVMYTVGVAAGNNAGVGPYCVPATLRLD-PITKRLDPFINQ-RDHVNDVLTQPWF 909 

I I :! 'Ill Mh I M I I ::: : : IMI 

Db 812 GLLPGQIYRALVAAATSAGVGVASAPVLVQLPFPPAAEPGPEVSEGLAERLAKVLRKPAF 871 

Qy 910 IILLGAILAVLMLSFGAMVFVKRKHMMMKQSALNTMRGNHTSDVLKMPSLSARNGNGY-- 967

: I |:l I M: ::| :: I: ::|: |::| : I 
Db 872 LAGSSAACGALLLGFCAALYRRQK QRKELS — HYTASFAYTPAVSFPHSEGLSG 923 

Qy 968 - — WLDSSTGGMVWRPSPGGDSLEMQKDHIADYAPVCGAPGSPAGGGT 1012 

. Ill IIM: Ml 
Db 924 SSSRPPMGLGPAAYPWLADS WPHPPRSPSAQEPR GSCCPSNPDPDDRYY 972 

Qy 1013 SSGG SGGAGSGASG GDDIHGGHGSERNQQRYVGEYSNIPTDYA 1055 

: I I: III I::: II : I |: :: 
Db 973 NEAGISLYLAQTARGANASGEGPVYSTIDPVGEELQTFHGG FPQHSSGDPSTWS 1026 

Qy 1056 EVSSFGKAPSEY GRHG NASPAPYATSSILSPHQQQQQ 1092 

: II I: I I | | | : : | ::: 

Db 1027 QY APPEWSEGDSGARGGQGKLLGKPVQMPSLSWPEALPPPPPSCELSCPEGPEEE 1081 

Qy 1093 QQ PRYQQRPVPGYGLQRPMHPHYQQQQH 1120 

: II I II 
Db 1082 LKGSSDLEEWCPPVPEKSHLVGSSSSGACMVAPAPRDTPSPTSSYG 1127 

Qy 1121 QQQQAQQTHQQHQALQQHQQLPPSNI • - YQQMSTTSEI YPTNTGPSRSVYSEQYYYPKDK 1178 

II I I . I h: II : III : I 

Db 1128 QQSTATLTPSPPDPPQ PPTDIPHLHQMPRRVPL GPSSPLSVSQPALSSHD 1177 

Qy 1179 QRHIHITENKLSNCH TYEAAPG AKQSSPISSQFASVRRQ 1217 

M : : : I I Ml : M: I M: 

Db 1178 GRPVGLGAGPVLSYHASPSPVPSTASSAPGRTRQVTGEMTPPLHGHRARIRKRPKALPYR 1237 

Qy 1218 QLPP-1221 

III 

Db 1238 REHSPGDLPP 1247 



RESULT 5 
T29549 

hypothetical protein ZK377.3 - Caenorhabditis elegans 
C;Species: Caenorhabditis elegans

C;Date: 15-Oct-1999 #sequence_revision 15-Oct-1999 #text_change 18-Feb-2000
C;Accession: T29549
R;Nhan, M.; Hawkins, J.

submitted to the EMBL Data Library, February 1997
A;Description: The sequence of C. elegans cosmid ZK377.
A;Reference number: Z20639
A;Accession: T29549

A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: DNA
A;Residues: 1-423 <NHA>

A;Cross-references: EMBL:U88183; PIDN:AAB52658.1; GSPDB:GN00028; CESP:ZK377.3

A;Experimental source: strain Bristol N2; clone ZK377

C;Genetics:

A;Gene: CESP:ZK377.3

A;Map position: X
A;Introns: 24/1; 142/3; 229/3; 284/2; 408/3



Query Match 9.9%; Score 717; DB 2; Length 423;

Best Local Similarity 39.1%; Pred. No. 1.7e-31; 

Matches 155; Conservative 61; Mismatches 154; Indels 26; Gaps 9; 

Qy 4 PRIIEHPMDTTVPKNDPFTFNCQAEGNPTPTIQWFKDGREL---KTDTGSHRIMLPAGGL 60

I llllhl I : I I III: : I I hill: : I lllhl I I 
Db 30 PVIIEHPIDVWSRGSPATLNCGAKPS-TAKITWYKDGQPVITNKEQVNSHRIVLDTGSL 88 

Qy 61 FFLKVIHSR--RESDAGTYWCEAKNEFGVARSRNATLQVAVLRDEFRLEPANTRVAQGEV 118

I III : ::IHI hi I II I :l :h:|:|h:|h I : lh 
Db 89 FLLKVNSGKNGKDSDAGAYYCVASNEHGEVKSNEGSLKIAMLREDFRVRPRTVQALGGEM 148 



Qy 119 ALMECGAPRGSPEPQISWRKNGQTLNLVGNKRIRIVDGGNLAIQEARQSDDGRYQCWKN 178

h:M III III :lllh : I : I : III I :|| I Mil I 
Db 149 AVLECSPPRGFPEPWSWRKDDKELRIQDMPRYTLHSDGNLIIDPVDRSDSGTYQCVANN 208

Qy 179 WGTRESATAFLKVHVRPFLIRGPQNQTAWGSSWFQCRIGGDPLPDVLWRRTASGGNM 238

:!! I I I I I : : h: I ||::|:| lh III I : h I 
Db 209 MVGERVSNPARLSVFEKPRFEQEPRDMTVDVGAAVLFDCRVTGDPQPQITWRR--RNEPM 266

Qy 239 PLRRFSWLHSASGRVHVLED-RSLRLDDVTLEDMGEYTCEADNAVGGITATGILTVHAPP 297

h I :: :| I |::: I I 1 1 1 I I I I : h I I 1 1 1 

Db 267 PVT RAYIARDNRGLRIERVQPSDEGEYVCYARNPAGTLEASAHLRVQAPP 316

Qy 298 KFVIRPKNQLVEIGDEVLFECQANGHPRPTLYWSVEGNSSLLLPGY--RDGRMEVTLTPE 355

I :| :| I I III II I :ll II II I I III :|: I 
Db 317 SFQTKPADQSVPAGGTATFECTLVGQPSPAYFWSKEGQQDLLFPSYVSADGRIKVSPT-- 374



Qy 356 GRSVISIARFAREDSGRWTCNALNAVGSVSSRTW 391 

hi : I I I I :h II |: : 
Db 375 --GTLTIEEVRQVDEGAYV-CAGMNSAGSSLSKAAL 407 



RESULT 6 
T29548 

hypothetical protein ZK377.2 - Caenorhabditis elegans
C;Species: Caenorhabditis elegans

C;Date: 15-Oct-1999 #sequence_revision 15-Oct-1999 #text_change 18-Feb-2000
C;Accession: T29548
R;Nhan, M.; Hawkins, J.

submitted to the EMBL Data Library, February 1997
A;Description: The sequence of C. elegans cosmid ZK377.
A;Reference number: Z20639
A;Accession: T29548
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: DNA
A;Residues: 1-874 <NHA>

A;Cross-references: EMBL:U88183; PIDN:AAB52657.1; GSPDB:GN00028; CESP:ZK377.2

A;Experimental source: strain Bristol N2; clone ZK377

C;Genetics:

A;Gene: CESP:ZK377.2

A;Map position: X

A;Introns: 91/2; 356/1; 452/1; 701/3; 746/3; 850/1



Query Match 9.0%; Score 658; DB 2; Length 874;

Best Local Similarity 22.9%; Pred. No. 7.2e-28; 

Matches 229; Conservative 154; Mismatches 329; Indels 288; Gaps 

Qy 400 PPPIIEQGPVNQTLPVKSIWLPCRTLGTPVPQVSWYLDGIPIDVQEHERRNLSDAGALT 459

III II I INI I I : 1 1 1 : 111:11 1 1 : 1 1 1 : : I : hi 
Db 28 PPPTIEHGHQNQTLMVGSSAILPCQASGKPTPGISWLRDGLPIDITD-SRISQHSTGSLH 86 

Qy 460 ISDLQRHEDEGLYTCVASNRNGKSSWSGYLRLDTPTNPNIRFFRAPELSTYPGPPGKPQM 519 

1 : 1 1 : : I hllhl I : 1 : 1 : 1 1 I :: h I :| I h I :| I :| : 
Db 87 IADLKK-PDTGVYTCIAKNEDGESTWSASLTVEDHTS-NAQFVRMPDPSNFPSSPTQPII 144

Qy 520 VEKGENSVTLSWTRSNKVGGSSLVGYVIEMFGKNETDGWVAVGTRVQNTIFTQTGLLPGV 579 

I : I I I : I : Ihh : : I : I :| : II I 
Db 145 VNVTDTEVELHWNAPSTSGAGPITGYIIQYYSPDLGQTWFNIPDYVASTEYRIRGLKPSH 204 



Qy 580 NYFFLIRAENSHGLSLPSPMSEPITVGTRYFNSGL DLSEARASLLSGDWELSN 633 

:| hlllll. I: II I :| I :: I I :::| 

Db 205 SYMFVIRAENEKGIGTPSVSSALVTTSRPAAQVALSDRNKMDMAIAERRLISEQLIKLEE 264 

Qy 634 ASWDSTSMKLTWQIIN-GKYVEGFYVYARQLPNPIVNNPAPVTSNTNPLLGSTSTSASA 692

::lh::.l h : ::|:|: I I :| I : II I 

Db 265 VRTINSTAVRLFWKKRRLEELIDGYYIKWR GPPRTNDNQYVNVTSPS - - - 311 

Qy 693 SASASALISTRPNIAAAGRRDGETNQSGGGAPTPLNTRYRMLTILNGGGASSCTITGLVQ 752 

Db 312 TENYWSNLMP 322 

Qy 753 YTLYEFFIVPFYR— SVEGKPSNSRIARTLEDVPSEAPYGMEALLLNSSAVFLRWRAPE 809 

: llll::!:: h I llll Mill: :|| : : : lllh 
Db 323 FTNYEFFVIPYHSGVHSIHGAPSNSMDVLTAEAPPSLPPEDVRIRMLNLTTLRISWRAPR 382 

Qy 810 LRDRHGVLLNYHVIVRGIDTAHNFSRILTNVTIDAASPTLVLANLTEGVMYTVGVAAGNN 869 

,:h'l <: ::: I II :| hi : : :: I :| h I : III :| 
Db 383 ADGINGILRGFQIVIVG-QAPNNNR— NITTNERAASVTLFHLVTGMTYRIRVAARSN 437 

Qy 870 AGVGPY- -CVPATLRLDPITKRL DPFIN— QRDHVNDVLTQPWFIILLGAILA 918 

III ' : I : I I : h : II :|:: III 

Db 438 GGVGVSHGTSEVIMNQDTLEKHLAAQQENESFLYGLINRSHVP VIVIVAILI 489 

Qy 919 VLMLSFGAMVFVR RRHMMMKQSALNTMRGNHTSDVLRMPS L 959 

: :: I : : I : : ::: I |: || : I : : 

Db 490 IFWIIIAYCYWRNSRNSDGRDRSFIRINDGSVH-MASNNLWDVAQNPNQNPMYNTAGRM 548 

Qy 960 SARNGNGYWLDSST GGMVWRPSPGGDSLEMQRDHIADYAPVCGAPGS 1006 

: I II I I I I : II :| II : I lh 

Db 549 TMNNRNGQALYSLTPNAQDFFNNCDDYSGTMHRPG SEBHYHYAQLTGGPGN 599 

Qy 1007 PAGGGTSSGGSGGAGSGASGGDDIHGGHGSERNQQRYVGEYSNIPTDYAEVSSFGKAPSE 1066 

: I: I 

Db 600 ; AMSTF 604 

Qy 1067 YGRHGNASPAPYATSSILSPHQQ QQQQQPRYQQRPVPGYGLQRPMHP HY 1115

II : hlllh::: HI : : I III III 
Db 605 YGNQYHDDPSPYATTTLVLSNQQPAWLNDKMLRAPAMPTNPVP PEPPARYADHT 658 

Qy 1116 QQQQHQQQQAQQ THQQHQALQQHQQLPPSNI-YQQMSTTSEIYPTNTGPSRS 1166

:: : :| I : h I ::: I I: :: II h 
Db 659 AGRRSRSSRASDGRGTLNGGLHHRTSGSQRSDSPPHTDVSYVQLHSSD GTGSSRE 713 

Qy 1167 VYSEQY YYPKDKQRH I H ITENKL SN CHTYEAAPGA — KQSSPISS 1209 

I: I ■' II II lh I : hi 

Db 714 RTGERRTPP NRTLMDFIPPPPSNPPPPGGHVYDTATRRQLNRGSTPRED 762 

Qy 1210 QFAS r—VRRQQLPPNCSIGRESARFKVLNTDQGRNQQNLLDLDGSSMCYNGL 1259 

: I • I : I ::| : I I : ::| II I :l 
Db 763 TYDSVSDGAFARVDVNARPTSRNRNLGGRPLRGR-RDDDSQRSSLMMDDDGGSSEADGE 820 



1260 ADSG .— 

I ". MM 
821 NSEGDVPRGG VRRAVPRMGISASTLA 1 



1286
I: I I
■HSCYGT 852



RESULT 7
I58164

BIG-1 protein - rat

C;Species: Rattus norvegicus (Norway rat)

C;Date: 26-Jul-1996 #sequence_revision 26-Jul-1996 #text_change 24-Sep-1999
C;Accession: I58164

R;Yoshihara, Y.; Kawasaki, M.; Tani, A.; Tamada, A.; Nagata, S.; Kagamiyama, H.; Mori
Neuron 13, 415-426, 1994

A;Title: BIG-1: a new TAG-1/F3-related member of the immunoglobulin superfamily with
A;Reference number: I58164; MUID:94338697
A;Accession: I58164

A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: mRNA
A;Residues: 1-1028 <RES>

A;Cross-references: EMBL:U11031; NID:g563132; PIDN:AAA63607.1; PID:g563133

C;Genetics:
A;Gene: BIG-1

C;Superfamily: contactin; fibronectin type III repeat homology; immunoglobulin homology



Query Match 8.3%; Score 602; DB 2; Length 1028; 

Best Local Similarity 22.6%; Pred. No. 9.4e-25;

Matches 260; Conservative 160; Mismatches 423; Indels 306; Gaps 39; 

Qy 2 ENPRIIEHPMDTTVP---KNDPFTFNCQAEGNPTPTIQWFKDGRELKTDTGSHRIMLPAG 58

: I I :: I :: | I 1 1 : 1 lll:| :| :| :: I II I \ 
Db 24 QGPVFVREPSNSIFPVGSEDKRITLNCEARGNPSPHYRWQLNGSDIDTSL-DHRYRLNGG 82



Qy 59 GLFFLKVIHSRRESDAGTYWCEAKNEFGVARSRNATLQVAVLRDEFRLEPANTRVAQGEV 118

I II: I I 1:1 Ijl I I II I II I I : : I :|: 
83 NLI- - -VINPNRNWDTGSYQC| ATNSLGTIVSREAKLQFAYLENFKSRMRSRVSVREGQG 139 

Qy 119 ALMECGAPRGSPEPQISWRKNGQTLNLVGNKRIRIV-DGGNLAIQEARQSDDGRYQCW 176

:: II I I I :| lj: : I I I : hi I : II I Mil 
140 WLLCGPPPHSGELSYAWVFtttEYPSFVEEDSRRFVSQETGHLYIAKVEPSDVGNYICW 198 

Qy 177 KNWGTRESATAFLKVHVRPFLIRG PQNQTAWGSSWFQCRIGG 221

: I I : I J: I I: I ll:| :| I 

Db 199 TSTV TMVLGSPTPLVLRSDGVMGEYEPRIELQFPETLPAARGSTVRLECFALG 253



Qy 222 DPLPDVLWRRTASGGNMP— | - -LRKFSWLHSASGRVHVLEDRSLKLDDVTLEDMGEYT 275 

:|:| : llh II ' Nil: : |:: : II I I 

Db 254 NPVPQINWRRS- • -DGMPFPTRIRLRRFNGV LEIPNFQQEDIGSYE 296 

Qy 276 CEADNAVGGITATGILTVHAPPRFVIRPRNQLVEIGDEVLFECQANGHPRPTLYWSYEGN 335 

I IM I II II :l Ml |: : I : :||:|:| |:|: I h 
Db 297 CIAENSRGRNVARGRLTYYAKPYWVQLLRDVETAVEDSLYWECRASGRPRPSYRWLRNGD 356 

Qy 336 SSLLLPGYRDGRMEVTLTPEGRSVLSIARFAREDSGKWTCNALNAVGSVSSRTWSVDT 395 

: :l : I::: hll III : I I I I : I I : 

Db 357 ALVL EERIQIE NGALT I ANLNVSDSG -MFQCIAENRHGLI YS — SAEL 401 

Qy 396 QFELPPPIIEQGPVNQTLPVR"SIWLPCRTLGTPVPQVSWYLDGIPIDVQEHERRNLS 453 

: I : I: : : |:i |:|:| |: :| :|:: I : |:| I :| 
Db 402 KVLASAPDFSRNPMKRMIQVQVGSLVILDCKPSASP-RALSFWRKGDTV-VREQARISLL 459 

Qy 454 DAGALTISDLQRHEDEGLYTCVASNRNGRSSWSGYLRLDTPT 495 

Mil::: | |:|||i| |: ||:; : | : || 
Db 460 NDGGLRIMNVTK-ADAGIYTCiAENQFGRANGTTQLWTEPIRIILAPSNMDVAVGESII 518 

Qy 496 NPNIRFFR 503 

I : I: 

Db 519 LPCQVQHDPLLDIMFAWYFNGTLTDFRRDGSHFERVGGSSSGDLMIRMQLKHSGKYVCM 578 

Ry 504 APELST-YPGPPGRPQMVERGENSVTLSWTRSKKVGGSSLVGYVIEM" 549 
I II Ml :M : : Ml MM:: 
Db 579 VQTGVDSVSSAAELIVRGSPGPPENVKVDEITDTTAQLSWTEGTD-SHSPVISYAVQART 637

Qy 550 - FGK NET DGWVAVGT RVQ NTTFTQTGLLPGVNYFFLIRAENSHGISLPSPMSEPI 603 

I Mill: II M II I M I I II IM 
Db 638 PF SVGWQNVRIVPEAIDGRTRTATWELNPWVEYEFRWASNRIGGGEPSLPSERV 693 

Qy 604 TVGTRYFNSGLDLSEARASLLSGDWELSNASWDSTSMKLTWQIINGRYVEGFYVYARQ 663 

II : :| : : : :|| 
Db 694 RT EEAAPEVAPSEV — SGGGGSRSELVITW 721 

Qy 664 LPNPIVNNPAPVTSNTNPLLGSTSTSASASASASALISTRPNIAAAGRRDGETNQSGGG- 722 

:| I I hill 

Db 722 DPVP EELQNGGGF 734 

Qy 723 — APT PLNTKYRMLT ILNGGGASSCTI ■ -TGLVQYTLYEFFIVPFYRSVEGRPSNSRI 776 

I II : I:: :| :: II : : II I 

Db 735 GYWAFRPLGVTTWIQTWTSPDNPRYVFRNESIVPFSPYEVRVGVYNNKGEGPFSPVTI 794 

Qy 777 ARTLEDVPSEAPYGMEALLLNSSAVFLKWKAPELRDRHGVLLNYHVIV— RGIDTAHNF 833 

: I: I: II M hll : M I :| II M I : : : 
Db 795 VFSAEEEPTVAPSHISAHSLSSSEIEVSWNTIPWRSSKGRLLGYEVRYWNNGGEEESSSR 854 



834 SRILINVTIDAASPTLVLANLTEGVMYTVGVAAGNNAGVGPYCVPATLRLDPITRRLDPF 893 

:: I I ; Ml I M I I Mil: : :: lh I 
855 VRVAGNQT- - ■ • • -SAVLRGLRSNLAYYTAVRAYNTAGAGPF SATVNATTRRTPP- 903 

894 INQRDHVNDVLTQPWFI I LLG AILAVLMLSF GAMVFVRRRHMMMRQS 940 

Ml :: I ::|:: MM h 

904 • SQPPGNWWNATDTRVLLNWEQVRALENESEVTGY KVFYRTS — SQN 948 

941 ALNTMRGNHTSDVLRMP SLSARNGHGYWLDSSTGGMVWRPSPGGDSLEMQRDH 993 

: : I li I :| M I I :: : I 

949 NVQVLNTNKTSAELLLPIKEDYIIEVKATTDGG- ■ -DGTSSEQIRIPRITSMDARGSTSA 1005 

994 IADYAPVCG 1002 

hi II M- 
1006 ISDIHPVSG 1014 



RESULT 8
I51669

tumor suppressor - African clawed frog

C;Species: Xenopus laevis (African clawed frog)

C;Date: 13-Sep-1996 #sequence_revision 13-Sep-1996 #text_change 21-Jul-2000
C;Accession: I51669

R;Pierceall, W.E.; Reale, M.A.; Candia, A.F.; Wright, C.V.; Cho, K.R.; Fearon, E.R.
Dev. Biol. 166, 654-665, 1994

A;Title: Expression of a homologue of the deleted in colorectal cancer (DCC) gene in
A;Reference number: I51668; MUID:95113183
A;Accession: I51669

A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: mRNA
A;Residues: 1-1427 <PIE>

A;Cross-references: EMBL:U10986; NID:g606873; PIDN:AAA70168.1; PID:g606874

C;Genetics: 

A;Gene: XDCCa 



Query Match 8.2%; Score 595.5; DB 2; Length 1427; 

Best Local Similarity 22.3%; Pred. No. 3.3e-24;

Matches 316; Conservative 171; Mismatches 531; Indels 399; Gaps 65;

Qy 7 IEHPMDTTVPKNDPFTFNCQAEGN-PTPTIQWFRDGRELRTDTGSHRIMLPAGGLFFLRV 65

: I I : llh: I hi III I I Ml I 
Db 43 LSEPSDAVTMRGGNWLNCSAQSDRGAPIIRWKRDGVYLNLVIDERRQQLPSGSFLIQNV 102

Qy 66 IHSR-RESDAGTYWCEAK-NEFGVARSRNATLQVA-VLRDEFRLEPANTRVAQGEVALME 122

Ml M Mil : I II I Ml II M I h lh 
Db 103 VHSRHHRPDEGVYQCEASLDSVGTIVSRTARVLVAGPLRILSQTESVTAFV-GDTALLR 160 

Qy 123 CGAPRGSPEPQISWRRNGQTLNLV-GNRRIRIVDGGNLAIQEARQSDDGRYQCWKNWG 181

I I I lllhll M : IM: :: II I : :| I hh II 

Db 161 CEI-TGEPMPTISWQRNEEDLRVTPGDPRLLVLPSGTLQISRLQTADGGVYRCLAKNPGS 219

Qy 182 TRESATAFLRV HVRPFLIRGPQNQTAWGSSWFQCRIGGDPLPDVLWRRTASG 235

I I:: I : :: I I I: I I :| : I I I M : 
Db 220 ARVGNEAELRILSESGLHRQQVFLQRPSNWAIEGQDAVLECAVSGYPTPTIVWMQ 275

Qy 236 GNMPL----RKFSWLHSASGRVHVLEDRSLKLDDVTLEDMGEYTCEADNAVGGITATGIL 291

h h 1 1 : 1 II :| : =11 :| Mill : : I 

Db 276 GDEPVPIRTRRYS VLGGSNLLISNVTDDDAGAYTCVATYKNENTSFSADL 325 

Qy 292 TVHAPPRFVIRPRNQLVEIGDEVLFECQANGHPRPTLYWSVEGNSSLLLPG YRDGR 347

II IM I I :: III :| I lh h I :::l II 

Db 326 TVMVPPQFLNKPANLYAYESMDIEFECAVSGRPSPTVRWT--RNGEWIPSDYFQIVDG- 382 

Qy 348 MEVTLTPEGRSVLSIARFAREDSGRWTCNALNAVGSVSSRTWSVDTQFELPPPIIEQG 407

III Ml MM::: I :| I : 

Db 383 SNLRILGLVRSDEG - YYQCIAENEAGNIQTY AQLIIPDPAVPSS 425 

Qy 408 PVNQTLPVRSIWLPCRTLGTPVPQVSWYLDGIPID VQEHERRN 451 

: : M Ml : : Ml M II I 

Db 426 SILPSAPRDWPVL VSSRFVRLSW- - • RPPVESRGNIQTYTVYFSRQGVQRERAVN 478 

Qy 452 LSDAGAL--TISDLQRHEDEGLYTCVASNRNGRSSWSGYLRLDTPTNPNIKFFRAPELST 509 
I :| I: :| I : II!
479 TSQPISLQITVGNLTPEETYN-FRWAYNEWGPGE-- 



M III 
■-SSQEVKWTQPELQV 527 



510 YPGPPGRPQMVERGENSVTLSWTRSNKVGGSSLVGYVI • ■ -EMFGKNETDGWVAVGTRVQ 566 

III 1:1 II M I : II : I I I : I 
528 -PGPVENLQWSTAPTSVLISWDPPAYANG-PVQGYRLFCAETFSGREQN IEVD 579 

567 NTTFTOTGLLPGVNYFFLIRAENSHGLSLPSPMSEPITVGTRYFNSGLDLSEARASLLSG 626 

: II I : I I :l I II II I II: :;: 

580 GIVYRLEGLRKFTEYSIRVLAYNRYG— PGVSSEEHTWT LSDVPSAMPQN 628 

627 DWELSNASWDSTSMRLTW QIINGKYVEGFYV 659 

:h:| I MM II " I: : 

629 VSLEVAN SRS IRVSWLPPPPGTQNG "FITGYRIRHRRTTRRGELETLEPNNLWYL 682 



Qy 660 YARQLPNPI VNNPAP VTSNT--NPLLGSTSTSASASASASALIST-- 702

1:1:111 I: I I I I :l I :: 

683 FTGLEKGSQYSFQVAAMTVNGTGPSSDWYTAETPENDLDESQVPDQPSSLHVRPLTTSII 742 

703 KPNIAAAGKRDG ET NQS 719 

III I I II I : 

743 MSWTPPLNPNIWBGYIIGYGVGSPYAETVRVDSKQRYYSIENLEPSSHYVISLKAFNNA 802 



Qy 720 GGGAP * TPL NTKYRM 733 

■ ■ II: 1:1 

Db 803 GEGVPLYESATTRSOTVPDMSTPMLPPVGVQAVALTHDAVRVSWADNSVLKNQKTTEVRF 862 

I Qy 734 LTI--LNGGGAS SCTITGLVQYTLYEFFIVPFYKSVEGKPSN— -SRI 776 

II II I hill 1:111 :: :h h : 

Db 863 YTIRWRTSYSASSKYKSADTTSLSHTVTGLKPNTMYEFSVM — VTKGRRSSTWSMTAH 918 

Qy 777 ARTLEDVPSEAPYGMEALLLNSS--AVFLKWKAPELKDRHGVLLNYHVIVRGIDTAHNFS 834 

I I I I: II ,' : II : I: I : :| :::: :: :| 
Db 919 ATTYETAPTSAPKDWVITRERRPRAVIVSWQPP--IEANGKIIDF-ILFYTLDKNLQLD 975 

Qy 835 R-ILTNVTI DAASKTLVLANLTEGVMYTVGVAAGNNAGVGPYCVPATLRLDPII KRLDPF 893 

I: :| I : :: II I : I I hll I I I I : I 
Db 976 DWIMVTITGDRLTHEILDLNL-DTAYYFRIQARNAKGLGPLSEPITFR-TPKVEHPDRM 1032 

Qy 894 INQRDHVND VLTQP WFIILLGAIL 917 

I : I I :| :: :||| 

Db J033 ANDQGRHGDGG YWSVDTNLIDRSSLNEPPIGQMHPPHGSVTPQRNSNLLVI IWTVG AI ■ 1091 



Qy 918 AVLMLSFGAMVFVKRKHMMMKQSALNTMRGNH--TSDVLKMPSLSARNGNGYWLDSSTGG 975

:|:: M :,l :: I : I: I I |: 



Jjb 1092 TILVWIVAV1CTRRSSAQQRKKRATHSAGKRKGSQKDLRPPDL*- 



-WIHHEEME 1143 



i 

Qy 976 M--VWRPSPGGDS--LEMQKDHIADYAPVCGAPGSPAGGGTSSGGSGGAGSGASGGDDIH 1031

I : Ml I I: : : III: I |: II |:: 

Db 1144 MKNIEKPS-GSDTQGRDSPRQSCQDITPVSHSQSESQLGSKSTSHSG PDADEVG 1196 

t 

Qy 1032 GGHGS-ERN--QQRYVGEYSNIPTDYAEVSSFGRAPSEYGRHGNASPAPYATSS — IL 1084 

: II :l II I II =1 I I I: II 

Db 1197 SNISTLERTLAARRATRAKLMIPMD SQPSSNPPWSAIPVPTLESAQYPGIL 1248 

Qy 1085 S PHQQQQQQQPRYQQRPVP GYGLQRPMHPHYQQQQHQQQQAQQTHQ 1130 

II I:: Mil 1:1 I II II 

Db 1249 PSPTCGYPH PQFTLRPVPFPTLTVDRGFGTSRVTEVPASQQSSVLSHPQPEH- 1300 



Qy 1131 
Db 1301 



1167 

:HI I: I I: I ' 
STSEDAPSRTIPTACV 1316 



RESULT 9
T43027

neural cell adhesion molecule L1 - goldfish

N;Alternate names: E587 antigen

C;Species: Carassius auratus (goldfish)

C;Date: 11-Jan-2000 #sequence_revision 11-Jan-2000 #text_change 0

C;Accession: T43027

R;Giordano, S.; Laessing, D.; Lottspeich, F.; Stuermer, C.A.O.
submitted to the EMBL Data Library, April 1996



A;Description: Molecular cloning of goldfish E587 antigen, a cell adhesion molecule =
A;Reference number: Z22294
A;Accession: T43027

A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: mRNA
A;Residues: 1-1232 <GIO>

A;Cross-references: EMBL:U55211; NID:g1305526; PID:g1305527; PIDN:AAA99159.1

C;Superfamily: neural cell adhesion molecule L1; fibronectin type III repeat homology;

C;Keywords: cell adhesion; membrane protein



Query Match 8.2%; Score 593; DB 2; Length 1232;

Best Local Similarity 23.2%; Pred. No. 3.7e-24;

Matches 296; Conservative 146; Mismatches 495; Indels 340; Gaps 

Qy 4 PRIIEHPMDTTVPKNDPFTFNCQAEGNPTPTIQWFKDGRELKTDTGSHRIMLPAGGLFFL 63 

I I I: 1 I hi hi h :| lll:l I I :| :| I 

Db 15 PTITVQPVSHTAFSLDDVILACEASGDPAPSFRWVRDGREFR RELLSSGTL — 65 

Qy 64 KVIHSRRESD--AGTYWCEAKNEFGVARSRNATL---QVAVLRDEFRLEPANTR-VAQGE 117

: I ' hi I : I I I I I : I I I II :|: 
Db 66 -TAEDREELHPIQGSYRCYVMS-LGTAVSDLAQLITEPIPTLAKEKR---QRTRSFEEGD 120

Qy 118 VALMECGAPRGSPEPQISWRRNGQTLNLVGNRRIRI-VDG GNLAIQEARQSDDGRY 172 

I:: I I: I 1:1 I : :: hi: :ll II : ::|: I 
Db 121 SAVLYCNPPRSSVTPRIHW-MDMHWRHIPLNERVTTSLDGNLYFANLLVNDSRED Y 175 

Qy 173 QC — WRNVyGTRESAI AFLRVHVRP — FLIRGPQNQ TAWGSSWF 215 

I : :h :| : : I I I II I : I :: 

Db 176 TCNAHIINASVILPKER ISISVTPSNSVLRNRRPQLQKPAGSHSSYLVLRGQTLTL 231 

Qy 216 QCRIGGDPLPDVLWRRTASGGNMPLRRFSWLHSASGRVHVLE-DRSLRLDDVTLEDMGEY 274 

:| I I 1:1 I I I II : II I: I I::: h I III 

Db 232 ECI PEGLPTPEVQWERMDS — PL SPARVRWLRYRRWLQIESVSEADDGEY 279 

Qy 275 TCEADNAVGGITATGILTVHAPPKFVIRPKNQLVEIGDEVLFECQANGHPRPTLYWSVEG 334 

II I I: I : : 1 1 I I : II: I |: I Mil I I I : lh 

Db 280 TCT AQNSQGSVKHHYAVT VEAAP Y WT RRPENHLY APGET VRLDCQAEG IPTPNITWSM- - 337 

Qy 335 NSSLLLPGYRDGRMEVTLTPEGRSVLSIARFAREDSGKWTCNALNAVGSVSSRTWSVD 394 

I : : Ml: | :|: : : I Ml:: ill 
Db 338 NGAPIAGTDPDPRRHVS- ■ -SGTLILTDVQIS- ■ -DTAVYHVEATNRHGNILINTHVHV- 390 

Qy 395 TQFELPPPIIEQGPVNQTLPVKSIWLPCRTLGTPVPQVSWYLDGIPIDVQEHERRNLSD 454 

Ml I: : : M III |:| |:| I : : : || 

Db 391 "VELPPQILTEDDLRYEATEGQTVLLQCRTFGSPQPRVDWQITNSGPALANARMSQTSD 448 

Qy 455 AGALTISDLQRHEDEGLYTCVASNRNGKSS 484 

I I III: II :\\\ I I I 

Db 449 -GNLQISDVS-EEDSSMYTCSVSTSNMSISAELWLNRTKIVDPPQDLRVLRGDDAVLQC 506 

Qy 485 WSGYLRLDTPTNPNIRFFRAP ELST-- 509 

I M : I: :| hll 

Db 507 RYTVDHMLRQP^IQWKRDRHKITSSANDDRYTESPDGSLRITDVQMEDSGIYSCEISTRL 566 

Qy 510 v-YPGPPGKPQMVEKGENSVTLSWTRSNKVGGSSLVGYVIEMFGK-NET 555 

II :: III Mill : I : Ml I I 

Db 567 DSVSAIGSIWLDKPGSPHSLELSEKKERSVTLSWMPGAE-NNSPISEYVIERRERQNPG 625 

Qy 556 DG-WVAVGTRVQNTTFTQTGLLPGVNYFFLIRAENSHGLSLPSPMSEPITVGTRYFNSGL 614 

II ' I: I : I I I I :l I hi III II : 

Db 626 KGHWEEYRRVPQDITHLEIHLQPYSTYHFRVRGVNGIGMSEPSPPSESYST 676 

Qy 615 DLSEARASLLSGDWELSNASWDSTSMRLTWQIINGRYVEG — FYVYARQ 663 

I: : ,.:| :l II h :||l : : I : :| II 
Db 677 - -PAAKPDMNPENVTSVS — TDSNSLVITWQELEQRQFNGPGFRYRIYWRQEGDSHWM 730 

Qy 664 — LPNP-IVNNPAP VTSNTNPLLGSTSTSASAS 693 

II II I : || :' : 

Db 731 ESSASNPPFIVEGPGTFIPFQIKVQAVNELGAGPEPDAEIGYSGEDLPLEAPSSVAVSEL 790 



Qy 694 ASASALISTKP--NIAAAG" 

: I: I, : I 



■ ■ KRDGETNQSGGGAPTPLNTR YRMLT I LNGGGA 742 
:: I III II: 






Db 791 NKTTVLVKWSPVSTKSVRGHLLGYKIHVRKKGPRAHSQRGLPMQEPAAERNRVIVANGNK 850 

Qy 743 SSCTITGLVQYTLYEFFIVPFYKSVEGKPSNSRIARTLEDVPSEAPYGMEALLLNS-" S 799 

:: I I : II 1 1 I 1 1 1 1 I 1 1 : I I : 
Db 851 EEMVLSDLHFHSNYTLTVAPFNSKGEGPHSKRRI--TLA-TPEGAPGPLSFLTFESPSET 907 

Qy 800 AVFLKWKAPELKD--RHGVLLNYHVIVRGIDTAHNFSRILTNVTIDAASPTLVLANLTEG 857 

: 1:1 II: : I II |: II : I : |: :| I :| 
Db 908 EITLFWGAPDKPNGVLTGYLLQYYEPVPGSSSTHQMKSL--NLPLDT---EYTLKDLNPQ 962

Qy 858 VMYTVGVAAGNNAGVG - PYCVPATLRLDP 885 

: I : I III I : II 
Db 963 IQYHFSLRALTAAGHGEPVEMEGATMLDGEPPSVINITAGQTSVNISWVPGERPRSFAFS 1022 

Qy 886 — ITKRLDPFINQRDHVN 901 

: I I : : II 

Db 1023 FRYLRRSADGKWRESERVNSSQAFYQLHGLTPGFQYRLEILPGNFTREFETAGPELHELP 1082 

I 

Qy 902 -DVLTQPWFIILLGAILAVLML SFGAMVFVKRKHMMMKQSALNTMRGNHTS 951

' Ml III I: I:: :|:: II II I :: :| I 

Db 1083 SSFVTQGWFIGLISALVLLLLVLLILCYIKKSKGGKYSVKDK----EEDQVNGARTMKDG 1138

Qy 952 DVLKMPSLSARNGNGYWLDSSTGGMVWRPSPGGDSLEMQKDHIADYAPVCGAPGSPAGG- 1010
m : II : I I :| I =111 : I 

Db 1139 QFGQYKSLESDNEKCST SQQSV CESRRSSNDSLADYGDSVDIQFNEDGSF 1188 

Qy 1011 -GTSSGGSGGAGSGASG 1026 

I II I 1:11 
Db 1189 IGQYSGRRDPNGHGSSG 1205 



RESULT 10 
A53449

plasmacytoma-associated neuronal glycoprotein PANG - mouse
C;Species: Mus musculus (house mouse)

C;Date: 25-Aug-1995 #sequence_revision 25-Aug-1995 #text_change 24-Sep-1999
C;Accession: A53449

R;Connelly, M.A.; Grady, R.C.; Mushinski, J.F.; Marcu, K.B.
Proc. Natl. Acad. Sci. U.S.A. 91, 1337-1341, 1994
A;Title: PANG, a gene encoding a neuronal glycoprotein, is ectopically activated by inti
A;Reference number: A53449; MUID:94151325
A;Accession: A53449
A;Status: preliminary
A;Molecule type: mRNA
A;Residues: 1-1028 <CON>

A;Cross-references: GB:L01991; NID:g200056; PIDN:AAA17403.1; PID:g200057

C;Superfamily: contactin; fibronectin type III repeat homology; immunoglobulin homology

C;Keywords: glycoprotein



Query Match 8.0%; Score 581; DB 2; Length 1028;

Best Local Similarity 23.0%; Pred. No. 1.3e-23; 
Matches 252; Conservative 155; Mismatches 395; Indels 294; Gaps 40; 

Qy 2 ENPRIIEHPMDTTVP---KNDPFTFNCQAEGNPTPTIQWFKDGRELKTDTGSHRIMLPAG 58

: I I: I :: I I Ihl 1 1 1 : 1 :| :| :: I II II 

Db 24 QGPVFIKEPSNSIFPVDSEDKRITLNCEARGNPSPHYRWQLNGSDIDTSL-DHRYRLNGG 82 

Qy 59 GLFFLKVIHSRRESDAGTYWCEAKNEFGVARSRNATLQVAVLRDEFRLEPANT-RVAQGE 117 

I II: I I hi I I I I II I II I I : I: :| I :|: 
Db 83 NLI---VINPNRNWDTGSYQCFATNSLGTIVSREAKLQFAYL-ENFKTRMRSTVSVREGQ 138

Qy 118 VALMECGAPRGSPEPQISWRKNGQTLNLVGNKRIRIV-DGGNLAIQEARQSDDGRYQCV 175 

:: II I I I hi : 1:1 ! : I I I I 

Db 139 GWLLCGPPPHSGELSYAWVFN-EYPSFVEEDSRRLVSQETGHLYIAKVEPSDVGNYTCV 197 

Qy 176 V-KNWGTRESATAFLKVHVRPFLIRG PQNQTAWGSSWFQCRI 219 

I I II : I ::| I : I IIM M 

Db 198 VTSTVTNTRVLGSP TPLVLRSDGVMGEYEPRIEVQFPETLPAAKGSTVRLECFA 251 

Qy 220 GGDPLPDVLWRRTASGGNMP LRKFSWLHSASGRVHVLEDRSLKLDDVTLEDMGE 273 

1:1:1 : III: II MM: : I:: : II I 

Db 252 LGNPVPQI NWRRS - - - DGMPFPNK I KLRKFNGM LEIQNFQQEDTGS 294 



Qy 274 YTCEADNAVGGITATGILTVHAPPRFVIRPKNQLVEIGDEVLFECQANGHPRPTLYWSVE 333 

I |:|: \- I I II :| I :: :: : : I : :||:|:| |:|: I 
Db 295 YEGIAENSRGKNVARGRLTYYAKPYWLQLLRDVEIAVEDSLYWECRASGKPKPSYRWLKN 354 

Qy 334 GNSSLLLPGYRDGRMEVTLTPEGRSVLSIARFAREDSGKWTCNALNAVGSVSSRTWSV 393 

I:: : .: |::: |:| III : I I I I : I I 

Db 355 GDALVL- — -EERIQIE NGALTITNLNVTDSG-MFQCIAENRHGLIYS — SA 399 

Qy 394 DTQFELPPPIIEQGPVNQTLPVK-SIWLPCRTLGTPVPQVSWYLDGIPIDVQEHERRN 451 

: : | : I: : : I; |:|:| |: :| :|:: I : |:| I : 
Db 400 ELKWASAPDFSRNPMKRMVQVQVGSLVILDCRPRASP-RALSFWKKG'DMMVREQARVS 457 

Qy 452 LSDAGALTISDLQRHEDEGLYTCVASNRNGKSSWSGYLRLDTPT 495 

: I I I :: : I I III | |: II:: : :| : II 
Db 458 FLNDGGLKIMHVTR-ADAGTYTCTAENQFGKANGTTHLWTEPTRIILAPSNMDVAVGES 516 

Qy 496 NPNIRFFR 503

i 1:1: 

Db 517 VILPCQVQHDPLLDIMFAWYFNGALTDFRRDGSHFEKVGGSSSGDLMIRNIQLKHSGKYV 576 

Qy 504 ■ -APELST- -YPGPPGRPQMVERGENSVTLSWTRSNKVGGSSLVGYVTEM 549 

I II llll :: I : : MM I :: I :: 
Db 577 CMVQTGVDSVSSAAELIVRGSPGPPENVKVDEITDTTAQLSWTEGTD-SHSPVISYAVQA 635 

Qy 550 — FGKNETDGWVAVGT — RVQNTTFTQT--GLLPGVNYFFLIRAENSHGLSLPSPMSE 601 

I : II :l : III I I I I I I I I I llll 
Db 636 RTPF — SVGWQSVRTVPEVIDGKTHTATVVELNPWVEYEFRIVASNRIGGGEPSLPSE 691 

Qy 602 PITVGTRYFNSGLDLSEARASLLSGDWELSNASWDSTSMRLTWQIINGKYVEGFYVYA 661 

: ' II : :| : : : Ml 
Db 692 KVRT - — EEAAPEIAPSEV — SGGGGSRSELVITW 721 

Qy 662 RQLPNPIVNNPAPVTSNTNPLLGSTSTSASASASASALISTRPNIAAAGKRDGETNQSGG 721 

: | I MM 

Db 722 DPVP EELQNGG 732 

Qy 722 G APIPljNTKYRMLTILNGGGASSCTI-TGLVQYTLYEFFIVPFYKSVEGRPSNS 774 

llll: I:: M :: II : : III 

Db 733 GFGYWAFRPLGVTTWIQTWTSPDNPRYVFRNESIVPFSPYEVKVGVYNNKGEGPFSPV 792 

Qy 775 RIARTLEDVPSEAPYGMEALLLNSSAVFLKWKAPELRDRHGVLLNYHVIVRGIDTAHNFS 834 

: I: |: II : I IMI : : I I MM II I 
Db 793 TTVFSAEEEPTVAPSHISAHSLSSSEIEVSWNTIPWRLSNGHLLGYEVRYWNNGGEEESS 852 

Qy 835 RILTNVTIDAASPTLVLANLTEGVMYTVGVAAGNNAGVGPYCVPATLRLDPITRRLDPFI 894 

I I : ■ : II : | I 1 : 1 1 II: : :: lh I 
Db 853 R-—KVKVAGNQTSAVLRGLRSNLAYYTAVRAYNSAGAGPF — SATVNATTRRTPP-- 903 

Qy 895 NQRDHVNDVLTQPWFIILLGAILAVLMLSF GAMVFVRRRHMMMRQSA 941 

Ml :: I :M:: I II : I: 

Db 904 SQPPGNWWNATDTKVLLNWEQVRAMENESEVTGYRVFYRTS— -SQNN 949 

Qy 942 LNTMRGNHTSDVLKMP 957 

:: : III I M 
Db 950 VHVLNTNRISAELLLP 965 



RESULT 11 
T30532
neural cell adhesion molecule L1 homolog - Fugu rubripes
C;Species: Fugu rubripes
C;Date: 02-Sep-2000 #sequence_revision 02-Sep-2000 #text_change 02-Sep-2000
C;Accession: T30532
R;Riboldi Tunnicliffe, G.R.; Platzer, M.; Nyakatura, G.; Elgar, G.S.; Brenner, S.; Ro
submitted to the EMBL Data Library, September 1997
A;Description: Analysis of the genomic loci of Fugu rubripes homologs of the human di
A;Reference number: Z20848
A;Accession: T30532
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: DNA
A;Residues: 1-1277 <RIB>
A;Cross-references: EMBL:AF026198; NID:g3098263; PID:g3098264; PIDN:AAC15580.1
C;Genetics:
A;Introns: 42/1; 47/1; 81/2; 149/1; 190/1; 247/1; 285/2; 347/1; 391/1; 440/1; 477/2; 531/2
A;Note: L1-CAM

Query Match 8.0%; Score 579.5; DB 2; Length 1277;
Best Local Similarity 23.3%; Pred. No. 2.1e-23;
Matches 240; Conservative 139; Mismatches 429; Indels 221; Gaps 35; 
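
The "Best Local Similarity" figure can be cross-checked against the counts on the "Matches" line. The sketch below assumes that the percentage is the number of identical matches divided by the total number of alignment columns (matches + conservative + mismatches + indels). That definition is inferred from the figures printed in this report, not taken from GenCore documentation, and the function name is hypothetical.

```python
def best_local_similarity(matches, conservative, mismatches, indels):
    """Percent identity over all alignment columns (assumed definition)."""
    columns = matches + conservative + mismatches + indels
    return 100.0 * matches / columns


# Counts copied from the RESULT 11 (T30532) statistics line.
pct = best_local_similarity(240, 139, 429, 221)
print(f"{pct:.1f}%")  # prints 23.3%
```

The same arithmetic applied to the other results in this listing (for example 279, 180, 514, 354 for RESULT 13) gives the percentages printed in their headers, which supports the assumed definition.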

Qy 4 PRHEHPMDTTVPKNDPFTFNCQAEGNPTPTIQWFKDGREIKTDIGSH-RIMLPAGGLFF 62 
Mill: !:l MM I III I : :: II I | 
Db 51 PAITTQPESVTVFSVEDLVMRCEASGNPSPTFHWTKDGEEFDPSSDPEMKVTEEAGSSVF 110 

Qy 63 LKVIHS - - RRESDAGTYWCEAKNEFGVARSRNATIflVAVLRDEFRLEPANTRYAQGEVAL 120 
: :: : I I II II I I I I I : : : : : I 
Db 111 YTLSNTMDTLKQYQGKYICYASNELGTAVSNAAVLMIDAPPVQQKEKKVTEKAEAGHSIA 170

Qy 121 MECGAPRGSPEPQISWRKNGQTLNLVGNKRIRIVDGGNLAIQEARQSDDGRYQCWKNV 179
I I: I :| I I I : : I: : III I II:::
Db 171 LSCNPPQSSMQPIIHWMDN-RLRHIRLSDRVMVGKDGNLYFANLLTEDSRNDYTCNIQYL 229

Qy 180 VGTRESATAFLKVHVRPFLI -- RGPQ NQTAWGSSWFQCRIGGDPLPD 226
I : : I I : III I: I :: :| : I I I
Db 230 ATRTILAKEPITLTVNPSNLVPRNRRPQMMRPTGSHSTYHALRGQTLELECIVQGLPTPK 289

Qy 227 VLWRRTASGGNMPLRKFSWLHSASGRVHVLEDRSLKLDDVTLEDMGEYTCEADNAVGGIT 286
Ml MM MM: ::: I III I 1:1 I
Db 290 VSWLR- -KDGEMSESRIS KDMFDRRLQFTNISESDGGEYQCTAENVQGRTF 338

Qy 287 ATGILTVHAPPKFVIRPKNQLVEIGDEVLFECQANGHPRPTLYWSVEG-- NSSLLLPGY 343
I :M I I : I :|l h I MM I ||: hi I ::: I I
Db 339 HTYTVTVEASPYWTNAPVSQLYAPGETVRLDCQADGIPSPTITWTVNGVPLSATSLEP-- 396

Qy 344 RDGRMEVTLTPEGRSVLSIARFAREDSGKWTCNALNAVGSVSSRTWSVDTQFELPPPI 403
:!l I : | : | | | |:: : | >| | |||| I
Db 397 RRSLTESGSLILKDVIFG-- DTAIYQCQASNKHGTILANTNVYV--IELPPQI 445

Qy 404 I EQG PVNQTLPVKS I WLPCRTLGT PVPQVS WYLDG I PIDVQEHERRNLSDAGALT I SDL 463
: I :| I I hi IMI I : : : I II II |:::
Db 446 LTENGNTYTFVEGQKALLECETFGSPKPKVTWESSSISLLIAD-PRVNLLTNGGLEIANV 504

Qy 464 QRHEDEGLYTCVASNRN GRSSWSGYLRLDTP--T 495
1:111:111: I II:: Ml:
Db 505 S-HDDEGIYTCLVQGSNISVNAEVEVLNRTVILSPPQALRLQPGKTAIFTCLYVTDPKLS 563

Qy 496 NPNIKF FRAPEL STY- 510
M:: III I
Db 564 S PLLQWRKNDQKI FESHSDRKYT FDG PGL 1 1 SHVEPGDEGVYTCQ I ITKLDMVEASSTLT 623

Qy 511 -- PGPPGKPQMVEKGENSVTLSWTRSNKVGGSSLVGYVIEMFGKN-ETDGWVAVGTRV 565
I II I: 111:11 : I :: l|:| :: : =11 :
Db 624 LCDRPDPPVHLQVTNAKHRWTLNWTPGDD-NNSPILEYWEFEDQDMKENGWEELKRVA 682

Qy 566 QNTTFTQTGLLPGVNYFFLIRAENSHGLSLPSPMSEPITYGTRYFNSGLDLSEARASLLS 625
: I I ::M : II I I IMI II : I
Db 683 ADKKHVNLPLWPYMS YRFRVI AI NDQGKSDPSKLS DLYKTPADAPD 728

Qy 626 GDWELSHASWDSTSMKLTWQII NG -- KY 653
: :: M I :: :||: : II II
Db 729 SNPEDVRSES ■ TDPDTLVITWEEMDKRNFNGPDFKYLVMWRRVVGSGPDWHEEYT IAPPF 787

Qy 654 -- VEGFYVYARQL PNPIV NNPAPVTSNTNPLLGSTST 688
IM : :: |:||: M III :::|
Db 788 IVTDVQNFSAFEIKVQAVNKKGLGPEPDPIIGYSGEDVPLEAPLNLGVLLENSTTIRVTW 847



Qy 689 -SASASASASALISTKPNIAAAGKRDGETNQSGGGAPTPLNTKYRMLTILNGGGASSCTI 747 

: I: I : I: I I I :: : I :| 
Db 848 SAVDKETVRGHLLGYKIYLTWGHHRNSR AQEPEN IVMVQTGANEEKKSI 896 

Qy 748 TGLVQYTLYEFFIVPFYKSVEGKPSNSRIARTLEDVPSEAPYGMEALLLNSSAVFLKWKA 807 

I I I I: I I II I I I II I I: : I M I 
Db 897 TNLRPYCHYDLAISAFNSKGEGPLSEKTSFMTPEGVPG-PPMSMQMTSPSESEITLHWTP 955 



Qy 808 PELKDRHGVLLNYHVIVRG IDT AHNFSRI LT NVT IDAAS PT - - -LVLANLTEGVMYTVGV 864 

I MM I : M : I :: : ! ill I I I I : 
Db 956 P- -SKPNGILLGYSLQYRKMQSDDNPLQV VDIASPEITHLTLKGLDRHSHYQFLL 1008 

Qy 865 AAGNNAGVG 873 

I II I 
Db 1009 MARTAAGKG 1017 



RESULT 12 
S22383
axonin 1 precursor - chicken
N;Alternate names: neural cell adhesion molecule AxCAM
C;Species: Gallus gallus (chicken)
C;Date: 30-Sep-1993 #sequence_revision 30-Sep-1993 #text_change 21-Jan-2000
C;Accession: S22383; S34107; S69332; S22128
R;Zuellig, R.A.; Rader, C.; Schroeder, A.; Kalousek, M.B.; von Bohlen und Halbach, F.
Eur. J. Biochem. 204, 453-463, 1992
A;Title: The axonally secreted cell adhesion molecule, axonin-1. Primary structure, i
A;Reference number: S22383; MUID:92174898
A;Accession: S22383
A;Molecule type: mRNA
A;Residues: 1-1036 <ZCEl>
A;Cross-references: EMBL:X63101; NID:g62852; PIDN:CAA44815.1; PID:g62853
A;Accession: S34107
A;Molecule type: protein
A;Residues: 29-49;51-80;84-95;100-117;120-128;130-141;143-176;243-254;256-296;303-336
R;Giger, R.J.; Vogt, L.; Zuellig, R.A.; Rader, C.; Henehan-Beatty, A.; Wolfer, D.P.;
Eur. J. Biochem. 227, 617-628, 1995
A;Title: The gene of chicken axonin-1. Complete structure and analysis of the promote
A;Reference number: S69332; MUID:95172044
A;Accession: S69332
A;Status: preliminary; nucleic acid sequence not shown; translation not shown
A;Molecule type: DNA
A;Residues: 1002-1036 <GIG>
A;Cross-references: EMBL:X79607
A;Note: the nucleotide sequence was submitted to the EMBL Data Library, June 1994
C;Superfamily: contactin; fibronectin type III repeat homology; immunoglobulin hom
C;Keywords: cell adhesion
F;1-23/Domain: signal sequence #status predicted <SIG>
F;24-1036/Product: axonin 1 #status predicted <MAT>
F;336-392/Domain: immunoglobulin homology <IMM>

Query Match 7.8%; Score 568; DB 2; Length 1036;
Best Local Similarity 23.1%; Pred. No. 6.4e-23;
Matches 250; Conservative 116; Mismatches 352; Indels 364; Gaps

Qy 4 PRIIEHPMDTTVPK- ■ -NDPFTFNCQAEGNPTPTIQWFKDGRELKTDTGSHRIMLPAGGL 60 

I I I M: : I 1:1 II I :l :| III I I I II I 
Db 32 PVFEEQPAHTLFPEGSAEEKVTLTCRARANPPATYRWKMNGTELKMGPDS-RYRLVAGDL 90 

Qy 61 FFLKVIHSRRESDAGTYWCEAKNEFGVARSRNATLQVAVLRDEFRLE 107 

: :: i MM MM II IM |: II I 
Db 91 VISNPVKAK - - - DAGS YQCVATNARGT WSREAS LRFGFLQ - E FS AEERDPVK I T EGWGV 146 

Qy 108 - : - 107 

Db 147 MFTCSPPPHYPALSYRWLLNEFPNFIPADGRRFVSQTTGNLYIAKTEASDLGNYSCFATS 206 

Qy 108 ' PANTRVAQGEVALMECGAPRGSPEPQ 133 

11:1 I:: M I |:| II 
Db 207 HIDFITKSVFSKFSQLSLAAEDARQYAPSIKAKFPADTYALTGQMVTLECFA-FGNPVPQ 265 

Qy 134 ISWRK-NGQTLNLVGNKRIRIVDGGNLAIQEARQSDDGRYQCWKNWGTRESATAFLK 191 

' I III I'l : :: : Ml M \-\ M I I:: : 
Db 266 IKWRKLDGSQTSKWLSSEPL LHIQNVDFEDEGTYECEAENIKG-RDTYQGRII 317 

Qy 192 VHVRPFLIRGPQNQTAWGSSWFQCRIGGDPLPDVLWRRTASGGNMPLRKFSWLHSASG 251 

:MI : ; : Mil: M I I I I I I II :: 
Db 318 IHAQPDWLDVITDTEADIGSDLRWSCVASGKPRPAVRWLR— -DGQPL ASQN 366 



Qy 252 RVHVLEDRSLKLDDVTLEDMGEYTCEADHAVGGITATGILTVHA-PPRFVIRPKNQLVEI 310 

1:1 I: : III I I I 1:1 I : I: 111 I I I : I :|: 
Db 367 RIEV-SGGELRFSKLVLEDSGMYQCVAENKHGTVyASAELTVQALAPDFRLNPVKRLIPA 425 

Qy 311 --GDEVLFECQAHGHPRPTLYWSVEGNSSLLLPGYRDGRMEVTLTPEGRSVLSIARFARE 368 

:|: II I: |: |: : II I ||:| :| :| :: 
Db 426 ARSGKVIIPCQPRAAPKATVLWT- -KGTELLTNSSR VTITADGTLILQ- -NISKS 476 

Qy 369 DSGKWTCNALNAVGSVSSRTWSVDTQFELPPPIIEQGPVNQTLPVKSIWLPCRTLGT 428 

III MIU : ::M : I I : : I : I I 
Db 477 DEGK-YTCFAENFMGRANSTGILSVRDATK ITLAPSSADINVGENLTLQCHASHD 530 

Qy 429 PVPQV--SWYLDGIPIDVQEHE RRNLSDA-GALTISDLQ-RHEDEGLYTCVASNRN 480 

, I : :| II III: : I I :: :| I I I : I :| Mil I 

Db 531 PTMDLTFTWSLDDFPIDLDKSEGHYRRASVKEAVGDLAIVNAQLKH - - SGRYTCTAQTW 588

Qy 481 GRSSWSGYLRLQJPTNPNIKFFRAPELSTYPGPPGRPQMVERGENSVTLSWTRSNRVGGS 540 

:| I I : I Hill 1 1 1 : 1 I 

Db 589 DSTSESATLTVRGP PGPPGGVWRDIGDTTVQLSWSRGFD -NHS 631 

Qy 541 SLVGYVIE MFGKNETDGWVAVGTRVQN TTFTQTGILPGVNYFFLIRAENSH 591
: I II : I I : I I I hi ::| I : I I
Db 632 PIARYSIEARTLLSNK WKQMRTNPVNIEGNAETAQWNLIPWMDYEFRVLASNIL 686

Qy 592 GLSLPSPMSEPITVGTRYFNSGLDLSEARASLLSGDWELSNASVVDSTSMKLTWQIING 651 

I: II 

Db 687 GVGEPS 692 

Qy 652 KYVEGFYVYARQLPNPIVNNPAPVTSNTNPLLGSTSTSASASASASALISTK- ■ -PNIAA 708 

II I: I II I :l 

Db 693 LP SSKIRTREAAPTVAP 709 

Qy 709 AGKRDGETNQSGGGAP TPLNTRYRMLTILNGGG ASSCTIIGLV 751 

:| lllll III: III- : :| I 

Db 710 SGL GGGGGAPNELIINWTPTLRDYQ NGDGFGY I LSFRKKGTQGWLT ARV 758 

Qy 752 QYTLYEFFIVPFYRSVEGRPSNSRIARTLEDVPSEAPYGMEAIL 795 

11:1 I : : II I : I : I: I II: : I 
Db 759 PHAESLHYVYRNESIGPYTPFEVKIKAYNRKGEGPESLTAIVYSAEEEPKVAPFRVTAKA 818 

Qy 796 LNSSAVFLKWKAPELKDRHGVLLNYHV-IVRGIDTAHNFSRILTNVTIDAASPTLVLANL 854 

: II : : I: I I INI I : : I |: I : :| I I 
Db 819 VLSSEMDVSWEPVEQGDMTGVLLGYEIRYWKDGDKEEAADRVRTAGLVTSAHVT — GL 874 

Qy 855 TEGVMYTVGVAAGNNAGVGPYC VPATLRLDPITKRLDPFINQ 896 

I I I I I II II : II :| : II : I 

Db 875 NPNTKYHVSVRAYNRAGAGPPSPSTNITTTKPPPRRPPGNISWTLTGSTVTIKWDPWAQ 934 



Qy 897 RD 898 
Db 935 AD 936



RESULT 13
A43425
Bravo/Nr-CAM cell adhesion molecule L1 homolog - chicken (fragment)
C;Species: Gallus gallus (chicken)
C;Date: 27-Apr-1993 #sequence_revision 18-Nov-1994 #text_change 16-Jul-1999
C;Accession: A43425
R;Kayyem, J.F.; Roman, J.M.; de la Rosa, E.J.; Schwarz, U.; Dreyer, W.J.
J. Cell Biol. 118, 1259-1270, 1992
A;Title: Bravo/Nr-CAM is closely related to the cell adhesion molecules L1 and Ng-CAM an
A;Reference number: A43425; MUID:92381110
A;Accession: A43425
A;Status: preliminary; not compared with conceptual translation
A;Molecule type: nucleic acid; protein
A;Residues: 1-1259 <KAY>
A;Experimental source: cerebellum
A;Note: sequence extracted from NCBI backbone (NCBIP:112026)
C;Superfamily: neural cell adhesion molecule L1; fibronectin type III repeat homology;
F;237-294/Domain: immunoglobulin homology <IMM>

Query Match 7.8%; Score 566; DB 2; Length 1259;
Best Local Similarity 21.0%; Pred. No. 1.1e-22;

Matches 279; Conservative 180; Mismatches 514; Indels 354; Gaps 51; 

Qy 8 EHPMDTTVPKNDPFTFNCQAEGNPTPIIQWFKDGRELRTDTGSHRIMLPAGGLFFLKVIH 67 

: I I I ?: hl:l I I: I ::| I : I I I : ::: 
Db 22 QSPRDYIVDPRENIVIQCEARGRPPPSFSWTRNGTHFDIDKDAQVTMRPNSGTLWNIMN 81 



68 S-RRESDAGTYWCEARNEFGVARSRNATL--QVAVLRDEFRLEPANTRVAQGEVALMECG 124 

: I: I ! I Mill III: : I : :||| : I :|: :: I 
82 GVRAEAYEGVYQCTARNEFGAAISNNIVIXXSXSPLWTRERLEPNHVR--EGDSLVLNCR 139 

125 APRGSPEPQISWRKNGQTLNLVGNRRIRIVDGGNLAIQEARQSDDGR- -YQCWK- -NW 180 

I I I I ; | | ::|: |:| I :| I I I : : 
140 PPVGLPPPI IFWMDNA- FQRLPQSERVSQGLNGDLY FSNV - Q PEDTREDY ICY ARFNETI 197 

181 GIRESATAFLKV-HVRPFLIRGP QNQTAWGSSWFQCRIGGDPLPDVLWR 230 

:: :H :| I I I : : I : : : : I I I I : I 

198 QQRQRQPISVKVFSTRPVTERPPVLLTPMGSTSNKVELRGNVLLLECIAAGLPTPVIRWI 257 

231 RTASGGNMPLRRFSWLHSASGRVHVLEDRSLRLDDVTLEDMGEYTCEADNAVGGITATGI 290 

: II :l : : ::||: II: I I I I II :| 

258 R-EGGELPANRTFFENF RRTLRIIDVSEADSGNYRCTARNTLGSTHHVIS 306 

291 LTVHAPPKFVIRPKNQLVEIGDEVLFECQANGHPRPTLYWSVEGNSSLLLPGYRDGRMEV 350 

:H I I ::. hi :: h= hllhhh: II : I I :| 
307 VTVKAAPYWITAPRNLVLSPGEDGTLICRANGNPRPSISWLTNGVPIAIAP-EDPSRRV 364 

351 TLTPEGRSVLSIARFAREDSGRWTCNALNAVGSVSSRTWSVDTQFELPPPIIEQGPVN 410 

:| :::' I ■ : I I I 1 1 1 I I : : hi : III: II 
365 — -DGDTIIFSA--VQERSSAVYQCNASNEYGYLLANAFVNVLAE---PPRILT-PAN 413 

411 QTLPV- -RSIWLPCRILGTPVPQVSWYLDGIPIDVQEHERRNLSDAGALTISDLQRHED 468 

: I .! '■:: I hi |:: h h ': Mill:: 
414 RLYQVIADSPALIDCAYFGSPRPEIEWF-RGVRGSILRGNEYVFHDNGTLEIPVAQR-DS 471 

469 EGLYTCVASNPJWRSSWSGYLRLDTPT — NPNIRFFRA PEL-— 507 

I lllll h lh I : II II: II 
472 TGTYTCVARNKLGRTQNEVQLEVRDPTMIIRQPQYRVIQRSAQASFECVIRHDPTLIPTV 531 



532 IWLRDNNELPDDERFLVGRDNLTIMNVTDRDDGTYTCIVNTTLDSVSASAVLTWAAPPT 591 

511 PGPPGRPQMVEKGENSVTLSWTRSNRVGGSSLVGYVIEM-FGRNETDGWVAVG 562 

I II :: : II: III : I : :lll I :| I 
592 PAIIYARPNEPLDIiELTGQLERSIELSWVPGEE-NNSPIINFVIEYEDGLHEPGVWHYQT 650 

563 IRVQNTTFTQTGLLPGVNYFFLIRAENSHGLSLPSPMSEPITVGTRYFNSGLDLSEARAS 622 

: I T I I III I : I I I III II :l : I 
651 EVPGSQTTVQLKLSPYVNYSFRVIAVNEIGRSQPSEPSE QYLTKSANPDE — 700 

623 LLSGDWELSNASWDS - -TSMKLTWQ I INGRYVEG — FYVYARQLPNP IVNNPAPVT 676 

II : I :: :lh : I I : I II I 
701 NPSNVQGIGSEPDNLVITWESLRGFQSNGPGLQYRVSWRQRDVDDEWTSVWA 753 

677 SNTNPLLGSTSTSASASASASAL- - ISTKPNIAAAGRRDGE TN 717 

: : :: II II : I : II I 

754 NVSRYIVSGTPTFVPYEIRVQALNDLGYAPEPSEVIGHSGEDLPMVAPGNVQVHVINSTL 813 

718 QSGGGAPTPL- NTKYRMLTILNGGGASSCTITGLV 751 

I Ik : : ::|| I : : II 

814 AKVHWDPVPLKSVRGHLQGYRVYYWKVQSLSRRSRRHVEKKILTF--RGNRTFGMLPGLE 871 

752 QYTLYEFFIVPFYRSVEGRPSNSRIARTLEDVPSEAPYGMEALLLNSSAVFLKWKAPELK 811 

I: h : II I :: :| I III I :: :: hi :| 

872 PYSSYRLNVRVVNGRGEGPASPDKVFKTPEGVPS-PPSFLRIINPTLDSLTLEWGSP-I 928 

812 DRHGVLLNYHVIVRGIDTAHNFSRILTNVTIDAASPTLVLANLTEGVMY TVG" 863 

:M :l : : h I I : I I :hl II I ill 
929 HPNGVLTSYILKFQPINNTHELGP-LVEIRIPANESSLILNNLNYSTRYKFYNAQTSVGS 987 

864 VAAGNNAGVGPYCVPATLRLDPITKRL DPFIN QRDHVN 901 
I : II: II:: I: h : : I III

Db 988 GSQITEEAVTIMDEAGILRPAVGAG-KVQPLYPRIRNVTTAAAETYAHISWEYEGPDHAN 1046 



Qy 902

Db 1047 FYVEYGVAGSREDWRREIVNGSRSFFVLRGLTPGTAYRVRVGAEGLSGFRSSEDLFETGP 1106

Qy 902 DVLTQPWFIILLGAILAVLMLSFGAMVFVKRKHMMMKQSALNTMRGNHTSDVL 954
I: II III I: I: |:|:| : |::| :|
Db 1107 AMASRQVDIATQfflfFIGLMCAV-ALLILILLIVCPIRRN RGG 1147

Qy 955 RMPSLSARNGNGYWLDSSTGGMVWRPSPGGDSLEMQRDHIADYAPVCGAPGSPAGGGTSS 1014
II I: :: I I: |: :|:
Db 1148 KYPVKEKEDAHA---DPEIQPMKEDDGTFGEYRSLESD-AEDHRPLRKGSRTPSDRTVRK 1203



Qy 1015 GGSGGA — GSdASGGDDIHGGHGSERKQQRYVGEYSNIPTDYAEVSSFGKAPSEYGRH 1070 
I : I (:l : I i = 1 : 1 1 I hi I 

Db 1204 EDSDDSLVDYGEGJNGQFNEDGS FIGQYS GKKEKEPAE'GNE 1244



Qy 1071 GNASPAP 1077
: :|:| 

Db 1245 SSEAPSP 1251 



RESULT 14 
S26180
neurofascin - chicken
C;Species: Gallus gallus (chicken)
C;Date: 13-Jan-1995 #sequence_revision 13-Jan-1995 #text_change 21-Jan-2000
C;Accession: S26180
R;Volkmer, H.; Hassel, B.; Wolff, J.M.; Frank, R.; Rathjen, F.G.
J. Cell Biol. 118, 149-161, 1992
A;Title: Structure of the axonal surface recognition molecule neurofascin and its relati
A;Reference number: S26180; MUID:92317154
A;Accession: S26180
A;Status: preliminary
A;Molecule type: mRNA
A;Residues: 1-1272 <VOB>
A;Cross-references: EMBL:X65224; NID:g63659; PIDN:CAA46330.1; PID:g63660
C;Superfamily: neural cell adhesion molecule L1; fibronectin type III repeat homology;
F;279-336/Domain: immunoglobulin homology <IMM>

Query Match 7.8%; Score 565.5; DB 2; Length 1272;
Best Local Similarity 21.1%; Pred. No. 1.2e-22;
Matches 293; Conservative 160; Mismatches 455; Indels 479; Gaps 55;



Qy 7 IEHPMDTTV-- --PKNDPFTFNCQAEGNPTPTIQWFKDGR-- 42
II 1:1: I::: I l:|:lll II I "I:
Db 26 IEVPLDSNIQSE; 'QPPTITRQSVKDYIVDPRDNIF-IECEAKGNPVPTFSWTRNGKFFN 84



Qy 43 ELKT^SHRIMLPAGGLFFLKVIHSRRESDAGTYWCEAKNEFGVARSRNATL 95 

:: :|: I II hill hh:| I I I 

Db 85 VAKDPKVSMRRRSGTLVIDFHGGG RPDDYEGEYQCFARNDYGTALSSKIHL 135 

Qy 96 QVAVLRDEFRLEPAN TRVAQGEVALMECGAPRGSPEPQISWRRHGQTLNLVGNRRI 151 

II: II I :| ::| I I ! I I I : : :||: 

Db 136 QVS — RSPLWPKEKVDVIEVDEGAPLSLQCNPPPGLPPPVIFWMSSSME-PIHQDKRV 190 

Qy 152 RIVDGG NiAIQEARQSDDGRYQCWK NWGTRESATAFLK 191 

I I: :|:l hi II: I : |: 

Db 191 SQGQNGDLYFSNVMLQDA-QTD— YSCNARFBFTHTIQQKNPYTLKVKTRKPHNETSLR 246 

Qy 192 VHVRPFLIRG PQNQTAWGSSWFQCRIGGDPLPDVLWRRTASGGN 237 

I : II :| : I :: :| M ||::| : I 

Db 247 NHTDMYSARGVTETTPSFMYPYGTSSSQMVLRGVDLLLECIASGVPAPDIMWYK-KGGE 304 

Qy 238 MPLRRFSWLHSASGRVHVLE--DRSLRLDDVTLEDMGEYTCEADNAVGGITATGILTVHA 295 

:| I II :::!:: : 1 : 1 1 1 1 1 I I I : I I I : I I 

Db 305 LPAGRTK LENFNKALRI S NVS EEDSGE YPC LAS NKMGS IRHT I SVEVKA 353 

Qy 296 PPRFVIRPKNQLVEIGDEVLFECQANGHPRPTLYWSVEGNSSLLLPGYRDGRMEVTLTPE 355 
I :: hi :: h: hllhhh: III :| : 



Db 354 APYWLDEPQMILAPGEDGRLVCRANGNPRPSIQWLVNGEP IEGSPPNP 402 

Qy 356 GRSVLS— IARFAREDSGKWTCNALNAVGSVSSRTWSVDTQFELPPPIIEQGPVNQT 412 

I I :. : I I III I I : : III ::|l h I II 
Db 403 SREVAGDTIVFRDTQIGSSAVYQCNASNEHGYLLANAFVSV" ■ -LDVPPRIL- -APRNQL 457 

Qy 413 LPV-KSIWLPCRTLGTPVPQVSWYLDGIPIDVQEHERRNLSDA GALTISD 462 

: I : I' I hhl : h :| : h I hi :| 

Db 458 IRVIQYNRTRLDCPFFGSPIPTLRWFKNG QGNMLDGGNYKAHENGSLEMS- 507 

Qy 463 LQRHEDEGLYTGVASNRNGKSSWSGYLRLDTPTNPNIKFFRAPE 506 

: I Ihhllllhl II 1:11 : I II 
Db 508 MARKEDQGIYTCVATNILGKVEAQVRLEVRDPI— -RIVRGPEDQWKRGSMPRLHCRV 563 



Qy 507 ■ 



--LSTY-- 

I: I 



Db 564 KHDPTLKLTVTWLRDDAPLYIGNRMKREDDGLTIYGVAERDQGDYTCVASTELDKDSARA 623 

Qy 511 -' PGPPGKPQMVERGENSVTLSWTRSNKVGGSSLVGYVIEMFG 551 

II :: : I II |:l : I : |::: 
Db 624 YLTVLAIPANRLRDLPRERPDRPRDLELSDLAERSVKLTWIPGDD-NNSPITDYIVQFEE 682 

Qy 552 KNETDG-WVAVGTRVQNTTFTQTGLLPGVNYFFLIRAENSHGLSLPSPMSEPITVGTRYF 61C 

I I I I 111111:111 llll II II 
Db 683 DRFQPGTWBNBSRYPGNVNSALLSLSPYVNYQFRVIAVNDVGSSLPSMPSE RYQ 736 

Qy 611 NSGLDLSEARASLLSGDWELSNASWDSTSMRLTWQIINGRYVEG- - -FYV- ■ YARQLP 665 

II II : I : :h:|| :| I h : h I 
Db 737 TSG ARPEINPTGV QGAGTQRNNMEITWTPLNATQAYGPNLRYIVRWRRRDP 787 

Qy 666 NPIVNNPAPVTSNTNPLLGSTSTSASASASASALISTRPNIAAAGRRDGETNQSG 720 

I I 'I I II I: : : : I : : II 

Db 788 RGSWYHETVRAPRBWWNT - P I Y VPYEIRVQAENDFGRAPEPETYIGYSG 836 

Qy 721 GGAPTPLNTRY'RMLTILNG 739 

I I I : :M 

Db 837 EDYPRAAPTDVR-IRVLNSTAIALTWTRVHLDTIQGQLKEYRAYFWRDSSLLKNLWVSRK 895 

Qy 740 GGASSCTITGLVQYTLYEFFIVPFYRSVEGRPSNSRIARTLEDVPSEAPYGME 792 

I : :: I I: I: :| :| | : | Mil I : 
Db 896 RQYVSFPGDRNRGIVSRLFPYSNYRLEMWTNGRGDGPRSEVKEFPTPEGVPSSPRY-LR 954 

Qy 793 ALLLNSSAVFLKWKAPELRDRHGVLLNYHVIVR GIDTAHNFSRILTNVTIDAA 845 

I :: hi II :IM h: : I III I h 
Db 955 IRQPNLESINLEWDHPE--HPNGVLTGYNLRYQAFNGSKTGRTLVENFSPNQTRFTVQRT 1012 

Qy 846 SPTLVLANLTEGVMYTVGVAAGNNAGVGPYCV PATLRLDPITKRLDP 892 

I '1:1 III II: I I I I I 

Db 1013 DPI --SRYRFFLRARTQVGDGEVIVEESPALLNEATPTPASTWLPPPTTELTP 1063 

Qy 893 • 892 

Db 1064 AATIATTTTTATPTTETPPTEIPTTAIPTTTTTTTATAASTVASTTTTAERAAAATTKQE 1123 

Qy 893 FINQRDHVNDVLTQPWFIILLGAILAVLMLSFGAMVFVRR RHMMM--RQSALNTMR 946 

::N h II III h II hhl : hll h : I II 
Db 1124 LAYTKNHV-DIATQGWFIGLMCAI-ALLVLILLIVCFIRRSRGGRYPVRDNKDEHLNPED 1181 

Qy 947 GNHTSDVLRMPSLSARNGNGYWLDSSTGGMVWRPSPGGD S LEMQK - - DH I AD YAP 999 

I ill : I :| I ::: h I : II 

Db 1182 KNVEDGSFDYRS LES DEDN RPLPNSQTSLDGTIRQQESDDSLVDY- • 1226 

Qy 1000 VCGAPGSPAGGGTSSGGSGGAGSGASGGDDIHGGHGSERNQQRYVGEYSNIPTDYAEVSS 1059 

: llll I ::hh : I I 

Db 1227 : GEGGEGQFNEDGS FIGQYT-VRRDREETE- 1254 



Qy 1060 FGRAPSE 1066
I II :
Db 1255 -GNESSE 1260



RESULT 15 
I50600
- chicken (fragment)
C;Species: Gallus gallus (chicken)
C;Date: 13-Sep-1996 #sequence_revision 13-Sep-1996 #text_change 21-Jul-2000
C;Accession: I50600
R;Vielmetter, J.; Kayyem, J.F.; Roman, J.M.; Dreyer, W.J.
J. Cell Biol. 127, 2009-2020, 1994
A;Title: Neogenin, an avian cell surface protein expressed during terminal neuronal diff
A;Reference number: A55193; MUID:95105243
A;Accession: I50600
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: mRNA
A;Residues: 1-1443 <VIE>
A;Cross-references: EMBL:U07644; NID:g641965; PIDN:AAC59662.1; PID:g641966

Query Match 7.7%; Score 560; DB 2; Length 1443;
Best Local Similarity 23.7%; Pred. No. 2.7e-22;
Matches 225; Conservative 91; Mismatches 321; Indels 314; Gaps 30;


Qy 10 PMDTTVPKNDPF1 MCQAEGNPTPTIQWFKDGRELKTDTGSHRIMLPAGGLFFLKVIHSR 69 



I hi III I 



Mill |:||: 



25 PMDILSVRGASVI NCSSYCETPPKIEWKKDGTLLNLVSDDRRQLLPDGSLLINSWHSK 8 



! - FGVARSRNATLQVAVLRDEFRLEPANTRVAQGEVALMECGAPR 127 
I I I I I I I II I I III I I :| : I :| I:: 
85 HNKPDEGYYQCVJ^VESLGSIVSRTAKLTVAGL-PRFTSQPELSSVYKGNSAILNCEV-N 142 

128 GSPEPQISWRKNGQTLNLVGNKRIRIVDGGNLAIQEARQSDDGRYQCWKNWGTRESAT 187 

I : I I 1:1 : I: : II I :| I |:||::: : I 
143 VDLAPFVRWEQDRQPLSL—DDRVFRLPSGALLIGNATDTDGGFYRCVIESGGTPRYSEE 200 

188 AFLKVHVRP FLIRGPQNQTAWGSSVVFQCRIGGDPLPDVLWRRTASGGNMPLR 241 

I II:' I :! I : I I UN I II I I I I I :l : 

201 AELKILPDPEEPQSLVFVRQPSSLTKVTGQNAVFPCVAGGFPTPrVRW-TKNGEEL— 255 

242 KFSWLHSASGRVHVLEDRSLKLDDVTLEDMGEYTCEADNAVGGITATGILTVHAPPRFVI 301 

:| I I : II : III IhNll III I I I I Ihh 
256 — ITEDSERFALRAGGSLLISDVTEEDVGTYTCIADNENETIEAQAELAVQVPPEFLK 311 

302 RPKNQivEIGDEVLFECQANGHPRPTLYWSVEGNSSLLLPGYRDGRMEVTLTPEGRSVLS 361 

II I | :::,lll: I I II: I I: :::| 

312 RPANIYAHESMDI|FECEVTGKPTPTVKWVKNGD"VVIP 349 

362 IARFAREDSGKWTCNALNAVGSVSSRTWSVDTQFELPPPIIEQGPVNQTLPVKSIWL 421 

I 1:1 : I :l I I 
350 SDYFKIVKEHNLOVLGLVKS 369 

422 PCRTLGTPVPQVSWYLDGIPIDVQEHERRNLSDAGALTISDLQRHEDEGLY1CVASNRNG 481 

III I 1:1 I I 

370 DEGFYQCIAENDVG 383 

482 KSSWSG * • * YLRLDT • • PTNPNIKFFRAPELST YP • - • GP- PGKPQMVEK ■ • ■ GENSVTL 529 

: III III I MM:! : I 

384 NAQAGAQLIILDLDVAIPTLPPTSLTSATNDHLAPAITGPLPTAPRDWAILVSTRFIRL 443 

530 SW-TRSNKVGGSSL- - -VGYVIEMFGKNETDGWVAVGTRVQNTT - - -FTQT- - -GLLPGV 579 

:| I : 1:1 : I I : Ihlh II hi 

«444 TWRT PVSD PQGDNLT YS I FYTKEG I NRE RVENTSRPGETQVMIQNLMPET 493 

580 NYFFLIRAENSrfGLSLPSPMSEPITVGTRYFNSGLDLSEARASLLSGDWELSNASWDS 639 

I I : hi II I I: I I: 
494 VYVFRWAQNKHG • • • HGESSAPLKVATQ 519 

640 TSMKLTWQIINGKYVEGFYVYARQLPNPIVNNPAPVTSNTNPLLGSTSTSASASASASAL 699 

I I I I 

520 PEVQLPGPA 528 

700 ISTKPNIAAAGKRDGETNQSGGGAPTPLNTKYRMLTILNGGG 741 

III I 1:11 : : I hi I 

529 PNIRAY AGSPTSVT VTWE - - TPLSGNGE IQNYKLYYMEKGQDSEQ 571 

742 ASSCTITGLVQYTLYEFFIVPFYKSVEGKPSNSRIARTLEDVPSEAPYGMEALLL 796 

I Mill :|| I I :| : I I : : III III! II : 



Db 572 DVDVAGLSYTITGLRRYTEYSFRWAYNRHGPGVSTQDWVRTLSDVPSAAPQNLTLEAR 631 

Qy 797 NSSAVFLKWKAPELRDRHGVLLNYHVIVRGIDTAHNFSRILTNVTIDAASPIL--VLANL 854 

II :: I 1:1 I : I : I : "II I :: I 

Db 632 NSRSIMLHWQPPPAGTHSGQITGYRIRYRKVSRK SDVTESVGGTQLFQLIEGL 684 

Qy 855 TEGVMYTVGVAAGNNAGVGPYC VPATLRLDPI 886 

I I :H I II ll::| : |: 

Db 685 ERGTEYNFRIAAMTV1IGTGPATDWVSAETFESDLDESRVPEVPSSLHVRPL 735 



Search completed: January 22, 2001, 12:25:18 
Job time: 1995 sec



Mon Jan 22 13:04:27 2001 us-09-540-245a-16.rpr Page 13 



Mon Jan 22 13:04:28 2001          us-09-540-245a-16.rsp          Page 1

GenCore version 4.5
Copyright (c) 1993 - 2000 Compugen Ltd.

OM protein - protein search, using sw model

Run on:         January 22, 2001, 12:27:18 ; Search time 162.41 Seconds
                (without alignments)
                274.602 Million cell updates/sec

Title:          US-09-540-245A-16
Perfect score:  7272
Sequence:       1 GENPRIIEHPMDTTVPKNDP RSLLSNSGSGTSSQPAGHNV 1381

Scoring table:  BLOSUM62
                Gapop 10.0 , Gapext 0.5

Searched:       88757 seqs, 32294092 residues

Total number of hits satisfying chosen parameters:      88757

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries

Database :      SwissProt_39:*



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution.
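
The throughput figure in the run header can be reproduced from the other header values. The sketch below assumes that one "cell update" is one query-residue by database-residue cell of the Smith-Waterman dynamic-programming matrix, so the total is query length times database residues. This is an interpretation that happens to match the printed figure, not a statement from GenCore documentation, and the function name is hypothetical.

```python
def million_cell_updates_per_sec(query_len, db_residues, seconds):
    """Assumed definition: DP cells filled per second, in millions."""
    cells = query_len * db_residues
    return cells / seconds / 1e6


# Values copied from the run header: query length 1381,
# 32294092 database residues, search time 162.41 seconds.
rate = million_cell_updates_per_sec(1381, 32294092, 162.41)
print(f"{rate:.3f} Million cell updates/sec")  # prints 274.602 Million cell updates/sec
```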



                                 SUMMARIES
                   %
Result           Query
   No.   Score   Match  Length  DB  ID           Description
     1   597.5     8.2    1493      NEO1_MOUSE   P97798 mus musculu
     2   589.5     8.1    1377      NEO1_RAT     P97603 rattus norv
     3   575.5     7.9    1461      NEO1_HUMAN   Q92859 homo sapien
     4   568       7.8    1036      AXO1_CHICK   P28685 gallus gall
     5   560       7.7    1443      NEO1_CHICK   Q90610 gallus gall
     6   555.5     7.6    1284      NRCA_CHICK   P35331 gallus gall
     7   552.5     7.6    1447      DCC_MOUSE    P70211 mus musculu
     8   551.5     7.6    1447      DCC_HUMAN    P43146 homo sapien
     9   544.5     7.5    1040      AXO1_HUMAN   Q02246 homo sapien
    10   540.5     7.4    1040      AXO1_RAT     P22063 rattus norv
    11   538       7.4    2012      DSCA_HUMAN   O60469 homo sapien
    12   536.5     7.4    1018      CONT_HUMAN   Q12860 homo sapien
    13   534       7.3    1239      NRG_DROME    P20241 drosophila
    14   531.5     7.3    1020      CONT_MOUSE   P12960 mus musculu
    15   521       7.2    2029      LAR_DROME    P16621 drosophila
    16   517.5     7.1    1010      CONT_CHICK   P14781 gallus gall
    17   511.5     7.0    1260      CAML_MOUSE   P11627 mus musculu
    18   509       7.0    1257      CAML_HUMAN   P32004 homo sapien
    19   497.5     6.8    1259      CAML_RAT     Q05695 rattus norv
    20   495.5     6.8    1897      PTPF_HUMAN   P10586 homo sapien
    21   487              1266      NGCA_CHICK   Q03696 gallus gall
    22   485.5            1912      PTPD_HUMAN   P23468 homo sapien
    23   457.5            1913      KMLS_HUMAN   Q15746 homo sapien
    24   441              3707      PGBM_MOUSE   Q05793 mus musculu
    25   427.5            4393      PGBM_HUMAN   P98160 homo sapien
    26   412.5            1070      PTK7_HUMAN   Q13308 homo sapien
    27   406              1091      NCA1_CHICK   P13590 gallus gall
    28   403.5             761      NCA2_HUMAN   P13592 homo sapien
    29   398.5             837      NCM2_HUMAN   O15394 homo sapien
    30   396.5             853      NCA1_BOVIN   P31836 bos taurus
    31   389.5             837      NCM2_MOUSE   O35136 mus musculu
    32   389.5            1115      NCA1_MOUSE   P13595 mus musculu
    33   389              1088      NCA1_XENLA   P16170 xenopus lae
    34   388.5     5.3     725   1  NCA2_MOUSE   P13594 mus musculu
    35   388.5     5.3     848   1  NCA1_HUMAN   P13591 homo sapien
    36   388.5     5.3    1092   1  NCA2_XENLA   P36335 xenopus lae
    37   385.5     5.3     858                          rattus norv
    38   380       5.2    1906   1  KMLS_CHICK   P11799 gallus gall
    39   368       5.1    1051   1  PTK7_CHICK   Q91048 gallus gall
    40   356       4.9    1666   1  SKLM_MOUSE   Q62234 mus musculu
    41   355.5     4.9    2481   1  UN52_CAEEL   Q06561 caenorhabdi
    42   346       4.8     811   1  FS22_DROME   P34083 drosophila
    43   339       4.7     873   1  FS21_DROME   P34082 drosophila
    44   323       4.4    1451   1  MYM1_HUMAN   P52179 homo sapien
    45   321       4.4     898   1  FAS2_SCHAM   P22648 schistocerc
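
The "% Query Match" column can be recomputed from the score column. The sketch below assumes that the percentage is the hit score divided by the "Perfect score" from the run header (7272). This reading is inferred from the figures in this report rather than from GenCore documentation, and the names used are hypothetical.

```python
PERFECT_SCORE = 7272  # "Perfect score" value from the run header


def query_match_percent(score, perfect=PERFECT_SCORE):
    """Hit score as a percentage of the perfect score (assumed definition)."""
    return 100.0 * score / perfect


# Scores copied from summary rows 1, 4 and 5.
for score in (597.5, 568, 560):
    print(f"{score}: {query_match_percent(score):.1f}")
```

Under that assumption the three rows give 8.2, 7.8 and 7.7, which agrees with the values printed in the summary.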



RESULT 1
NEO1_MOUSE
ID   NEO1_MOUSE     STANDARD;      PRT;  1493 AA.
AC   P97798;
DT   01-OCT-2000 (Rel. 40, Created)
DT   01-OCT-2000 (Rel. 40, Last sequence update)
DT   01-OCT-2000 (Rel. 40, Last annotation update)
DE   NEOGENIN PRECURSOR.
GN   NEO1 OR NGN.
OS   Mus musculus (Mouse).
OC   Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC   Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
RN   [1]
RP   SEQUENCE FROM N.A., AND ALTERNATIVE SPLICING.
RC   TISSUE=BRAIN;
RX   MEDLINE=97407661; PubMed=9264410;
RA   Keeling S.L., Gad J.M., Cooper H.M.;
RT   "Mouse neogenin, a DCC-like molecule, has four splice variants and is
RT   expressed widely in the adult mouse and during embryogenesis.";
RL   Oncogene 15:691-700(1997).
CC   -!- FUNCTION: MAY BE INVOLVED AS A REGULATORY PROTEIN IN THE
CC       TRANSITION OF UNDIFFERENTIATED PROLIFERATING CELLS TO THEIR
CC       DIFFERENTIATED STATE. MAY ALSO FUNCTION AS A CELL ADHESION
CC       MOLECULE IN A BROAD SPECTRUM OF EMBRYONIC AND ADULT TISSUES.
CC   -!- SUBCELLULAR LOCATION: TYPE I INTEGRAL MEMBRANE PROTEIN.
CC   -!- ALTERNATIVE PRODUCTS: AT LEAST 5 ISOFORMS; 1 (SHOWN HERE), 2, 3, 4
CC       AND 5; ARE PRODUCED BY ALTERNATIVE SPLICING. THE EXPRESSION OF
CC       ISOFORMS 3, 4 AND 5 ARE DEVELOPMENTALLY REGULATED.
CC   -!- TISSUE SPECIFICITY: WIDELY EXPRESSED.
CC   -!- DEVELOPMENTAL STAGE: EXPRESSED UBIQUITOUSLY THROUGHOUT THE MID TO
CC       LATE STAGES OF GESTATION AND IN ADULT TISSUES. STRONG EXPRESSION
CC       IS OBSERVED IN THE VENTRAL REGION OF THE VENTRICULAR ZONE OF THE
CC       E15.5 MOUSE NEURAL TUBE, AS WELL AS IN THE VENTRICULAR ZONES OF
CC       THE MESENCEPHALON AND RHOMBENCEPHALON. ISOFORMS 3 AND 4 ARE
CC       EXPRESSED AT HIGHER LEVEL COMPARED TO OTHER ISOFORMS BETWEEN E11.5
CC       AND E16.5.
CC   -!- SIMILARITY: CONTAINS 4 IMMUNOGLOBULIN-LIKE C2-TYPE DOMAINS.
CC   -!- SIMILARITY: CONTAINS 6 FIBRONECTIN TYPE III-LIKE DOMAINS.
CC   -!- SIMILARITY: BELONGS TO THE IMMUNOGLOBULIN SUPERFAMILY. STRONG, TO
CC       TUMOR SUPPRESSOR PROTEIN DCC.
CC   This SWISS-PROT entry is copyright. It is produced through a collaboration
CC   between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC   the European Bioinformatics Institute. There are no restrictions on its
CC   use by non-profit institutions as long as its content is in no way
CC   modified and this statement is not removed. Usage by and for commercial
CC   entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC   or send an email to license@isb-sib.ch).
DR   EMBL; Y09535; CAA70727.1; -.
DR   HSSP; P02751; 1TTG.
DR   MGD; MGI:1097159; NEO1.
DR   INTERPRO; IPR001777; -.
DR   INTERPRO; IPR003006; -.
DR   PFAM; PF00041; fn3; 6.
DR   PFAM; PF00047; ig; 4.
DR   PRINTS; PR00014; FNTYPEIII.
KW   Transmembrane; Immunoglobulin domain; Glycoprotein; Signal;
KW   Alternative splicing.
FT   SIGNAL        1     36       POTENTIAL.
FT   CHAIN        37   1493       NEOGENIN.
FT   DOMAIN       37   1136       EXTRACELLULAR (POTENTIAL).
FT   TRANSMEM   1137   1157       POTENTIAL.
FT   DOMAIN     1158   1493       CYTOPLASMIC (POTENTIAL).
FT   DOMAIN       78    147       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      177    239       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      274    338       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      366    428       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      467    564       FIBRONECTIN TYPE-III.
FT   DOMAIN      567    660       FIBRONECTIN TYPE-III.
FT   DOMAIN      661    760       FIBRONECTIN TYPE-III.
FT   DOMAIN      766    860       FIBRONECTIN TYPE-III.
FT   DOMAIN      881    981       FIBRONECTIN TYPE-III.
FT   DOMAIN      982   1083       FIBRONECTIN TYPE-III.
FT   DOMAIN     1149   1153       POLY-VAL.
FT   CARBOHYD     84     84       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    221    221       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    337    337       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    501    501       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    520    520       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    670    670       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    746    746       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    940    940       N-LINKED (GLCNAC...) (POTENTIAL).
FT   VARSPLIC    442    461       MISSING (IN ISOFORM 2).
FT   VARSPLIC    863    878       MISSING (IN ISOFORM 3).
FT   VARSPLIC   1086   1096       MISSING (IN ISOFORM 4).
FT   VARSPLIC   1279   1331       MISSING (IN ISOFORM 5).
SQ   SEQUENCE   1493 AA;  163159 MW;  441DE919D5E17C0E CRC64;

Query Match 8.2%; Score 597.5; DB 1; Length 1493;
Best Local Similarity 24.0%; Pred. No. 1.2e-24;
Matches 224; Conservative 102; Mismatches 331; Indels 275; Gaps 27;


oy 


10 


Db 


70 


oy 


70 


Db 


130 


Qy 


128 


Db 


188 


1 


188 


Db 


246 


Qy 


241 


Db 


303 


Qy 


301 


Db 


356 


Qy 


361 


Db 


395 


Qy 


» 

421 


Db 


415 


Qy 


481 


Db 


428 



1:11 



1.1 I hi 1:1 III I 



I :H I II l;!l: 



-RESDAGTYWCEAK-NEFGVARSRNATLQVAVLRDEFRLEPANTRVAQGEVALMECGAPR 127 

: I I I I I : I II I I II I I :| : I I I:: I 



I : I :| M I ill : III I : I I |:|:|:: : I 
ADLVPFVRWEQNRQPLLL- -DDRIVKLPSGTLVISNATEGDGGLYRCIVESGGPPKFSDE 245 



I III I 



'11:1 



I I I I I : I I I II 



:IN: :| |:: III :| I I I III : I III II |; 
■ -ESSGRLVLLAGGCLEISDVTEDDAGTYFCIADNGNKTVEAQAELTVQVPPGFL 355 



:IM: I I II: I |: 



I: 1:1 : I :| I I 
■ ■SDNFKIVKEHNLQVLGLVKS- ■ 



III I 1:1 I 
-DEGFYQCIAENDV 427 



- -RLDTPTNPNIKFFRAPELSTYP- ■ 

: II I I I 



-GP-PGKPQMVEKGENS— VT 528 

It I hi I : 



Qy 529 LSW-TRSNKVGGSSLVGYVIEMFGKNETDGWVAVGTRVQNTT FTQTGLLPGVNY 581 

1:1 I : I I l|:||: I |:| I 
Db 488 LTWRTPASDPHGDNLTYSV* -FYTKEGVD RERVENTSQPGEMQVTIQNLMPATVY 540 

Qy 582 FFLIRAENSHGLSLPSPMSEPITVGTRYFNSGLDLSEARASLLSGDWELSNASWDSTS 641 

I : 1:1 II" ' II: |:| : I 

Db 541 IFKVMAQNKHG- SGE — SSAPLRVETQ 564 

Qy 642 MKLTWQIINGKYVEGFYVYARQLPNPIVNNPAPVTSNTNPLLGSTSTSASASASASALIS 701 

:: ' III I I I II I: : 
Db 565 PEV - QLPGPAPNIRAYATSPTSITV 588 

Qy 702 TKPNIAAAGRRDGETNQSGGGAPTPLNTKYRMLTILNGG GASSCTITGLVQY 753 

•■■ II III I:: : I : I II II :l 

Db 589 - TWETPLSGNGE IQNYKLY YMEKGTDKEQDIDVSSHS YT INGLKKY 633 

Qy 754 TLYEFFIVPFYKSVEGKPSNSRIARTLEDVPSEAPYGMEALLLNSSAVFLKWKAPELKDR 813 

I U :l : I I : III INI II : : II :: : |: I : 
Db 634 TEYSFRWAY11KHGPGVSTQDVAVRTLSDVPSAAPQNLSLEVRNSKSIVIHWQPPSSTTQ 693 

Qy 814 HGVLLNYHVIVRGIDTAHNFSRILTNVTIDAASPTLVLANLTEGVMYTVGVAAGNNAGVG 873 

:| : I : ;| : :| : : :: I I I III II 
Db 694 NGQITGYKIRYRKASRKSD VTETLVTGTQLSQLIEGLDRGTEYNFRVAALTVNGTG 749 

Qy 874 PYC VPATLRLDPI 886 

I ll::| : h 

Db 750 PATDWLSAETFESDLDETRVPEVPSSLHVRPL 781 



RESULT 2 
NEO1_RAT
ID NEO1_RAT STANDARD; PRT; 1377 AA.
AC P97603;
DT 01-OCT-2000 (Rel. 40, Created)
DT 01-OCT-2000 (Rel. 40, Last sequence update)
DT 01-OCT-2000 (Rel. 40, Last annotation update)
DE NEOGENIN PRECURSOR (FRAGMENT).
GN NEO1 OR NGN.
OS Rattus norvegicus (Rat).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Rattus.
RN [1]
RP SEQUENCE FROM N.A.
RC TISSUE=BRAIN;
RX MEDLINE=97015074; PubMed=8861902;
RA Keino-Masu K., Masu M., Hinck L., Leonardo E.D., Chan S.S.-Y.,
RA Culotti J.G., Tessier-Lavigne M.;
RT "Deleted in Colorectal Cancer (DCC) encodes a netrin receptor.";
RL Cell 87:175-185(1996).
CC -!- FUNCTION: MAY BE INVOLVED AS A REGULATORY PROTEIN IN THE
CC TRANSITION OF UNDIFFERENTIATED PROLIFERATING CELLS TO THEIR
CC DIFFERENTIATED STATE. MAY ALSO FUNCTION AS A CELL ADHESION
CC MOLECULE IN A BROAD SPECTRUM OF EMBRYONIC AND ADULT TISSUES.
CC -!- SUBCELLULAR LOCATION: TYPE I INTEGRAL MEMBRANE PROTEIN.
CC -!- SIMILARITY: CONTAINS 4 IMMUNOGLOBULIN-LIKE C2-TYPE DOMAINS.
CC -!- SIMILARITY: CONTAINS 6 FIBRONECTIN TYPE III-LIKE DOMAINS.
CC -!- SIMILARITY: BELONGS TO THE IMMUNOGLOBULIN SUPERFAMILY. STRONG, TO
CC TUMOR SUPPRESSOR PROTEIN DCC.



CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).

DR EMBL; U68726; AAB41100.1; -.
DR HSSP; P56276; 1TLK.
DR INTERPRO; IPR001777; -.
DR INTERPRO; IPR003006; -.
DR PFAM; PF00041; fn3; 6.
DR PFAM; PF00047; ig; 4.

KW Transmembrane; Immunoglobulin domain; Glycoprotein; Signal.





FT   NON_TER        1      1
FT   SIGNAL        <1      2  POTENTIAL.
FT   CHAIN               1377  NEOGENIN.
FT   DOMAIN         3   1074  EXTRACELLULAR (POTENTIAL).
FT   TRANSMEM    1075   1095  POTENTIAL.
FT   DOMAIN      1096   1377  CYTOPLASMIC (POTENTIAL).
FT   DOMAIN        36    105  IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN                    IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN       232    296  IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN       324    386  IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN       405    502  FIBRONECTIN TYPE-III.
FT   DOMAIN       505    598  FIBRONECTIN TYPE-III.
FT   DOMAIN       599    698  FIBRONECTIN TYPE-III.
FT   DOMAIN       704    798  FIBRONECTIN TYPE-III.
FT   DOMAIN       819    919  FIBRONECTIN TYPE-III.
FT   DOMAIN       920   1021  FIBRONECTIN TYPE-III.
FT   DOMAIN      1087   1090  POLY-VAL.
FT   CARBOHYD      42     42  N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD     179    179  N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD     295    295  N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD     439    439  N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD     458    458  N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD     608    608  N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD     684    684  N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD     878    878  N-LINKED (GLCNAC...) (POTENTIAL).

SQ SEQUENCE 1377 AA; 150637 MW; E514ED8ABD1A63A9 CRC64; 



Query Match 8.1%; Score 589.5; DB 1; Length 1377; 

Best Local Similarity 23.7%; Pred. No. 2.9e-24; 

Matches 220; Conservative 103; Mismatches 319; Indels 285; Gaps 27; 



Qy 



t 



10 PMDTTVPKNDPFTFNCQAEGNPTPTIQWFKDGRELKTDTGSHRIMLPAGGLFFLKVIHSR 69 

hll : II I I: hi III I : I :|| I I |:||: 
28 PVDTLSVRGSSVILNCSAYSEPSPNIEWKKDGTFLNLVSDDRRQLLPDGSLFISNWHSK 87 

70 -RESDAGTYWCEAK-NEFGVARSRNATLQVAVLRDEFRLEPANTRVAQGEVALMECGAPR 127 

MIIIIM.IIIIIII I :l : : I I 
88 HNKPDEGFYQCVATVDNLGT: /SRTAKLAVAGL-PRFTSQPEPSSIYVGNSGILNCEV-N 145 



128 

I :| I 

146 ADLVPFVRWEQNRQPLLL' 



GSPEPQISWRKNGQTLNLVGI KRIRIVDGGNLAIQEARQSDDGRYQCWKNVVGTRESAT 187 
II : I I I I : 1:1 l:|:|:: : I 
flDRIVKLPSGTLVISNATEGDEGLYRCIVESGGPPKFSDE 203 



188 AFLKVHVRP FLIRGPQNQTAWGSSWFQCRIGGDPLPDVLWRRTASGGNMPL 240 

I III Ihl 1 : 1:1 I I I I I I : I : : 

204 AELKVLQESEEMLDLVFLMR-PSSMIKV1GQSAVLPCVASGLPAPVIRWMKNEDVLDT-- 260 



Qy 241 RKFSWLHSASGRVHVLEDRSLKLDDVTLEDMGEYTCEADNAVGGITATGILTVHAPPKFV 300 

:IH: :| II:: III :| I I I III II III l|:|: 
Db 261 ESSGRLALLAGGSLEISDVTEDDAGTYFCVADNGNKTIEAQAELTVQVPPEFL 313 

Qy 301 IRPKNQLVEIGDEVLFECQMGHPRPTLYWSVEGNSSLLLPGYRDGRMEVTLTPEGRSVL 360 

:| I :::|||: I I II: I |: :::| 
Db 314 KQPANIYARESMDIVFECEVTGKPAPTVKWVKNGD-WIP 352 

Qy 361 SIARFAREDSGKWTCNALNAVGSVSSRTWSVDTQFELPPPIIEQGPVNQTLPVKSIW 420 

I 1:1 : I :| I I 

Db 353 SDYFKIVKEHNLQVLGLVKS 372 

Qy 421 LPCRTLGTPVPQVSWYLDGIPIDVQEHERRNLSDAGALTISDLQRHEDEGLYTCVASNRN 480 

III I 1:1 I 

Db 373 DEGFYQCIAENDV 385 

Qy 481 GKSSWSGYLRLDTPTNPNIKFFRAPELSTYPGP-PGKPQMVEKGENS— VTLSW-TRSN 535 

I : I I II : II I I: I I : hi I :: 

Db 386 GNAQAGAQL IILEHAPATT- - -GPLPSAPRDWASLVSTRFIKLTWRTPAS 433 

Qy 536 KVGGSSL—VGYVIEMFGKNETDGWAVGTRVQNTT FTQTGLLPGVNYFFLIR 586 

I :l III : Ihll: I |:| I I : 

Db 434 DPHGDNLTYSVFYTKEGVARE RVENTSQPGEMQVTIQNLMPATVYIFKVM 483 



Qy 587 AENSHGLSLPSPMSEPITVGTRYFNSGLDLSEARASLLSGDWELSNASWDSTSMKLTW 646 

1:1 II • II: 1:1 : I :: 

Db 484 AQNKHG— ■' SGE — SSAPLRVETQPEV-- 505 

Qy 647 QIINGKYVEGFYVYARQLPNPIVNNPAPVTSNTNPLLGSTSTSASASASASALISTKPNI 706 

III I I I II I: : 
Db 506 • • • QLPGPAPNIRAYATSPTSITV '- 526 

Qy 707 AAAGKRDG ET NQSGGG APT PLNTKYRMLT ILNGG GASSCTITGLVQYTLYEF 758 

II' III I:: : I : I II II :|| I I 

Db 527 TWETPLSGNGE — IQNYKLYYMEKGTDKEQDVDVSSHSYTINGLKKYTEYSF 576 

Qy 759 FIVPFYKSVEGKPSNSRIARTLEDVPSEAPYGMEALLLNSSAVFLKWKAPELKDRHGVLL 818 

:| : I I : III Nil II : : II :: : |: I ::| : 
Db 577 RWAYNKHGPGVSTQDVAVRTLSDVPSAAPQNLSLEVRNSKSIVIHWQPPSSATQNGQIT 636 

Qy 819 NYHVIVRGIDTAHNFSRILTNVTIDAASPTLVLANLTEGVMYTVGVAAGNNAGVGPYC-- 876 

!:!■■: :l : : :: I I I III III 
Db 637 GYKIRYRKASRKSD — VTETWTGTQLSQLIEGLDRGTEYNFRVAALTVNGTGPATDW 692 

Qy 877 ■ VPATLRLDPI 886 

M::| : h 

Db 693 LSAETFESDLDESRVPEVPSSLHVRPL 719 
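
The counts printed in each result summary appear to determine the "Best Local Similarity" figure: for the summaries in this report, Matches divided by the sum of Matches, Conservative, Mismatches and Indels reproduces the printed percentage. This relationship is inferred from the numbers shown here and is not taken from the search program's documentation. The function name below is hypothetical. A minimal Python sketch of the check:

```python
def best_local_similarity(matches, conservative, mismatches, indels):
    """Percentage of aligned columns that are exact matches.

    Assumption (not stated in the report): every alignment column is counted
    in exactly one of the four categories, so their sum is the alignment
    length.
    """
    columns = matches + conservative + mismatches + indels
    return 100.0 * matches / columns


# Counts copied from the RESULT summaries in this report, paired with the
# "Best Local Similarity" percentage printed next to them.
reported = {
    "RESULT 2": ((220, 103, 319, 285), 23.7),
    "RESULT 3": ((219, 100, 324, 283), 23.7),
    "RESULT 4": ((250, 116, 352, 364), 23.1),
    "RESULT 5": ((225, 91, 321, 314), 23.7),
}

for result, (counts, printed) in reported.items():
    computed = round(best_local_similarity(*counts), 1)
    print(result, computed, printed)
```

Under the stated assumption, the computed and printed values agree to one decimal place for all four summaries, which is a useful sanity check when a digit in a summary line has been damaged in scanning.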



RESULT 3 
NEO1_HUMAN
ID NEO1_HUMAN STANDARD; PRT; 1461 AA.
AC Q92859; O00340;
DT 01-OCT-2000 (Rel. 40, Created)
DT 01-OCT-2000 (Rel. 40, Last sequence update)
DT 01-OCT-2000 (Rel. 40, Last annotation update)
DE NEOGENIN PRECURSOR.
GN NEO1 OR NGN.
OS Homo sapiens (Human).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
RN [1]
RP SEQUENCE FROM N.A. (ISOFORMS 1 AND 2).
RC TISSUE=FETAL BRAIN;
RX MEDLINE=97236653; PubMed=9121761;
RA Meyerhardt J.A., Look A.T., Bigner S.H., Fearon E.R.;
RT "Identification and characterization of neogenin, a DCC-related
RT gene.";
RL Oncogene 14:1129-1136(1997).
RN [2]
RP SEQUENCE FROM N.A. (ISOFORMS 1 AND 2).
RC TISSUE=FETAL BRAIN;
RX MEDLINE=97312699; PubMed=9169140;
RA Vielmetter J., Chen X.-N., Miskevich P., Lane R.P., Yamakawa K.,
RA Korenberg J.R., Dreyer W.J.;
RT "Molecular characterization of human neogenin, a DCC-related protein,
RT and the mapping of its gene (NEO1) to chromosomal position 15q22.3-
RT q23.";
RL Genomics 41:414-421(1997).
CC -!- FUNCTION: MAY BE INVOLVED AS A REGULATORY PROTEIN IN THE
CC TRANSITION OF UNDIFFERENTIATED PROLIFERATING CELLS TO THEIR
CC DIFFERENTIATED STATE. MAY ALSO FUNCTION AS A CELL ADHESION
CC MOLECULE IN A BROAD SPECTRUM OF EMBRYONIC AND ADULT TISSUES.
CC -!- SUBCELLULAR LOCATION: TYPE I INTEGRAL MEMBRANE PROTEIN.
CC -!- ALTERNATIVE PRODUCTS: AT LEAST 2 ISOFORMS; 1 (SHOWN HERE) AND 2;
CC ARE PRODUCED BY ALTERNATIVE SPLICING.
CC -!- TISSUE SPECIFICITY: WIDELY EXPRESSED AND ALSO IN CANCER CELL
CC LINES.
CC -!- SIMILARITY: CONTAINS 4 IMMUNOGLOBULIN-LIKE C2-TYPE DOMAINS.
CC -!- SIMILARITY: CONTAINS 6 FIBRONECTIN TYPE III-LIKE DOMAINS.
CC -!- SIMILARITY: BELONGS TO THE IMMUNOGLOBULIN SUPERFAMILY. STRONG, TO
CC TUMOR SUPPRESSOR PROTEIN DCC.
CC
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).



DR EMBL; U61262; AABl 263.1; -.
DR EMBL; U72391; AAC! 287.1; -.
DR MIM; 601907; -.
DR HSSP; P02751; 1TTG.
DR INTERPRO; IPRO0177 ; -.
DR INTERPRO; IPR0030C ; -.
DR PFAM; PF00041; fn3; 6.
DR PFAM; PF00047; ig; 4.
DR PRINTS; PR00014; FNTYPEIII.

KW Transmembrane; Immunoglobulin domain; Glycoprotein; Signal;
KW Alternative splicing.

FT   SIGNAL         1     33  POTENTIAL.
FT   CHAIN         34   1461  NEOGENIN.
FT   DOMAIN        34   1105  EXTRACELLULAR (POTENTIAL).
FT   TRANSMEM    1106   1126  POTENTIAL.
FT   DOMAIN      1127   1461  CYTOPLASMIC (POTENTIAL).
FT   DOMAIN        67    136  IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN       166    228  IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN       263    327  IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN       355    417  IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN       436    533  FIBRONECTIN TYPE-III.
FT   DOMAIN       536    629  FIBRONECTIN TYPE-III.
FT   DOMAIN       630    729  FIBRONECTIN TYPE-III.
FT   DOMAIN       735    829  FIBRONECTIN TYPE-III.
FT   DOMAIN       850    950  FIBRONECTIN TYPE-III.
FT   DOMAIN       951   1052  FIBRONECTIN TYPE-III.
FT   DOMAIN      1118   1121  POLY-VAL.
FT   DISULFID      74    129  BY SIMILARITY.
FT   DISULFID     173    221  BY SIMILARITY.
FT   DISULFID     270    320  BY SIMILARITY.
FT   DISULFID     362    410  BY SIMILARITY.
FT   CARBOHYD      73     73  N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD     210    210  N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD     326    326  N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD     470    470  N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD     489    489  N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD     639    639  N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD     715    715  N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD     909    909  N-LINKED (GLCNAC...) (POTENTIAL).
FT   VARSPLIC    1248         MISSING (IN ISOFORM 2).
FT   CONFLICT     168    168  G -> N (IN REF. 2).
SQ SEQUENCE 1461 AA; 159958 MW; 7AAE897E69635A21 CRC64;



Query Match 7.9%; Score 575.5; DB 1; Length 1461;

Best Local Similarity 23.7%; Pred. No. 1.8e-23;
Matches 219; Conservative 100; Mismatches 324; Indels 283; Gaps 28;

Qy 10 PMDTTVPKNDPFTFNCQAEGNPTPTIQWFKDGRELKTDTGSHRIMLPAGGLFFLKVIHSR 69

hll : || | |:| |:| III I : I :|| I I 1 : 1 1 : 
Db 59 PVDTLSVRGSSVILNCSAYSEPSPKIEWKKDGTFLNLVSDDRRQLLPDGSLFISNWHSK 118 

Qy 70 -RESDAGTYWCEAKNE-FGVARSRNATLQVAVLRDEFRLEPANTRVAQGEVALMECGAPR 127 

: I I I I I I I II! I II I I :| : I I |:: I 
Db 119 HNKPDEGYYQCVATVESLGTIISRTAKLIVAGL-PRFTSQPEPSSVYAGNGAILNCEV-N 176 

Qy 128 GSPEPQISWRKNGQTLNLVGNKRIRIVDGGNLAIQEARQSDDGRYQCWKNWGTRESAT 187 

I : I :| I I I : I: : III I : I I |:|||:: : I 
Db 177 ADLVPFVRWEQNRQPLLL--DDRVIKLPSGMLVISNATEGDGGLYRCWESGGPPKYSDE 234 

Qy 188 AFLKVHVRPFLI RGPQNQTAWGSSWFQCRIGGDPLPDVLWRRTASGGNMPLR 241 

III: : I |:| II I I I I : I : : 
Db 235 VELKVLPDPEVISDLVFLKQPSPLVRVIGQDWLPCVASGLPTPT IKWMKNEEALDT • - - 291 

Qy 242 KFSWLHSASGRVHVLEDRSLKLDDVTLEDMGEYTCEADNAVGGITATGILTVHAPPKFVI 301

:| I: -I II:: III :| I I I III II III I |:|: 
Db 292 ESSERLVLLAGGSLEISDVTEDDAGTYFCIADNGNETIEAQAELTVQAQPEFLK 345 



Qy 302 RPKNQLVEIGDEVLFECQANGHPRPTLYWSVEGNSSLLLPGYRDGRMEVTLTPEGRSVLS 361 

:| I ;-:::|||: I I II: I |: :::| 
Db 346 QPTNIYAHESMDIVFECEVTGKPTPTVKWVKNGD--MVIP 383 

Qy 362 IARFAREDSGKWTCNALNAVGSVSSRTWSVDTQFELPPPIIEQGPVNQTLPVKSIWL 421 
I .1:1 : I :| I I 

Db 384 SDYFKIVKEHNLQVLGLVKS 403 

Qy 422 PCRTLGTPVPQVSWYLDGIPIDVQEHERRNLSDAGALTISDLQRHEDEGLYTCVASNRNG 481 

III 11:1 | 

Db 404 ■ - DEGFYQCIAENDVG 417 

Qy 482 KSSWSGYLRLDTPTNPNIKFFRAPELSTYPGP-PGKPQMVEKGENS--VTLSW-TRSNK 536 

: I I II : II I I: I I : hi I :: 

Db 418 NAQAGAQL— IILEHAPATT- - -GPLPSAPRDWASLVSTRFIKLTWRTPASD 465 

Qy 537 VGGSSL- - - VG YVI EMFGKNETDGWVAVG T RVQNT T FTQTGLLPGVNYFFLIRA 587 

I :l 1:1 I : 11:11: I hi I I : I 

Db 466 PHGDNLTYSVFYTKEGIARE RVENTSHPGEMQVTIQNLMPATVYIFRVMA 515 

Qy 588 ENSHGLSLPSPMSEPITVGTRYFNSGLDLSEARASLLSGDWELSNASWDSTSMKLTWQ 647 

:MI .Ihlh I : II II lh :lh 

Db 516 QNKHG— SGESSAPLRVETQ PEVQ - " ■ LPGPAPNL - RAYAASPTS ITVTWE 560 

Qy 648 IINGKYVEGFYVYARQLPNPIVNNPAPVTSNTNPLLGSTSTSASASASASALISTKPNIA 707 
II: I 

Db 561 -• TPVSGN 566 

Qy 708 AAGKRDGETNQSGGGAPTPLNTKYRMLTILNGG GASSCTITGLVQYTLYEFF 759 

II • h: : I : I II II :|| I I 

Db 567 GEIQ- NYKLYYMEKGTDKEQDVDVSSHSYTINGLKKYTEYSFR 608 

Qy 760 IVPFYKSVEGKPSNSRIARTLEDVPSEAPYGMEALLLNSSAVFLKWKAPELKDRHGVLLN 819 

: : I I : III III II : : II :: : h I ::l : 
Db 609 WAYNKHGPGVSTPDVAVRTLSDVPSAAPQNLSLEVRNSKSIMIHWQPPAPATQNGQITG 668 

Qy 820 YHVIVRGIDTAHNFSRILTNVTIDAASPTLVLANLTEGVMYTVGVAAGNNAGVGPYC- • ■ 876 

I : I ' : :l : : :: I I I III III 
Db 669 YKIRYRKASRKSD-— VTETLVSGTQLSQLIEGLDRGTEYNFRVAALTINGTGPATDWL 724 

Qy 877 - VPATLRLDPI 886 

• Ih:| : h 
Db 725 SAETFESDLDETRVPEVPSSLHVRPL 750 



RESULT 4 
AXO1_CHICK
ID AXO1_CHICK STANDARD; PRT; 1036 AA.
AC P28685;
DT 01-DEC-1992 (Rel. 24, Created)
DT 01-DEC-1992 (Rel. 24, Last sequence update)
DT 15-JUL-1999 (Rel. 38, Last annotation update)
DE AXONIN-1 PRECURSOR.
OS Gallus gallus (Chicken).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Archosauria; Aves; Neognathae; Galliformes; Phasianidae; Phasianinae;
OC Gallus.
RN [1]
RP SEQUENCE FROM N.A., AND PARTIAL SEQUENCE.
RC TISSUE=BRAIN;
RX MEDLINE=92174898; PubMed=1311675;
RA Zuellig R.A., Rader C., Schroeder A., Kalousek M.B.,
RA von Bohlen und Halbach F., Osterwalder T., Inan C., Stoeckli E.T.,
RA Affolter H.-U., Fritz A., Hafen E., Sonderegger P.;
RT "The axonally secreted cell adhesion molecule, axonin-1. Primary
RT structure, immunoglobulin-like and fibronectin-type-III-like domains
RT and glycosyl-phosphatidylinositol anchorage.";
RL Eur. J. Biochem. 204:453-463(1992).
CC -!- FUNCTION: AXON-ASSOCIATED CELL ADHESION MOLECULE (AXCAM) WHICH
CC PROMOTES NEURITE OUTGROWTH BY INTERACTION WITH THE AXCAM L1 (G4)
CC OF NEURITIC MEMBRANE.
CC -!- SUBCELLULAR LOCATION: ATTACHED TO THE NEURONAL MEMBRANE BY A
CC GPI-ANCHOR.



Mon Jan 22 13:04:28 2001 



us-09-540-245a-16.rsp
CC -!- SIMILARITY: CONTAINS 6 IMMUNOGLOBULIN-LIKE C2-TYPE DOMAINS.
CC -!- SIMILARITY: CONTAINS 4 FIBRONECTIN TYPE III-LIKE DOMAINS.

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 
CC between the Swiss Institute of Bioinformatics and the EMBL outstation - 
CC the European Bioinformatics Institute. There are no restrictions on its 
CC use by non-profit institutions as long as its content is in no way 
CC modified and this statement is not removed. Usage by and for commercial 
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; X63101; CAA44815.1; -.
DR PIR; S22128; S22128.
DR PIR; S22383; S22383.
DR HSSP; P56276; 1TLK.
DR INTERPRO; IPR001777; -.
DR INTERPRO; IPR003006; -.
DR PFAM; PF00041; fn3; 3.
DR PFAM; PF00047; ig; 6.

KW Immunoglobulin domain; Glycoprotein; Signal; GPI-anchor;
KW Cell adhesion; Repeat.
FT   SIGNAL         1     23  OR 25 (POTENTIAL).
FT   CHAIN         24   1036  AXONIN-1.
FT   PROPEP         ?   1036  REMOVED IN MATURE FORM.
FT   DOMAIN        49    113  IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN       143    211  IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN       249    308  IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN       336    397  IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN       428    490  IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN       518    589  IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN       599    608  HINGE (POTENTIAL).
FT   DOMAIN       601    607  GLY/PRO-RICH.
FT   DOMAIN       608    709  FIBRONECTIN TYPE-III.
FT   DOMAIN       710    811  FIBRONECTIN TYPE-III.
FT   DOMAIN       812    912  FIBRONECTIN TYPE-III.
FT   DOMAIN       913   1009  FIBRONECTIN TYPE-III.
FT   MOD_RES      ?24    ?24  BLOCKED.
FT   CARBOHYD      71     71  N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD     199    199  N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD     456    456  N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD     472    472  N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD     493    493  N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD     520    520  N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD     770    770  N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD     900    900  N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD     914    914  N-LINKED (GLCNAC...) (POTENTIAL).
SQ SEQUENCE 1036 AA; 113301 MW; 08B80143BE779794 CRC64;

Query Match 7.8%; Score 568; DB 1; Length 1036;
Best Local Similarity 23.1%; Pred. No. 2.9e-23;
Matches 250; Conservative 116; Mismatches 352; Indels 364; Gaps 40;

Qy 4 PRIIEHPMDTTVPK— NDPFTFNCQAEGNPTPTIQWFKDGRELKTDTGSHRIMLPAGGL 60 

I I I I I: : I hi II I :l :| III I I I II I 
Db 32 PVFEEQPAHTLFPEGSAEEKVTLTCRARANPPATYRWKMNGTELKMGPDS-RYRLVAGDL 90 

Qy 61 FFLKVIHSRRESDAGTYWCEAKNEFGVARSRNATLQVAVLRDEFRLE 107 

: llhl I I I I |j |:|: : l| | 
Db 91 VISNPVKAK— DAGSYQCVATNARGTWSREASLRFGFLQ-EFSAEERDPVKITEGWGV 146 



Qy 



■ 107 



Db 147 MFTCSPPPHYPALSYRWLLNEFPNFIPADGRRFVSQTTGNLYIAKTEASDLGNYSCFAIS 206 

Qy 108 PANTRVAQGEVALMECGAPRGSPEPQ 133 

11:1 I:: Ml I 1:1 II 
Db # 207 H IDF ITKSVFSKFSQLSLAAEDARQYAPS IKAKFPADT YALTGQMVTLECF A- FGNPVPQ 265 

Qy 134 ISWRK--NGQTWNKRIRIVDGGNLAIQEARQSDDGRYQCVVKNVVGTRESATAFLK 191 

I III II : :: : I II hi hi :|: I I:: : 
Db 266 IKWRKLDGSQTSKWLSSEPL LHIQNVDFEDEGTYECEAENIKG - RDT YQGRII 317 



Qy 192 VHVRPFLIRGPQNQTAWGSSWFQCRIGGDPLPDVLWRRTASGGNMPLRKFSWLHSASG 251 

:| :l : : , :M : : I I I I I I II :: 
Db 318 IHAQPDWLDVITDTEADIGSDLRWSCVASGKPRPAVRWLR— -DGQPL ASQN 366 

Qy 252 RVHVLEDRSLKLDDVTLEDMGEYTCEADNAVGGITATGILTVHA-PPKFVIRPKNQLVEI 310 

h I h.. : III I I I hi I : h III M I : I :h 
Db 367 RIEV- SGGELRFSKLVLEDSGMYQCVAENKHGT VYASAELTVQALAPDFRLNPVKRLI PA 425 

Qy 311 "GDEVLFECQANGHPRPTLYWSVEGNSSLLLPGYRDGRMEVTLTPEGRSVLSIARFARE 368 

:h- II. I: h h : II I Ihl :| :| :: 
Db 426 ARSGKVIIPCQPRAAPKATVLWT- -KGTELLTNSSR VTITADGTLILQ- -NISKS 476 

Qy 369 DSGKWTCNALNAVGSVSSRTWSVDTQFELPPPIIEQGPVNQTLPVKSIWLPCRTLGT 428 

I II II I :l :l ::l : I I : : I : || 
Db 477 DEGK - YTCFAENFMGKANSTG ILSVRDATK ITLAPSSADINVGENLTLQCHASHD 530 

Qy 429 PVPQV - - SWYLDG I P I DVQEHE — RRNLSDA - G ALT I SDLQ - RHEDEGLYT C VASNRN 480 

I :< :l II llh : I I :: :| I I I : I :| I III I 
Db 531 PTMDLTFTWSLDDFPIDLDKSEGHYRRASVKEAVGDLAIVNAQLKH--SGRYTCTAQTW 588 

Qy 481 GKSSWSGYLRLDTPTNPNIKFFRAPELSTYPGPPGKPQMVEKGENSVTLSWTRSNKVGGS 540 

:l I I • ' I Mill : : h :| llhl I 

Db 589 DSTSESATLTVRGP PGPPGGVWRDIGDTTVQLSWSRGFD - NHS 631 

Qy 541 SLVGYVIE— -MFGKNETDGWVAVGTRVQN TTFTQTGLLPGVNYFFLIRAENSH 591 

: I II : I 1:11 I hi ::| I : I I 
Db 632 PIARYSIEARTLLSNK WKQMRTNPVNIEGNAETAQVVNLIPWMDYEFRVLASNIL 686 

Qy 592 GLSLPSPMSEPITVGTRYFNSGLDLSEARASLLSGDWELSNASWDSTSMKLTWQIING 651 
I: II 

Db 687 GVGEPS - 692 

Qy 652 KYVEGFYVYARQLPNPIVNNPAPVTSNTNPLLGSTSTSASASASASALISTK* - -PNIAA 708 

II h I II I :l 

Db 693 LP SSKIRTKEAAPTVAP 709 

Qy 709 AGKRDGETNQSGGGAP TPLNTKYRMLTILNGGG ASSCTITGLV 751 

:| < Nil! II hill : :| I 

Db 710 SGL GGGGGAPNELIINWTPTLRDYQ NGDGFGYILSFRKKGTQGWLTARV 758 

Qy 752 QYTLYEFFIVPFYKSVEGKPSNSRIARTLEDVPSEAPYGMEALL 795 

II :| I : : II I : I : h I lh : ! 
Db 759 PHAESLHYVYRNESIGPYTPFEVKIKAYNRKGEGPESLTAIVYSAEEEPKVAPFRVTAKA 818 

Qy 796 LNSSAVFLKWKAPELKDRHGVLLNYHV-IVRGIDTAHNFSRILTNVTIDAASPTLVLANL 854

: II : : h, I I Mil I : : I h I : :| I I 
Db 819 VLSSEMDVSWEPVEQGDMTGVLLGYEIRYWKDGDKEEAADRVRTAGLVTSAHVT--GL 874 

Qy 855 TEGVMYTVGVAAGNNAGVGPYC VPATLRLDPITKRLDPFINQ 896 

I I I I I IMI : II :| : II : 

Db 875 NPNTKYHVSVRAYNRAGAGPPSPSTNITTTKPPPRRPPGNISWTLTGSTVTIKWDPWAQ 934 

Qy 897 RD 898

I 

Db 935 AD 936 



RESULT 5
NEO1_CHICK
ID NEO1_CHICK STANDARD; PRT; 1443 AA.
AC Q90610;
DT 01-OCT-2000 (Rel. 40, Created)
DT 01-OCT-2000 (Rel. 40, Last sequence update)
DT 01-OCT-2000 (Rel. 40, Last annotation update)
DE NEOGENIN (FRAGMENT).
OS Gallus gallus (Chicken).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Archosauria; Aves; Neognathae; Galliformes; Phasianidae; Phasianinae;
OC Gallus.
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=WHITE LEGHORN; TISSUE=EMBRYONIC BRAIN;
RX MEDLINE=95105243; PubMed=7806578;






RA Vielmetter J., Roman J.M., Dreyer W.J.;
RT "Neogenin, an avian cell surface protein expressed during terminal
RT neuronal differentiation, is closely related to the human tumor
RT suppressor molecule deleted in colorectal cancer.";
RL J. Cell Biol. 127:2009-2020(1994).
CC -!- FUNCTION: MAY BE INVOLVED AS A REGULATORY PROTEIN IN THE
CC TRANSITION OF UNDIFFERENTIATED PROLIFERATING CELLS TO THEIR
CC DIFFERENTIATED STATE. MAY ALSO FUNCTION AS A CELL ADHESION
CC MOLECULE IN A BROAD SPECTRUM OF EMBRYONIC AND ADULT TISSUES.
CC -!- SUBCELLULAR LOCATION: TYPE I INTEGRAL MEMBRANE PROTEIN.
CC -!- DEVELOPMENTAL STAGE: IN RETINA, EXPRESSED ON GANGLION CELL FIBERS
CC AS SOON AS THEY BEGIN TO EXTEND THEIR AXONS.
CC -!- SIMILARITY: CONTAINS 4 IMMUNOGLOBULIN-LIKE C2-TYPE DOMAINS.
CC -!- SIMILARITY: CONTAINS 6 FIBRONECTIN TYPE III-LIKE DOMAINS.
CC -!- SIMILARITY: BELONGS TO THE IMMUNOGLOBULIN SUPERFAMILY. STRONG, TO
CC TUMOR SUPPRESSOR PROTEIN DCC.
CC
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).



DR EMBL; U07644; AAC59662.1; -.
DR HSSP; P80362; 1WTL.
DR INTERPRO; IPR001777; -.
DR INTERPRO; IPR003006; -.
DR PFAM; PF00041; fn3; 6.
DR PFAM; PF00047; ig; 4.

KW Transmembrane; Immunoglobulin domain; Glycoprotein.

FT   NON_TER        1      1
FT   DOMAIN        <1   1090  EXTRACELLULAR (POTENTIAL).
FT   TRANSMEM    1091   1111  POTENTIAL.
FT   DOMAIN      1112   1443  CYTOPLASMIC (POTENTIAL).
FT   DOMAIN        33    102  IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN       132    194  IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN       229    293  IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN       321    383  IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN       422    519  FIBRONECTIN TYPE-III.
FT   DOMAIN       522    615  FIBRONECTIN TYPE-III.
FT   DOMAIN       616    714  FIBRONECTIN TYPE-III.
FT   DOMAIN       720    814  FIBRONECTIN TYPE-III.
FT   DOMAIN       835    935  FIBRONECTIN TYPE-III.
FT   DOMAIN       936   1037  FIBRONECTIN TYPE-III.
FT   DISULFID      40     95  BY SIMILARITY.
FT   DISULFID     139    187  BY SIMILARITY.
FT   DISULFID     236    286  BY SIMILARITY.
FT   DISULFID     328    376  BY SIMILARITY.
FT   CARBOHYD      39     39  N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD     176    176  N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD     292    292  N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD     456    456  N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD     475    475  N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD     625    625  N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD     700    700  N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD     894    894  N-LINKED (GLCNAC...) (POTENTIAL).
SQ SEQUENCE 1443 AA; 158050 MW; 558C6795579C0E26 CRC64;



Query Match 7.7%; Score 560; DB 1; Length 1443;

Best Local Similarity 23.7%; Pred. No. 1.2e-22; 

Matches 225; Conservative 91; Mismatches 321; Indels 314; Gaps 30; 

Qy 10 PMDTTVPKNDPFTFNCQAEGNPTPTIQWFKDGRELKTDTGSHRIMLPAGGLFFLKVIHSR 69 

II! : II : I 1:1 III I : I :|| I I Ml: 
Db 25 PMDILSVRGASVIMNCSSYCETPPKIEWKKDGTLLNLVSDDRRQLLPDGSLLINSVVHSK 84 

Qy 70 • RESDAGTYWCEAKNE - FGVARS RNATLQVAVLRDEFRLEPANTRVAQGEVALMECGAPR 127 

: I I I I I II I! I I II I I :| : I :l h: I 
Db 85 HNKPDEGYYQCVATVESLGSIVSRTAKLTVAGL-PRFTSQPELSSVYKGNSAILNCEV-N 142 



Qy 128 GSPEPQISWRKNGQTLNLVGNKRIRIVDGGNLAIQEARQSDDGRYQCWKNWGTRESAT 187 

I : I ;: I 1:1 : I: : III I :| I |:||::: : I 
Db 143 VDLAPFVRWEQDRQPLSL--DDRVFKLPSGALLIGNATDTDGGFYRCVIESGGTPKYSEE 200 

Qy 188 AFLKVHVRP" — FLIRGPQNQTAWGSSWFQCRIGGDPLPDVLWRRTASGGNMPLR 241 
III: T :| I : I I I : II III I I I I I :| : 
. Db 201 AELKILPDPEEPQSLVFVRQPSSLTKVTGQNAVFPCVAGGFPTPYVRW--TKNGEEL--- 255 

Qy 242 KFSWLHSASGRVHVLEDRSLKLDDVTLEDMGEYTCEADNAVGGITATGILTVHAPPKFVI 301 

: I I. : II : III l|:l III III I I I I !|:|: 
Db 256 — -ITEDSERFALRAGGSLLISDVTEEDVGTYTCIADNENETIEAQAELAVQVPPEFLK 311 

Qy 302 RPKNQLVEIGDEVLFECQANGHPRPTLYWSVEGNSSLLLPGYRDGRMEVTLTPEGRSVLS 361

II I ::::lll: I I II: I I: :::| 
Db 312 RPANIYAHESMDIVFECEVTGKPTPTVKWVKNGD-WIP 349

Qy 362 IARFAREDSGKWTCNALNAVGSVSSRTWSVDTQFELPPPIIEQGPVNQTLPVKSIWL 421

f hi : I HI 

Db 350 SDYFKIVKEHNLQVLGLVKS 369 

Qy 422 PCRTLGTPVPQVSWYLDGIPIDVQEHERRNLSDAGALTISDLQRHEDEGLYTCVASNRNG 481 

III I hi I I 

Db 370 DEGFYQCIAENDVG 383 

Qy 482 KSSWSG— YL'rLDT-PTNPNIKFFRAPELSTYP— GP-PGKPQMVEK--GENSVTL 529 

: I' II III I I 1 1 I I : I : I 

Db 384 NAQAGAQLIILDLDVAIPTLPPTSLTSATNDHLAPATTGPLPTAPRDWATLVSTRFIRL 443 

Qy 530 SW-TRSNKVGGSSL— VGYVIEMFGKNETDGWVAVGTRVQNTT - - - FTQT - - - GLLPGV 579 

:| I : I :| I : Ihlh II |:| 

Db 444 TWRTPVSDPQGDNLTYSIFYTKEGINRE RVENTSRPGETQVMIQNLMPET 493 

Qy 580 NYFFLIRAENSHGLSLPSPMSEPITVGTRYFNSGLDLSEARASLLSGDWELSNASWDS 639 

I I : hi II I h I h 

Db 494 VYVFRWAQNKHG- "HGESSAPLKVATQ 519 

Qy 640 TSMKLTWQIINGKYVEGFYVYARQLPNPIVNNPAPVTSNTNPLLGSTSTSASASASASAL 699 

I I 1 1 

Db 520 r PEVQLPGPA 528 

Qy 700 1ST KPNIAAAGKRDGETNQSGGG APT PLNTKYRMLT ILNGGG 741 

III I 1:11 : : I hi I 

Db 529 — -PNIRAY AGSPTSVTVTWE--TPLSGNGEIQNYKLYYMEKGQDSEQ 571 

Qy 742 ASSCTITGLVQYTLYEFFIVPFYKSVEGKPSNSRIARTLEDVPSEAPYGMEALLL 796 

! Mill :ll I I :| : I I : : III Mil II : 
Db 572 DVDVAGLSYTITGLKKYTEYSFRWAYNKHGPGVSTQDVWRTLSDVPSAAPQNLTLEAR 631 

Qy 797 NSSAVFLKWKAPELKDRHGVLLNYHVIVRGIDTAHNFSRILTNVTIDAASPTL-VLANL 854 

II :: I h I I : I : I : ::l! I :: I 

Db 632 NSKSIMLHWQPPPAGTHSGQITGYKIRYRKVSRK SDVTESVGGTQLFQLIEGL 684 

Qy 855 TEGVMYTVGVAAGNNAGVGPYC VPATLRLDPI 886 

I I :|| I II lh:l : h 

Db 685 ERGTEYNFRIAAMTVNGTGPATDWVSAETFESDLDESRVPEVPSSLHVRPL 735 



RESULT 6
NRCA_CHICK
ID NRCA_CHICK STANDARD; PRT; 1284 AA.
AC P35331;
DT 01-FEB-1994 (Rel. 28, Created)
DT 01-FEB-1994 (Rel. 28, Last sequence update)
DT 15-JUL-1999 (Rel. 38, Last annotation update)
DE NG-CAM RELATED CELL ADHESION MOLECULE PRECURSOR (NR-CAM) (BRAVO).
OS Gallus gallus (Chicken).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 
OC Archosauria; Aves; Neognathae; Galliformes; Phasianidae; Phasianinae; 
OC Gallus.
RN [1]
RP SEQUENCE FROM N.A., AND SEQUENCE OF 25-52; 178-184 AND 581-594.
RC STRAIN=WHITE LEGHORN; TISSUE=EMBRYONIC BRAIN;
RX MEDLINE=91258407; PubMed=2045418;
RA Grumet M., Mauro V., Burgoon M.P., Edelman G.M., Cunningham B.A.;
RT "Structure of a new nervous system glycoprotein, Nr-CAM, and its
RT relationship to subgroups of neural cell adhesion molecules.";
RL J. Cell Biol. 113:1399-1412(1991).
RN [2]
RP SEQUENCE OF 25-1284 FROM N.A., AND PARTIAL SEQUENCE.
RC TISSUE=EMBRYONIC BRAIN, AND RETINA;
RX MEDLINE=92381110; PubMed=1512296;
RA Kayyem J.F., Roman J.M., de la Rosa E.J., Schwarz U., Dreyer W.J.;
RT "Bravo/Nr-CAM is closely related to the cell adhesion molecules L1
RT and Ng-CAM and has a similar heterodimer structure.";
RL J. Cell Biol. 118:1259-1270(1992).

CC -!- FUNCTION: THIS PROTEIN IS A CELL ADHESION MOLECULE INVOLVED IN
CC NEURON-NEURON ADHESION, NEURITE FASCICULATION, OUTGROWTH OF
CC NEURITES, ETC. SPECIFICALLY INVOLVED IN THE DEVELOPMENT OF OPTIC
CC FIBRES IN THE RETINA.
CC -!- SUBUNIT: HETERODIMER, COMPOSED OF AN ALPHA AND A BETA CHAIN.
CC -!- SUBCELLULAR LOCATION: TYPE I MEMBRANE PROTEIN.
CC -!- ALTERNATIVE PRODUCTS: AT LEAST 5 ISOFORMS ARE PRODUCED BY
CC ALTERNATIVE SPLICING.
CC -!- TISSUE SPECIFICITY: RETINA AND DEVELOPING BRAIN.
CC -!- DEVELOPMENTAL STAGE: EXPRESSED IN DEVELOPING NEURAL RETINA AND
CC EMBRYONIC BRAIN TISSUE.
CC -!- SIMILARITY: CONTAINS 6 IMMUNOGLOBULIN-LIKE C2-TYPE DOMAINS.
CC -!- SIMILARITY: CONTAINS 5 FIBRONECTIN TYPE III-LIKE DOMAINS.
CC
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).

DR EMBL; X58482; CAA41391.1; -.
DR EMBL; L08960; AAA48632.1; -.
DR HSSP; P20241; 1CFB.
DR INTERPRO; IPR001777; -.
DR INTERPRO; IPR003006; -.
DR PFAM; PF00041; fn3; 5.
DR PFAM; PF00047; ig; 6.
DR PRINTS; PR00014; FNTYPEIII.
KW Immunoglobulin domain; Glycoprotein; Signal; Cell adhesion; Repeat;
KW Transmembrane; Alternative splicing.



FT   SIGNAL        1     24
FT   CHAIN        25   1284       NG-CAM RELATED CELL ADHESION MOLECULE.
FT   DOMAIN       25   1143       EXTRACELLULAR (POTENTIAL).
FT   TRANSMEM   1144   1166       POTENTIAL.
FT   DOMAIN     1167   1284       CYTOPLASMIC (POTENTIAL).
FT   DOMAIN       56    125       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      155    220       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      261    323       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      351    415       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      445    508       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      536    599       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      638    699       FIBRONECTIN TYPE-III.
FT   DOMAIN      738    799       FIBRONECTIN TYPE-III.
FT   DOMAIN      837    906       FIBRONECTIN TYPE-III.
FT   DOMAIN      943   1006       FIBRONECTIN TYPE-III.
FT   DOMAIN     1057   1114       FIBRONECTIN TYPE-III.
FT   DISULFID     63    118       POTENTIAL.
FT   DISULFID    162    213       POTENTIAL.
FT   DISULFID    268    316       POTENTIAL.
FT   DISULFID    358    408       POTENTIAL.
FT   DISULFID    452    501       POTENTIAL.
FT   DISULFID    543    592       POTENTIAL.
FT   CARBOHYD     78     78       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    218    218       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    290    290       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    409    409       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    483    483       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    576    576       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    581    581       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    595    595       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    692    692       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    778    778       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    834    834       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    885    885       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    969    969       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    985    985       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    995    995       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD   1048   1018       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD   1059   1059       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD   1091   1091       N-LINKED (GLCNAC...) (POTENTIAL).
FT   VARSPLIC    612    621       MISSING (IN ISOFORM AS10).
FT   VARSPLIC   1027   1038       MISSING (IN ISOFORM AS12).
FT   VARSPLIC   1039   1131       MISSING (IN ISOFORM AS93).
FT   VARSPLIC   1202   1205       MISSING (IN ISOFORM AS-CYT2).
FT   CONFLICT    209    209       V -> E (IN REF. 2).
FT   CONFLICT    680    680       H -> Q (IN REF. 2).
SQ   SEQUENCE 1284 AA; 141851 MW; A3570BF9C3D47A0F CRC64;

Query Match 7.6%; Score 555.5; DB 1; Length 1284;
Best Local Similarity 20.8%; Pred. No. 1.8e-22;
Matches 277; Conservative 179; Mismatches 515; Indels 359; Gaps 51;

Qy 8 EHPMDTTVPKNDPFTFNCQAEGNPTPTIQWFKDGRELKTDTGSHRIMLPAGGLFFLKVIH 67
: I I I : 1:1:1 I h I ::| I : I I I : :::
Db 46 QSPKDYIVDPRENIVIQCEAKGKPPPSFSWTRNGTHFDIDKDAQVTMKPNSGTLWNIMN 105

Qy 68 S-RRESDAGTYWCEAKNEFGVARSRNATLQV--AVLRDEFRLEPANTRVAQGEVALMECG 124
: h I J I 1:11 MM:: : | : :||| : | :|: :: I
Db 106 GVKAEAYEGVYQCTARNERGAAISNNIVIRPSRSPLWTKEKLEPNHVR-EGDSLVLNCR 163

Qy 125 APRGSPEPQISWRKNGQTLNLVGNKRIRIVDGGNLAIQEARQSDDGR--YQCW 176
I I II I', I I I ::h hi Mil
Db 164 PPVGLPPPIIFWMDNA-FQRLPQSERVSQGLNGDLYFSNV-QPEDTRVDYICYARFNHTQ 221

Qy 177 --KNWGTRESATAFLKVHVRPFLIRGP--QNQTAWGSSWFQCRIGGDPLPDVL 228
I : : :| I II :: I h : h :: :| II I :
Db 222 TIQQKQPISVKVFSTK--PVTERPPVLLTPMGSTSNKVELRGNVLLLECIAAGLPTPVIR 279

Qy 229 WRRTASGGNMPLRKFSWLHSASGRVHVLEDRSLKLDDVTLEDMGEYTCEADNAVGGITAT 288
I : II :| : : : ::lh lh Mill:
Db 280 WIK-EGGELPANRTFFENF KKTLK I IDVSE ADSGNYKCT ARNTLGST HHV 328

Qy 289 GILTVHAPPKFVIRPKNQLVEIGDEVLFECQANGHPRPTLYWSVEGNSSLLLPGYRDGRM 348
:l! 1 1 :: hi :: h: 1 : 1 1 1 : 1 : 1 : : II : 1 1
Db 329 ISVTVKAAPrwiTAPRNLVLSPGEDGTLICRANGMPSISWLTNGVPIAIAP--EDPSR 386

Qy 349 EVTLTPEGRSVLSIARFAREDSGKWTCNALNAVGSVSSRTWSVDTQFELPPPIIEQGP 408
:l ':l - 1 :l 1 1 III 1 1 : : hi : II h 1
Db 387 KV---DGDTIIFSA--VQERSSAVYQCNASNEYGYLLANAFVNVLAE--PPRILT--P 435

Qy 409 VNQTLPV-KSIWLPCRTLGTPVPQVSWYLDGIPIDVQEHERRNLSDAGALTISDLQRH 466
h 1 1 :: 1 hi h: h h : 1 1 1 1 h
Db 436 ANKLYQVIADSPALIDCAYFGSPKPEIEWF-RGVKGSILRGNEYVFHDNGTLEIPVAQK- 493

Qy 467 EDEGLYTCVASNRNGKSSWSGYLRLDTPT NPNIKFFRA PEL 507
: 1 1 1 1 1 M : 1 1 : 1 : 1 1 1 1 : II
Db 494 DSTGTYTCVARNKLGKTQNEVQLEVKDPTMIIKQPQYKVIQRSAQASFECVIKHDPTLIP 553

Qy 508 STY 510
II
Db 554 TVIWLKDNNELPDDERFLVGKDNLTIMNVTDKDDGTYTCIVNTTLDSVSASAVLTWAAP 613

Qy 511 PGPPGKPQMVEKGENSVTLSWTRSNKVGGSSLVGYVIEM-FGKNETDGWVA 560
1" II :: : | 1: III : I : :||| I :| 1
Db 614 PTPAIIYARPNPPLDLELTGQLERSIELSWVPGEE-NNSPITNFVIEYEDGLHEPGVW-H 671

Qy 561 VGTRVQNT-TFTQTGLLPGVNYFFLIRAENSHGLSLPSPMSEPITVGTRYFNSGLDLSEA 619
1 1 : 1' 1 1 1 III 1 : 1 1 1 1 II II :l : 1
Db 672 YQTEVPGSHTTVQLKLSPYVNYSFRVIAVNEIGRSQPSEPSE QYLTKSANPDE- 724

Qy 620 RASLLSGDWELSNASWDS-TSMKLTWQIINGKYVEG--- FYVYARQLPNPIVNNPA 673
II : 1 :lh : 1 1 Mil


Db 725 NPSNVQGIGSEPDNLVITWESLKGFQSNGPGLQYKVSWRQKDVDDEWTSV 774

Qy 674 PVTSNTNPLLGSTSTSASASASASAL--ISTKPNIAAAGKRDGE 715
| ; ; ; ; | | | | ; | ; | |
Db 775 WANVSKYIVSGTPTFVPYEIKVQALNDLGYAPEPSEVIGHSGEDLPMVAPGNVQVHVIN 834

Qy 716 -TNQSGGGAPTPL NTKYRMLTILNGGGASSCTIT 748
1 III ; : : : 1 1 | : :
Db 835 STLAKVHWDPVPLKSVRGHLQGYKVYYWRVQSLSRRSKRHVEKKILIF--RGNKTFGMLP 892

Qy 749 GLVQYTLYEFFIVPFYKSVEGKPSNSRIARTLEDVPSEAPYGMEALLLNSSAVFLKWKAP 808
II h 1: : II 1 :: :| 1 III 1 :: :: hi :|
Db 893 [sequence illegible in source] 951

Qy 809 ELKDRHGVLLNYHVIVRGIDTAHNFSRILTNVTIDAASPTLVLANLTEGVMYTVGVAAGN 868
Db 952 --THPNGVLTSYILKFQPINNTHELGP-LVEIRIPANESSLILKNLNYSTRYKFYFNAQT 1008

Qy [sequence illegible in source]
; | | ; , . | . | . . . | |
Db 1009 SVGSGSQITEEAVTIMDEAGILRPAVGAGKVQPLYPRIRNVTTAAAETYANISWEYEGPD 1068

Qy [sequence illegible in source]
| |
Db 1069 HANFYVEYGVAGSKEDWKKEIVNGSRSFFVLKGLTPGTAYKVRVGAEGLSGFRSSEDLFE 1128

Qy 902 DVLTQPWFIILLGAIIAVLMLSFGAMVFVKRKHMMMKQSALNTMRGNHTS 951
1 • 1 1 1 1 1 1 • 1 ■ 1 • 1 • 1 > 1 • • ! • 1
Db 1129 [sequence illegible in source] 1172

Qy 952 DVLKMPSLSARNGNGYWLDSSTGGMVWRPSPGGDSLEMQKDHIADYAPVCGAPGSPAGGG 1011
II : : 1 1 I: :: | |: |: :|:
Db 1173 ---KYPVKEKEDAHA---DPEIQPMKEDDGTFGEYRSLESD-AEDHKPLKKGSRTPSDRT 1225

Qy 1012 [sequence illegible in source] 1067
I : I I :| : I ::|:ll I hi
Db 1226 VKKEDSDDSLVDYGEGVNGQFNEDGS FIGQYS GKKEKEPAE- 1266

Qy 1068 GRHGNASPAP 1077
Db 1267 GNESSEAPSP 1276



RESULT 7
DCC_MOUSE
ID   DCC_MOUSE STANDARD; PRT; 1447 AA.
AC   P70211;
DT   01-NOV-1997 (Rel. 35, Created)
DT   01-NOV-1997 (Rel. 35, Last sequence update)
DT   30-MAY-2000 (Rel. 39, Last annotation update)
DE   TUMOR SUPPRESSOR PROTEIN DCC PRECURSOR.
GN   DCC.
OS   Mus musculus (Mouse).
OC   Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC   Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
RN   [1]
RP   SEQUENCE FROM N.A.
RC   STRAIN=BALB/C; TISSUE=BRAIN;
RX   MEDLINE=96112625; PubMed=8570174;
RA   Cooper H.M., Armes P., Britto J., Gad J., Wilks A.F.;
RT   "Cloning of the mouse homologue of the deleted in colorectal cancer
RT   gene (mDCC) and its expression in the developing mouse embryo.";
RL   Oncogene 11:2243-2254(1995).
RN   [2]
RP   REVISIONS.
RC   STRAIN=BALB/C; TISSUE=BRAIN;
RA   Cooper H.M.;
RL   Submitted (JUN-1996) to the EMBL/GenBank/DDBJ databases.
CC   -!- FUNCTION: IMPLICATED AS A TUMOR SUPPRESSOR GENE.
CC   -!- SUBCELLULAR LOCATION: TYPE I MEMBRANE PROTEIN.
CC   -!- ALTERNATIVE PRODUCTS: TWO FORMS OF THE PROTEIN ARE PRODUCED FROM
CC       THE SAME GENE BY THE USE OF ALTERNATIVE INITIATION SITES. A THIRD
CC       FORM WHICH IS EXPRESSED ONLY IN THE EMBRYO IS PRODUCED BY
CC       ALTERNATIVE SPLICING.
CC   -!- TISSUE SPECIFICITY: IN THE EMBRYO, EXPRESSED AT HIGH LEVELS IN THE
CC       DEVELOPING BRAIN AND NEURAL TUBE. IN ADULT, HIGHLY EXPRESSED IN
CC       BRAIN WITH VERY LOW LEVELS FOUND IN TESTIS, HEART AND THYMUS.
CC   -!- DEVELOPMENTAL STAGE: LOW LEVELS IN EARLY GESTATION. HIGHEST LEVELS
CC       EXPRESSED DURING MID GESTATION. LEVELS DECREASE IN LATE GESTATION
CC       AND REMAIN AT THIS LEVEL IN THE ADULT.
CC   -!- SIMILARITY: CONTAINS 4 IMMUNOGLOBULIN-LIKE C2-TYPE DOMAINS.
CC   -!- SIMILARITY: CONTAINS 6 FIBRONECTIN TYPE III-LIKE DOMAINS.
CC
CC   This SWISS-PROT entry is copyright. It is produced through a collaboration
CC   between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC   the European Bioinformatics Institute. There are no restrictions on its
CC   use by non-profit institutions as long as its content is in no way
CC   modified and this statement is not removed. Usage by and for commercial
CC   entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC   or send an email to license@isb-sib.ch).
CC
DR   EMBL; X85788; CAA59786.1; -.
DR   HSSP; P56276; 1TLK.
DR   MGD; MGI:94869; DCC.
DR   INTERPRO; IPR001777; -.
DR   INTERPRO; IPR003006; -.
DR   PFAM; PF00041; fn3; 6.
DR   PFAM; PF00047; ig; 4.
DR   PRINTS; PR00014; FNTYPEIII.
KW   Glycoprotein; Immunoglobulin domain; Transmembrane; Signal;
KW   Anti-oncogene; Alternative initiation; Alternative splicing.
FT   SIGNAL        1     25       POTENTIAL.
FT   CHAIN        26   1447       TUMOR SUPPRESSOR PROTEIN DCC, LONG
FT                                ISOFORM.
FT   CHAIN        85   1447       TUMOR SUPPRESSOR PROTEIN DCC, SHORT
FT                                ISOFORM.
FT   INIT_MET     85     85       FOR SHORT ISOFORM.
FT   DOMAIN       26   1097       EXTRACELLULAR (POTENTIAL).
FT   TRANSMEM   1098   1122       POTENTIAL.
FT   DOMAIN     1123   1447       CYTOPLASMIC (POTENTIAL).
FT   DOMAIN       54    124       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      154    219       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      254    317       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      345    407       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      426    522       FIBRONECTIN TYPE-III.
FT   DOMAIN      525    618       FIBRONECTIN TYPE-III.
FT   DOMAIN      619    716       FIBRONECTIN TYPE-III.
FT   DOMAIN      722    816       FIBRONECTIN TYPE-III.
FT   DOMAIN      840    940       FIBRONECTIN TYPE-III.
FT   DOMAIN      941   1042       FIBRONECTIN TYPE-III.
FT   DISULFID     61    117       BY SIMILARITY.
FT   DISULFID    161    212       BY SIMILARITY.
FT   DISULFID    261    310       BY SIMILARITY.
FT   DISULFID    352    400       BY SIMILARITY.
FT   CARBOHYD     60     60       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD     94     94       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    299    299       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    318    318       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    478    478       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    628    628       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    702    702       N-LINKED (GLCNAC...) (POTENTIAL).
FT   VARSPLIC    819    838       MISSING (IN EMBRYONIC ISOFORM).
SQ   SEQUENCE 1447 AA; 158298 MW; 0D1F1097C22D5B9F CRC64;

Query Match 7.6%; Score 552.5; DB 1; Length 1447;
Best Local Similarity 22.0%; Pred. No. 3.1e-22;
Matches 316; Conservative 172; Mismatches 532; Indels 415; Gaps 66;

Qy 7 IEHPMDTTVPKNDPFTFNCQAEGN-PTPTIQWFKDGRELKTDTGSHRIMLPAGGLFFLKV 65
: 1 : II II : 1 1:1 III 1 : II 1 1 :
Db 43 VSEPSDAVTMF.GGNVLLNCSAESDRGVPVIKWKKDGLILALGMDDRKQQLPNGSLLIQNI 102

Qy 66 IHSR-RESDAGTYWCEAK-NEFGVARSRNATLQVA-VLRDEFRLEPANTRVAQGEVALME 122
: hi 1 Ml : 1 II 1 : II II 1 : : 1: h:
Db 103 LHSRHHRPDEGLYQCEASLADSGSIISRTARVTVAGPLR--FLSQTESITAFMGDTVLLK 160

Qy 123 CGAPRGSPEPQISWRKNGQTLN-LVGNRRIRIVDGGNLAIQEARQSDDGRYQCWKNWG 181
! Ill H:|l I II I I: I: :: I I I : I I |:| :|
Db 161 CEV-IGEPMPTIHHQKNQQDLNPLPGDSRVWLPSGALQISRLQPGDSGVYRCSARNPAS 219

Qy 182 TRESATAFLKV HVRPFLIRGPONQIAWGSSWPQCRIGGDPLPDVLWRRTASG 235
I I ::: I : : : : M I : I I : I : I I I I I
Db 220 IRTGNEAEVRILSDPGLHRQLYFLQRPSNVIAIEGRDAVLECCVSGYPPPSFTWLRGEEV 279

Qy 236 GNMPLRRFSWLHSASGRVHVLEDRSLKLDDVTLEDMGEYTCEADNAVGGITATGILTVHA 295
: :hl :| :l : :H :l I III hi: III
Db 280 IQLRSKKYS LLGGSNLLISNVTDDDSGTYTCWTYKNENISASAELTVLV 329

Qy 296 PPKFVIRPKNQLVEIGDEVLFECQANGHPRPILYWSVEGNSSLLLPGYRDGRMEVTLTPE 355
III: II :: III :| I ||: I |: :::| ::
Db 330 PPWFLNHPSNLYAYESMDIEFECAVSGKPVPIVNWMKNGD-WIP-- SDYFQIV--- 380

Qy 356 GRSVLSIARFAREDSGKWTCNALNAVGSVSSRTWSVDTQFELPPPIIEQGPVNQTLPV 415
I I I I : I I II I I: I I :| I I I : II
Db 381 GGSNLRILGWKSDEG-FYQCVAENEAGNAQS SAQLIVPRPAI--PSSSILPS 430

Qy 416 KSIWLPCRTLGTPVPQVSWYLDGIPIDV QEHERR -- NLSDAGA 457
III : : ::|| I : :| : | | : |:
Db 431 APRDVLPV-LVSSRFVRLSW---RPPAEARGNIQTFTVFFSREGDNRERALNTTQPGSLQ 486

Qy 458 LTISDLQRHEDEGLYT--CVASNRNGRSSWSGYLRLDTPTNPNIRFFRAPELSTYPGPPG 515
II: :l : I :ll II I I :: II III III
Db 487 LTVGNL---RPEAMYTFRWAYNEWGPGE---SSQPIKVATQPELQV-PGPVE 532

Qy 516 RPQMVERGENSVTLSWTRSNKVGGSSLVGYVIEMF GRNEIDGWVAVGTRVQNTT 569
I I: ::| I : II :| II : I :
Db 533 NLHAVSTSPTSILITWEPPAYANG-PVQGY--RLFCTEVSTGREQ NIEVDGLS 582

Qy 570 FTQTGLLPGVNYFFLIRAENSHGLSLPSPMSEPITVGTRYFNSGLDLSEARASLLSGDW 629
: II , I I I :| I :: III I |: I :|
Db 583 YKLEGLKKFTEYTLRFLAYNRYG--PGVSTDDITWTL SDVPSAPPQNIS -- 630

Qy 630 ELSNASWDSTSMKLTW QIINGKYVEGFYV 659
11:1 l:|::| II :: |: :
Db 631 ---LEWNSRSIKVSWLPPPSGTQNG-FITGYKIRHRKTTMGEMETLEPNNLWYLFTG 685

Qy 660 YARQLPNPIVNNPAP---VTSNT--NPL--- LGSISTSASASASASALIST- 702
1:1:111 |: I I I : :| : :| :
Db 686 LEKGSQYSFQVSAMTVNGIGPPSNWYTAETPENDLDESQVPDQPSSLHVRPQTNCIIMSW 745

Qy 703 -- KPNIAAAGKRDG ET NQSGGG 722
III I I II 1:11
Db 746 TPPLNPNIWRGYIIGYGVGSPYAETVRVDSKQRYYSIERLESSSHYVISLKAFNNAGEG 805

Qy 723 AP 724
I
Db 806 VPLYESATTRSITDPTDPVDYYPLLDDFPTSGPDVSTPMLPPVGVQAVALTHEAVRVSWA 865

Qy 725 -- TPLNTR-- YRMLTI--LNGGGAS SCTITGLVQYTLYEFFIVPFYK 765
I I I I: I: II II III 1:111 ::
Db 866 DNSVPRNQRTSDVRLYTVRWRTSFSASAKYRSEDTTSLSYTATGLRPNTMYEFSVMVTKN 925

Qy 766 SVEGKPSNSRIARTLEDVPSEAPYGMEALLLNSS--AVFLKWRAPELKDRHGVLLNYHVI 823
I : I I I I: II : : II : h I : :| : I :
Db 926 RRSSTWSMTAHATTYEAAPTSAPRDLTVITREGRPRAVIVSWQPP-LEANGKITAYIL- 982

Qy 824 VRGIDTAHNFSRILTNV TIDAASPTLVLANLTEGVMYTVGVAAGNNAGVGPY 875
I : I: II I : :h II : I I II
Db 983 FYTLDRNIPIDDWIMETISGDRLTHQIMDLSLDTMYYFRIQARNVKGVGPL 1033

Qy 876 CVP -- ATLRLDPITKR LDPFINQRDHVND 902
I II::: I :| : I :|:
Db 1034 SDPILFRTLRVEHPDRMANDQGRHGDGGYWPVDTNLIDRSTLNEPPIGQMHPPHGSVTPQ 1093

Qy 903 VLTQPWFIILLGAILAVLMLSFGAMVFVKRRHMMMKQSALNTMRGNHTSDVLKM 956
1:1 :: :| :| l|:: |:: :| :: I |:
Db 1094 RNSNLLVIT---WTVG-VLTVLVWIVAVICTRRSSAQQRRK RATHSG 1138



Qy 957 PSLSARNGN GYWLDSSTGGM- - VWRPS - ■ - PGGDSLEMQKDHIADYAPVCGAP 1004 

Ml: I: | : :|: | | :| II : 

Db 1139 - - -SRRKGSQKDLRPPDLWIHHEEMEMXNIEKPTGTDPAGRDSPIQS- -CQDLTPVSHSO 1193 

Qy 1005 GSPAGGGTSSGGSG GAGSGASGGDDIHGGHGSERNQQRYVGEYSNIPTDYAEVSSF 1060 

11:11 III I : : I : I : :: III: 
Db 1194 SETQMGSRSASHSGQDTEDAGSSMSTLERSLMRRAIRAKLMIPMEAQS--SNPAWSAI 1251 

Qy 1061 GKAPSEYGRHGNASPAPYATSSILSPHQQQQQQQPRYQQRPVP GYGLQRPMH 1112 

I hi : II h: llll hi I 

Db 1252 PVPTLESAQYPGILPSP -- TCGYPH PQFTLRPVPFPTLSVDRGFGAGR-- 1297

Qy 1113 PHYQQQQHQQQQAQQTHQQHQALQQHQQLPPSNIYQQMSTTSEIYPTNTGPSRSV 1167 

I: : II llh I :|l h I h I 
Db 1298 - - -TQSVSEGPTTQQQPMLPPA- ■ -QPEHPSSEEAPSRTIPTACV 1336 



RESULT 8
DCC_HUMAN

ID DCC_HUMAN STANDARD; PRT; 1447 AA.

AC P43146;

DT 01-NOV-1995 (Rel. 32, Created)

DT 01-NOV-1995 (Rel. 32, Last sequence update)

DT 15-JUL-1999 (Rel. 38, Last annotation update)

DE TUMOR SUPPRESSOR PROTEIN DCC PRECURSOR (COLORECTAL CANCER SUPPRESSOR).

GN DCC.

OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.

RN [1]

RP SEQUENCE FROM N.A.

RX MEDLINE=95011532; PubMed=7926722;

RA Hedrick L., Cho K.R., Fearon E.R., Wu T.-C., Kinzler K.W.,

RA Vogelstein B.;

RT "The DCC gene product in cellular differentiation and colorectal

RT tumorigenesis.";

RL Genes Dev. 8:1174-1183(1994).

RN [2]

RP SEQUENCE OF 1-750 FROM N.A.

RX MEDLINE=90100559; PubMed=2294591;

RA Fearon E.R., Cho K.R., Nigro J.M., Kern S.E., Simons J.W.,

RA Ruppert J.M., Hamilton S.R., Preisinger A.C., Thomas G., Kinzler K.W.,

RA Vogelstein B.;

RT "Identification of a chromosome 18q gene that is altered in

RT colorectal cancers.";

RL Science 247:49-56(1990).

RN [3]

RP SEQUENCE OF 1075-472 FROM N.A. (SCRAMBLED EXONS).

RX MEDLINE=91121517; PubMed=1991322;

RA Nigro J.M., Cho K.R., Fearon E.R., Kern S.E., Ruppert J.M.,

RA Oliner J.D., Kinzler K.W., Vogelstein B.;

RT "Scrambled exons.";

RL Cell 64:607-613(1991).

RN [4]

RP GENE STRUCTURE, AND VARIANTS CARCINOMA HIS-1375.

RX MEDLINE=94245241; PubMed=8188295;

RA Cho K.R., Oliner J.D., Simons J.W., Hedrick L., Fearon E.R.,

RA Preisinger A.C., Hedge P., Silverman G.A., Vogelstein B.;

RT "The DCC gene: structural analysis and mutations in colorectal

RT carcinomas.";

RL Genomics 19:525-531(1994).

RN [5]

RP VARIANT CARCINOMA THR-168, AND VARIANT GLY-201.

RX MEDLINE=94243823; PubMed=8187090;

RA Miyake S., Nagai K., Yoshino K., Oto M., Endo M., Yuasa Y.;

RT "Point mutations and allelic deletion of tumor suppressor gene DCC in

RT human esophageal squamous cell carcinomas and their relation to

RT metastasis.";

RL Cancer Res. 54:3007-3010(1994).

CC -!- FUNCTION: IMPLICATED AS A TUMOR SUPPRESSOR GENE.

CC -!- SUBCELLULAR LOCATION: TYPE I MEMBRANE PROTEIN.

CC   -!- TISSUE SPECIFICITY: FOUND IN AXONS OF THE CENTRAL AND PERIPHERAL
CC       NERVOUS SYSTEM AND IN DIFFERENTIATED CELL TYPES OF THE INTESTINE.
CC   -!- DISEASE: COLORECTAL TUMORS THAT LOST THEIR CAPACITY TO
CC       DIFFERENTIATE INTO MUCUS PRODUCING CELLS UNIFORMLY LACK DCC
CC       EXPRESSION. INACTIVATION OF DCC DUE TO ALLELIC DELETION AND/OR
CC       POINT MUTATIONS MAY CAUSE BOTH LYMPHATIC AND HEMATOGENOUS
CC       METASTASIS OF OESOPHAGEAL SQUAMOUS CELL CARCINOMAS.
CC   -!- SIMILARITY: CONTAINS 4 IMMUNOGLOBULIN-LIKE C2-TYPE DOMAINS.
CC   -!- SIMILARITY: CONTAINS 6 FIBRONECTIN TYPE III-LIKE DOMAINS.
CC
CC   This SWISS-PROT entry is copyright. It is produced through a collaboration
CC   between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC   the European Bioinformatics Institute. There are no restrictions on its
CC   use by non-profit institutions as long as its content is in no way
CC   modified and this statement is not removed. Usage by and for commercial
CC   entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC   or send an email to license@isb-sib.ch).
CC
DR   EMBL; X76132; CAA53735.1
DR   EMBL; M32292; AAA35751.1
DR   EMBL; M32286; AAA52174.1
DR   EMBL; M32288; AAA52175.1; ALT_SEQ.
DR   EMBL; M32290; AAA52176.1
DR   EMBL; M63696; AAA52177.1
DR   EMBL; M63700; AAA52178.1
DR   EMBL; M63702; AAA52179.1
DR   EMBL; M63718; AAA52180.1
DR   EMBL; M63698; AAA52181.1
DR   PIR; A54100; A54100.
DR   PIR; A40098; A40098.
DR   PIR; A38442; A38442.
DR   HSSP; P56276; 1TLK.
DR   MIM; 120470; -.
DR   INTERPRO; IPR001777; -.
DR   INTERPRO; IPR003006; -.
DR   PFAM; PF00041; fn3; 6.
DR   PFAM; PF00047; ig; 4.
DR   PRINTS; PR00014; FNTYPEIII.
KW   Glycoprotein; Immunoglobulin domain; Transmembrane; Signal;
KW   Anti-oncogene; Disease mutation; Polymorphism.
FT   SIGNAL        1     25       POTENTIAL.
FT   CHAIN        26   1447       TUMOR SUPPRESSOR PROTEIN DCC.
FT   DOMAIN       26   1097       EXTRACELLULAR (POTENTIAL).
FT   TRANSMEM   1098   1122       POTENTIAL.
FT   DOMAIN     1123   1447       CYTOPLASMIC (POTENTIAL).
FT   DOMAIN       54    124       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      154    219       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      254    317       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      345    407       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      426    522       FIBRONECTIN TYPE-III.
FT   DOMAIN      525    618       FIBRONECTIN TYPE-III.
FT   DOMAIN      619    716       FIBRONECTIN TYPE-III.
FT   DOMAIN      722    816       FIBRONECTIN TYPE-III.
FT   DOMAIN      840    940       FIBRONECTIN TYPE-III.
FT   DOMAIN      941   1042       FIBRONECTIN TYPE-III.
FT   DISULFID     61    117       BY SIMILARITY.
FT   DISULFID    161    212       BY SIMILARITY.
FT   DISULFID    261    310       BY SIMILARITY.
FT   DISULFID    352    400       BY SIMILARITY.
FT   CARBOHYD     94     94       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    299    299       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    318    318       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    478    478       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    628    628       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    702    702       N-LINKED (GLCNAC...) (POTENTIAL).
FT   VARIANT     168    168       M -> T (IN OESOPHAGEAL CARCINOMA).
FT                                /FTId=VAR_003909.
FT   VARIANT     201    201       R -> G.
FT                                /FTId=VAR_003910.
FT   VARIANT    1375   1375       P -> H (IN A COLORECTAL CARCINOMA).
FT                                /FTId=VAR_003911.
FT   CONFLICT    138    138       MISSING (IN REF. 3).
FT   CONFLICT    233    329       MISSING (IN REF. 3).
FT   CONFLICT    421    421       MISSING (IN REF. 3).
SQ   SEQUENCE 1447 AA; 158456 MW; 4A8612766ED0471F CRC64;



Query Match 7.6%; Score 551.5; DB 1; Length 1447;

Best Local Similarity 21.8%; Pred. No. 3.5e-22; 

Matches 312; Conservative 177; Mismatches 542; Indels 397; Gaps 

Qy 5 RIIEHPMDTTVPKNDPFTFNCQAEGN-PTPTIQWFRDGRELKTDTGSHRIMLPAGGLFFL 63
1:11 : :| II ; MM III I : I I I
Db 41 RFLSEPSDAVMGGNVLLDCSAESDRGVPVIKWKKDGIHIiALGMDERKQQLSNGSLLIQ 100

Qy 64 KVIHSR-RESDAGTYWCEAK-NEFGVARSRNATLQVA-VLRDEFRLEPANTRVAQGEVAL 120
::IH :-l I I 1 1 1 : I 1 1 I : I II I I : : |: I
Db 101 NILHSRHHKPDEGLYQCEASLGDSGSIISRTAKVAVAGPLR-FLSQTESVTAFMGDTVL 158

Qy 121 MECGAPRGSPEPQISWRKNGQTLN-LVGNKRIRIVDGGNLAIQEARQSDDGRYQCVVKNV 179
::| I .11 I hll I I : I: 1= :: I I I : M |:| :|
Db 159 LKCEV-IGEPMPTIHWQKNQQDLTPIPGDSRWVLPSGALQISRLQPGDIGIYRCSARNP 217

Qy 180 VGTRESATAFLKV HVRPFLIRGPQNQTAWGSSWFQCRIGGDPLPDVLWRRTA 233
:| I'::: | : : :: | | |: I I :| : I I I II
Db 218 ASSRTGNEAEVRILSDPGLHRQLYFLQRPSNWAIEGKDAVLECCVSGYPPPSFTWLRGE 277

Qy 234 SGGNMPLRKFSWLHSASGRVHVLEDRSLKLDDVTLEDMGEYTCEADNAVGGITATGILTV 293
: :hi :| :| : : 1 1 • : I I III hi: III
Db 278 EVIQLRSRKYS LLGGSNLLISNVTDDDSGMYTCWTYKNENISASAELTV 327

Qy 294 HAPPRFVIRPKNQLVEIGDEVLFECQANGHPRPTLYWSVEGNSSLLLPGYRDGRMEVTLT 353
II I: M III :| I II: I |: :::| ::
Db 328 LVPPWFLNHPSNLYAYESMDIEFECTVSGKPVPTVNWMKNGD-WIP--SDYFQIV-- 380

Qy 354 PEGRSVLSIARFAREDSGRWTCNALNAVGSVSSRTWSVDTQFELPPPIIEQGPVNQTL 413
I I I I : I I lllh: I :| I I :
Db 381 --GGSNLRILGWKSDEG-FYQCVAENEAGNAQT SAQLIVPKPAIPSSSVLPSA 431

Qy 414 PVKSIWLPCRTLGTPVPQVSWYLDGIPIDV QEHERR -- NLSDAGA 457
I : II , : : ::|| I : :| : | I : |:
Db 432 PRDWPVL VSSRFVRLSW---RPPAEAKGNIQTFTVFFSREGDNRERALNTTQPGS 484

Qy 458 --LTISDLQRHEDEGLYT--CVASNRNGKSSWSGYLRLDTPTNPNIKFFRAPELSTYPGP 513
II: : : I :|| I I I :: II III III
Db 485 LQLTVGNL--KPEAMYTFRWAYNEWGPGE SSQPIKVATQPELQV-PGP 530

Qy 514 PGKPQMVEKGENSVTLSWTRSNKVGGSSLVGYVIEMF GKNETDGWVAVGTRVQN 567
II I: ::| I : II :| II : I
Db 531 VENLQAVSTSPTSILITWEPPAYANG-PVQGY-RLFCTEVSTGKEQ NIEVDG 580

Qy 568 TTFTQTGLLPGVNYFFLIRAENSHGLSLPSPMSEPITVGTRYFNSGLDLSEARASLLSGD 627
:: II I I I :l I :: III I II: ::
Db 581 LSYKLEGLKKFTEYSLRFLAYNRYG---PGVSTDDITWT LSDVPSAPPQNV 629

Qy 628 WELSNASWDSTSMKLTW QIINGKYVEGFYV 659
:| IN I:|::| II :: h :
Db 630 SLE WHSRSIKVSWLPPPSGTQNG-FITGYKIRHRKTTRRGEMETLEPNNLWYLF 683

Qy 660 YARQLPNPIVNNPAP VTSNT --NPL LGSTSTSASASASASALIS 701
I; hill I: I II : : : :|
Db 684 TGLERGSQYSFQVSAMTVNGTGPPSNWYTAETPENDLDESQVPDQPSSLHVRPQTNCIIM 743

Qy 702 T KPNIAAAGKRDG ET NQSG 720
: III I I II I :l
Db 744 SWTPPLNPNIWRGYIIGYGVGSPYAETVRVDSKQRYYSIERLESSSHYVISLKAFNNAG 803

Qy 721 GGAP TPL 727
I I II:
Db 804 EGVPLYESATTRSITDPTDPVDYYPLLDDFPTSVPDLSTPMLPPVGVQAVALTHDAVRVS 863

Qy 728 NTKYRMLTI--LNGGGAS SCTITGLVQYTLYEFFIVPF 763
'< :: I: hll II III hill ::
Db 864 WADNSVPKNQKTSEVRLYTVRWRTSFSASAKYKSEDTTSLSYTATGLKPNTMYEFSVMVT 923

Qy 764 YKSVEGKPSNSRIARTLEDVPSEAPYGMEALLLNSS--AVFLKWKAPELKDRHGVLLNYH 821
I I' Nihil : 1 1 : h I : : I : I
Db 924 KNRRSSTWSMTAHATTYEAAPTSAPKDFTVITREGKPRAVIVSWQPP-LEANGKITAYI 981

Qy 822 VIVRG IDTAHNFSRILTNV TIDAASPTLVLANLTEGVMYTVGVAAGNNAGVG 873 

fc no : I : h II I : :| II : I |: III 

Db 982 L FYTLDKNIPIDDWIMETISGDRLTHQIMDLNLDTMYYFRIQARNSKGVG 1031 

Qy 874 PYCVP— ATLRLD PITKRL DPFINQRDHVNDVLT 905 

I I II::: I: I :| I I : :| 

Db 1032 PLSDPILFRTLKVEHPDRMANDQGRHGDGGYWPVDTNLIDRSTLNEPPIGQMHPPHGSVT 1091 

Qy 906 QP WFIILLGAILAVLMLSFGAMVFVKRKHMMMRQSALNTMRGNH-TSDVLKMP 957 



i * • - li:: I- :| | : |: | 

Db 1092 PQKNSNLLVIIWTVGVITVLVWIVAVICIRRSSAQQRKKRATHSAGKRKGSQKDLRPP 1151 

Qy 958 SLSARNGNGYWLDSSTGGM--VWRPS---PGGDSLEMQKDHIADYAPVCGAPGSPAGGGT 1012

I I' Ill: I 

Db 1152 DL WIHHEEMEMKNIEKPSGTDPAGRDSPIQS-CQDLTPVSHSQSETQLGSK 1201 

Qy 1013 SSGGSG--GAGSGASGGDDIHGGHGSERNQQRY-VGEYSNIPTDYAEVSSFGKAPSEY 1067
I: II III I : : I : : II I I II: I 
Db 1202 STSHSGQDTEEAGSSMSTLERSLAARRAPRAKLMIPMDAQSNNP - - - AWSAIPVPTLES 1258 

Qy 1068 GRHGNASPAPYATSSILSPHQQQQQQQPRYQQRPVP GYGLQRPMHPHYQQQQ 1119 

" 1 = 1 = II I" Mil |:| I 
Db 1259 AQYPGILPSP — TCGYPH PQFTLRPVPFPTLSVDRGFGAGR 1297 

Qy 1120 HQQQQAQQTHQQHQALQQHQQLPPSNIYQQMSTTSEIYPTNTGPSRSV 1167 

I: : II Mil I ::|| |: I |: I 
Db 1298 SQSVSEGPTTQQPPMLPPS— QPEHSSSEEAPSRTIPTACV 1336 



RESULT 9 
AXO1_HUMAN

ID AXO1_HUMAN STANDARD; PRT; 1040 AA.

AC Q02246;

DT 01-JUL-1993 (Rel. 26, Created)

DT 01-JUL-1993 (Rel. 26, Last sequence update)

DT 15-JUL-1999 (Rel. 38, Last annotation update)

DE AXONIN-1 PRECURSOR (AXONAL GLYCOPROTEIN TAG-1) (TRANSIENT AXONAL

DE GLYCOPROTEIN 1).

GN TAX1 OR TAG1.

OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.

RN [1]

RP SEQUENCE FROM N.A.

RC TISSUE=BRAIN;

RX MEDLINE=93145965; PubMed=8425542;

RA Hasler T.H., Rader C., Stoeckli E.T., Zuellig R.A., Sonderegger P.;

RT "cDNA cloning, structural features, and eucaryotic expression of

RT human TAG-1/axonin-1.";

RL Eur. J. Biochem. 211:329-339(1993).

RN [2]

RP SEQUENCE FROM N.A.

RC TISSUE=BRAIN;

RX MEDLINE=94140354; PubMed=8307567;

RA Tsiotra C.P., Karagogeos D., Theodorakis K., Michaelidis M.T.,

RA Modi U.S., Furley J.A., Jessel M.T., Papamatheakis J.;

RT "Isolation of the cDNA and chromosomal localization of the gene

RT (TAX1) encoding the human axonal glycoprotein TAG-1.";

RL Genomics 18:562-567(1993).

CC -!- FUNCTION: MAY PLAY A ROLE IN THE INITIAL GROWTH AND GUIDANCE OF

CC AXONS. MAY BE INVOLVED IN CELL ADHESION.

CC -!- SUBCELLULAR LOCATION: ATTACHED TO THE NEURONAL MEMBRANE BY A

CC GPI-ANCHOR AND IS ALSO RELEASED FROM NEURONS.

CC -!- SIMILARITY: CONTAINS 6 IMMUNOGLOBULIN-LIKE C2-TYPE DOMAINS.

CC -!- SIMILARITY: CONTAINS 4 FIBRONECTIN TYPE III-LIKE DOMAINS.

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation - 

CC the European Bioinformatics Institute. There are no restrictions on its 

CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 



CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; X68274; CAA48335.1; -. 

DR EMBL; X67734; CAA47963.1; -. 

DR PIR; S28830; S28830. 

DR MIM; 190197; -. 

DR INTERPRO; IPR001777; -.

DR INTERPRO; IPR003006; -. 

DR PFAM; PF00041; fn3; 4. 

DR PFAM; PF00047; ig; 6. 

KW Immunoglobulin domain; Glycoprotein; Signal; GPI-anchor; 

KW Cell adhesion; Repeat. 



FT   SIGNAL        1     28
FT   CHAIN        29   1012       AXONIN-1.
FT   PROPEP     1013   1040       REMOVED IN MATURE FORM (POTENTIAL).
FT   DOMAIN       54    118       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      148    216       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      254    313       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      341    402       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      433    495       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      523    594       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      606    612       GLY/PRO-RICH.
FT   DOMAIN      611    706       FIBRONECTIN TYPE-III.
FT   DOMAIN      714    809       FIBRONECTIN TYPE-III.
FT   DOMAIN      816    908       FIBRONECTIN TYPE-III.
FT   DOMAIN      917   1003       FIBRONECTIN TYPE-III.
FT   SITE        794    796       CELL ATTACHMENT SITE (BY SIMILARITY).
FT   CARBOHYD     76     76       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    198    198       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    204    204       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    461    461       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    477    477       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    498    498       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    525    525       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    830    830       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    918    918       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    940    940       N-LINKED (GLCNAC...) (POTENTIAL).
FT   LIPID      1012   1012       GPI-ANCHOR (POTENTIAL).
SQ   SEQUENCE 1040 AA; 113393 MW; 254E78DD3C28EFB6 CRC64;



Query Match 7.5%; Score 544.5; DB 1; Length 1040;

Best Local Similarity 21.8%; Pred. No. 5.3e-22;

Matches 253; Conservative 162; Mismatches 408; Indels 335; Gaps 49; 

Qy 4 PRIIEHPMDTTVPK--NDPFTFNCQAEGNPTPTIQWFKDGRELKTDTGSHRIMLPAGGL 60 



Db 37 PVFEDQPLSVLFPEESTEEQVLLACRARASPPATYRWKMNGTEMKLEPGS-RHQLVGGNL 95 

Qy 61 FFLKVIHSRRESDAGTYWCEAKNEFGVARSRNATLQVAVLRDEFRLEPANTRVAQGEVAL 120 

::: Oil I I I I I 1 1 I I : I : : : I : :| : 
Db 96 V- - - IMNPTKAQDAGVYQCLASNPVGTWSREAILRFGFLQEFSKEERDPVKAHEGWGVM 152 

Qy 121 MECGAPRGSPEPQISWRKNGQTLNLVGNKRIRIVD-GGNLAIQEARQSDDGRYQCWKN 178 

: I I !•• I I : I : I III I Mill:: 
Db 153 LPCNPPAHY PGLSY RWLiLN- EFP NF I PTDGRHFVSQT TGNLYI ART NASDLGNYSCLATS 211 

Qy 179 W--GTRESATAFLKVHV RPFL IRGPQNQTAWGSSWFQCRIGGDPLPDV 227 

: I: : I :::: II II hll I :| |:|:| : 
Db 212 HMDFSTKSVFSKFAQLNLAAEDTRLFAPSIKARFPAETYALVGQQVTLECFAFGNPVPRI 271 

Qy 228 LWRRTASGGNMPLRKFSWLHSASGRVHVLEDRSLKLDDVTLEDMGEYTCEADNAVGGITA 287 

II: I I I : :h: Mill l||:|: I I 

Db 272 KWRK-VDGSLSP— -QW TTAEPTLQIPSVSFEDEGTYECEAENSKGRDTV 317 

( 

Qy 288 TGILTVHAPPXFVIRPKNQLVEIGDEVLFECQANGHPRPTLYWSVEGNSSLLLPGYRDGR 347 

I : I I : =11 : : I I I MM: II II 
Db 318 QGRIIVQAQPEWLKVISDTEADIGSNLRWGCAAAGKPRPTVRWLRNGE PLASQNR 372 

Qy 348 MEVTLTPEGR5VLSIARFAREDSGKWTCNALNAVGSVSSRTWSVDTQFELPPPIIEQG 407 

:H ■ I :: : Nil : I I I |:: I : : I 
Db 373 VEVL A'JDLRFSKLSLEDSG -MYQCVAENKHGT I YASAELAVQALAPDFRLN 422 
Qy 408 PVNQTLPVK--SIWLPCRTLGTPVPQVSWYLDGIPIDVQEHERRNLSDAGALTISDLQR 465

II = :l -HI: I I I I I I I :: I II :: I 
Db 423 PVRRLIPAAKGGEILIPCQPRMPKAWLW-SKGTEILVNS-SRVTVTPDGTLIIRNISR 480 

Qy 466 HEDEGLYTCVASNRNGKSSWSGYLRLDTPTNPNIKFFRAPELSTYPGPPGKPQMVEKGEN 525

III III I I II:: :| I : I I II : : |:| 

Db 481 - SDEGKYTCFAENFMGKANSTGILSVRDAT KITLAPSSAD INLGDN 525 

Qy 526 SVTLSWTRS NKVGGSSLVGYVIEMFGKNETDGWVAVGTRVQ 566 

:| :ll :l II 

Db 526 LTLCCHASHDPTMDLTFTWTLDDFPIDFDKPGGH 559 

Qy 567 NTTFTQTGLLPGVNYFFLIRAENSHGLSLPSPMSEPITVGTRYFNSGLDLSEARASLL" 624 

: :| : : :: |: II : |:: : :| : |::| 

Db 560 • • - YRRTNVKETIGDLTILNAQLRHG -GKYTCMAQTV VDSASKEATVLVR 605 

Qy 625 SGDWELSNASWDSTSMKLTWQ ■ ■ I INGKYVEGFYVYARQLP NPIVNNPA 673 

I II : l:::|:| I : : : II I : III 
Db 606 GPPGPPGGW VRDIGDTTIQLSWSRGFDNHSPIAKYTLQARTPPAGKWKQVRTNPA 661 

Qy 674 PVTSN--TNPLLG STSTSASASASASALISTK---PNIAAAGKRDGE 715

: I I :ll I : I: I |: |::| :|

Db 662 NIEGNABTAQVLGLTPWMDYEFRVIASNILGTGEPSGPSSKIRTREAAPSVAPSGL -- 717

Qy 716 TNQSGGGAP TPLNTKYRMLTILNGGGA SSCTITG 749 

Mill, II:: :|: II I : : I 

Db 718 - -SGGGGAP,GELIVNWTPMSREYQ NGDGFGYLLSFRRQGSTHWQTARVPGADAQY 770 

Qy 750 LVQYTLYEFFIVPFYKSVEGKPSNSRIARTLEDVPSEAPYGMEALLLNSSAVF 802 

: II :| I : : :| I : : : I: I \\\ : I ::|| : 
Db 771 FVYSNESVRPYTPFEVKIRSYNRRGDGPESLTALVYSAEEEPRVAPTKVWAKGVSSSEMN 830 

Qy 803 LKWKAPELKDRHGVLLNY HVIVRGIDTAHNFSRILTNVTIDAASP 847 

: I: I :l :hll I I l:||: I : I 
Db 831 VTWE - PVQQDMNG ILLGYEIRYWKAGDKEAAADRVRT AGLDTSARVSGLHPN 881 



Qy 848 TLVLANLTEGVMYTVGVAAGNNAGVGPYCVPA--TLRLDP--ITKR 889

I I I I I II II I I:: I :: : 

Db 882 TKYHVTVRAYNRAGTGPASPSANATTMKPPPRRPPGNISWTFSSSSLSIK 931 

Qy 890 LDPFINQRDHVNDVLTQPWFIILLGAILAVLMLSFGAMVFVKRKHMMMKQSALNTMRGNH 949

II : I: :!l: : : 

Db 932 WDPWPFRN ESAVTGYKMLY 951 

Qy 950 TSDVLKMPSLSARNGNGYWLD SSTGGMVWRPSPGGDSLEMQKDHIA 995 

:h 1:1 I I:: :|| llll : : II 

Db 952 QNDLHLTPTLHLTGKN--WIEIPVPEDIGHALVQIRTTG PGGDGIPAEV-HIV 1001 



Qy 996 DYAPVCGAPGSPAGGGTS 1013
||||

Db 1002 RHGGTS 1007



RESULT 10 
AXO1_RAT
ID AXO1_RAT STANDARD; PRT; 1040 AA.
AC P22063;
DT 01-AUG-1991 (Rel. 19, Created)
DT 01-AUG-1991 (Rel. 19, Last sequence update)
DT 15-JUL-1999 (Rel. 38, Last annotation update)
DE AXONIN-1 PRECURSOR (AXONAL GLYCOPROTEIN TAG-1).
GN TAX1.
OS Rattus norvegicus (Rat).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Rattus.
RN [1]
RP SEQUENCE FROM N.A., AND SEQUENCE OF 31-41.
RC TISSUE=SPINAL CORD;
RX MEDLINE=90199890; PubMed=2317872;
RA Furley A.J., Morton S.B., Manalo D., Karagogeos D., Dodd J.,
RA Jessell T.M.;
RT "The axonal glycoprotein TAG-1 is an immunoglobulin superfamily
RT member with neurite outgrowth-promoting activity.";
RL Cell 61:157-170(1990).
CC -!- FUNCTION: MAY PLAY A ROLE IN THE INITIAL GROWTH AND GUIDANCE OF
CC AXONS. MAY BE INVOLVED IN CELL ADHESION.
CC -!- SUBCELLULAR LOCATION: ATTACHED TO THE NEURONAL MEMBRANE BY A
CC GPI-ANCHOR AND IS ALSO RELEASED FROM NEURONS.
CC -!- TISSUE SPECIFICITY: IN NEURAL TISSUES IN EMBRYOS, AND IN ADULT
CC BRAIN, SPINAL CORD AND CEREBELLUM.
CC -!- DEVELOPMENTAL STAGE: TRANSIENTLY EXPRESSED ON A SUBSET OF AXONS
CC IN THE DEVELOPING RAT NERVOUS SYSTEM.
CC -!- SIMILARITY: CONTAINS 6 IMMUNOGLOBULIN-LIKE C2-TYPE DOMAINS.
CC -!- SIMILARITY: CONTAINS 4 FIBRONECTIN TYPE III-LIKE DOMAINS.
CC --------------------------------------------------------------------------
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC --------------------------------------------------------------------------
DR EMBL; M31725; AAA42201.1; -.
DR PIR; A34695; A34695.
DR INTERPRO; IPR001777; -.
DR INTERPRO; IPR003006; -.
DR PFAM; PF00041; fn3; 4.
DR PFAM; PF00047; ig; 6.
KW Immunoglobulin domain; Glycoprotein; Signal; GPI-anchor;
KW Cell adhesion; Repeat.



FT   SIGNAL        1     30
FT   CHAIN        31  ?1015       AXONIN-1.
FT   PROPEP    ?1016   1040       REMOVED IN MATURE FORM.
FT   DOMAIN       56    120       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      150    218       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      256    315       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      343    404       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      435    497       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      525    596       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      608    614       GLY/PRO-RICH.
FT   DOMAIN      613    708       FIBRONECTIN TYPE-III.
FT   DOMAIN      716    811       FIBRONECTIN TYPE-III.
FT   DOMAIN      818    910       FIBRONECTIN TYPE-III.
FT   DOMAIN      911   1005       FIBRONECTIN TYPE-III.
FT   SITE        796    798       CELL ATTACHMENT SITE (POTENTIAL).
FT   CARBOHYD     78     78       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    200    200       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    206    206       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    463    463       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    479    479       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    500    500       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    527    527       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    777    777       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    832    832       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    920    920       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    942    942       N-LINKED (GLCNAC...) (POTENTIAL).
SQ   SEQUENCE   1040 AA;  113042 MW;  6E707EF6614CB4FB CRC64;



Query Match 7.4%; Score 540.5; DB 1; Length 1040; 

Best Local Similarity 22.8%; Pred. No. 8.7e-22; 

Matches 261; Conservative 160; Mismatches 442; Indels 283; Gaps 51; 

Qy 4 PRIIEHPMDTTVPK---NDPFTFNCQAEGNPTPTIQWFKDGRELKTDTGS-HRIMLPAGG 59

I I I: 'I: II 1:1 :| I :| :| :: : || |::| I 
Db 39 PIFEEQPIGLLFPEESAEDQVTLACRARASPPATYRWKMNGTDMNLEPGSRHQLM--GG 95 

Qy 60 LFFLKVIHSRRESDAGTYWCEMNEFGVARSRNATLQVAVLRDEFRLEPANTRVAQGEVA 119 

I : III I I I I I I: I I: h: : : :| 
Db 96 - • NLVIMSPTKTQDAGVYQCLASNPVGTWSKEAVLRFGFLQEFSKEERDPVKTHEGWGV 153 

Qy 120 LMECGAPRGSPEPQISWRKNGQTLNLVGNKRIRIVD--GGNLAIQEARQSDDGRYQCWK 177 

:: I I I. I I : I : I III I HIM: 
Db 154 MLPCNPPAHYPGLSYRWLLN-EFPNFIPTDGRHFVSQTTGNLYIARTNASDLGNYSCLAT 212 



178 NW--GTRESATAFLKVHVR---PFLI RGPQNQTAWGSSWFQCRIGGDPLPD 226

: : I: =1 :::: II, II |:|| I :| |:|:| 
213 SHMDFSTKSVFSKFAQLNLAAEDPRLFAPSIRARFPPETYALVGQOVTLECFAFGNPVPR 272 

227 VLWRRTASGGNMPLRKFSWLHSASGRVHVLEDRSLKLDDVTLEDMGEYTCFJtfNAVGGIT 286 

Ml: I I I ' : :|:: |: || | | III:!: I I 
273 IKWRK-VDGSLSP— -QW AIAEPILQIPSVSFEDEGTYECEAENSKGRDT 318 

287 ATGILTVHAPPKFVIRPKNQLVEIGDEVLFECQANGHPRPTLYWSVEGNSSLLLPGYRDG 346 

I : I I I::: : :|| : : I II III : I I I 
319 VOGRIIVQAQPEWLKVISDTEADIGSNLRWGCAAAGKPRPMVRWLRNGE PLASQN 373 

347 RMEVTLTPEGRSVLSIARFAREDSGKVVTCNALNAVGSVSSRTVVSVDTQFELPPPIIEQ 406

Ml H : Mil : I I I I:: I : : I I 
374 RVEVL AGDLRFSKLSLEDSG -MYQCVAENKHGTI — YASAELAVQALAPDFRQ 423 



407 GPVNQTLPVK---SIWLPCRTLGTPVPQVSWYLDGIPIDVQEHERRNLSDAGALTISDL 463

II : :l I :| I III: I :: I I I

424 NPVRRLIPAARGGEISIL-CQPRAAI5ATILW-SKGTEI-LGNSTRVTVTSDGTLIIRNI 480

464 QRHEDEGLYTCVASNRNGKSSWSGYLRLDTPTNPNIKFFRAPELSTYPGPPGKPQMVEKG 523

I III III I I II:: :| I I II

481 SR-SDEGKYTCFAENFMGKANSTGIlfcVRDAT KITLAPSSAD-- --INVG 525



524 EN SVTLSWTRS NKVGGSSLVGYVIEMFGKNETDGWVAVGTR 564 

'•I :| :ll T :| II II I : : 

526 DNLTLQCHASHDPTMDLTFTWTLDDFPIDFDKPGGH YRRASAKET I6DLT I — 576 

565 VQNTTFTQTGLLPGVNYFFLIRAENSHGLSLPSPMSEPITVGTRYFNSGLDLSEARASLL 624 

: I II : h: : II |: |: 

577 LNAHVRHG -GKYTCMAQTWDGT SKEATVLV 606 

625 SGDWELSNASWD ■ - STSMKLTWQ ■ • I INGKYVEGFYVYARQLPN PIVNNPAPVT 676 

I I I h::hl I : : : III: : II : 

607 RGPPGPPGGVWRDIGDTIVQLSWSRGFDNHSPIAKYTLQARTPPSGKWKQVRTNPVNIE 666 

677 SN- -TNPLLG STSTSASASASASALISTK- "PNIAAAGKRDGETNQ 718 

I I :M I : I: I II |::| :| 

667 GNAETAQVLGLMPWMDYEFRVSASNILGTGEPSGPSSKIRTKEAVPSVAPSGL S 720 

719 SGGGAP TPLNTKYR MLTILNGGGAS- -SCTITG L 750 

lllll II:: :|: :|: I :| : : I : 
721 GGGGAPGELIINWTPVSREYQNGDGFGYLLSFRRQGSSSWQTARVPGADAQYFVYGNDSI 780 

751 VQYTLYEFFIVPFYKSVEGKPSNSRIARTLEDVPSFAPYGMEALLLNSSAVFLKWKAPEL 810 

II :l I : : :| I : : : I: I II : I :|| : : h I I 
781 QPYTPFEVKIRSYNRRGDGPESLTALVYSAEEEPRVAPAKVWAKGSSSSEMNVSWE-PVL 839 

811 KDRHGVLLNY HVIVRGIDTAHNFSRILTNVTIDAASPTLVLANLT 855 

:| :|:M I I |:||: : 
840 QDMNGILLGYEIRYWKAGDNEAAADRVRTAGLDTSAR VTGLN 881 

856 EGVMYTVGVAAGNNAGVGPYCVPATLRLDPITKRLDPFINQRDHVNDVLTQPWFIILLGA 915 

I I I I I II I II: I :| : :| | 

882 PNTKYHVTVRAYNRAGTG PASPSADAMTVKPPP RRPPGNISWTF 925 

916 ILAVLMLSFGAMVFVKRKHMMMKQSALNTMRGNHTSDVLKMPSLSARNGNGYWLD 970 

: I I : :| : :| : : : :|: |:| : | |:: 
925 SSSSLSLKWDPW PLRNESTVTGYKMLYQNDLHPT PTLHLTSKN - - WIE IPVPE 977 

971 SSTGGMVWRPSPGGDSLEMQKDHIADYAPVCGAPGSPAGGGTSSGGSGGAG 1021 

:|l Mil : : II III I 

978 DIGHALVQIRTTG PGGDGIPAEV-HIV RNGGTSMMVESAAA 1017 



Qy 1022 SGASGG 1027 
I I 

Db 1018 RPAHPG 1023 
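
The "Best Local Similarity" figure printed in each result summary appears to equal Matches divided by the sum of Matches, Conservative, Mismatches and Indels. This is an observation that holds for the summaries in this listing (for example 261 / (261 + 160 + 442 + 283) = 22.8% for RESULT 10), not a documented definition of the search program. A minimal Python sketch for re-checking a summary line follows; the function names `parse_counts` and `local_similarity` are hypothetical helpers introduced only for illustration.

```python
import re


def parse_counts(line):
    """Parse a summary line such as 'Matches 261; Conservative 160; ...'
    into a dict mapping each label to its integer count."""
    return {key: int(val) for key, val in re.findall(r"([A-Za-z]+)\s+(\d+)\s*;", line)}


def local_similarity(counts):
    """Percentage of alignment columns that are exact matches.

    Assumption: the number of alignment columns is
    Matches + Conservative + Mismatches + Indels.
    """
    columns = sum(counts[k] for k in ("Matches", "Conservative", "Mismatches", "Indels"))
    return 100.0 * counts["Matches"] / columns


# Summary line copied from RESULT 10 in this listing.
line = "Matches 261; Conservative 160; Mismatches 442; Indels 283; Gaps 51;"
print(round(local_similarity(parse_counts(line)), 1))
```

A recomputed value that disagrees with the printed percentage is a hint that one of the counts was damaged in transcription.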



RESULT 11 
DSCA_HUMAN
ID DSCA_HUMAN STANDARD; PRT; 2012 AA.
AC O60469; O60468;
DT 01-OCT-2000 (Rel. 40, Created)
DT 01-OCT-2000 (Rel. 40, Last sequence update)
DT 01-OCT-2000 (Rel. 40, Last annotation update)
DE DOWN SYNDROME CELL ADHESION MOLECULE PRECURSOR (CHD2).
GN DSCAM.
OS Homo sapiens (Human).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
RN [1]
RP SEQUENCE FROM N.A., AND ALTERNATIVE SPLICING.
RC TISSUE=BRAIN;
RX MEDLINE=98087574; PubMed=9426258;
RA Yamakawa K., Huot Y.-K., Haendelt M.A., Hubert R., Chen X.-N.,
RA Lyons G.E., Korenberg J.R.;
RT "DSCAM: a novel member of the immunoglobulin superfamily maps in a
RT Down syndrome region and is involved in the development of the
RT nervous system.";
RL Hum. Mol. Genet. 7:227-237(1998).
RN [2]
RP SEQUENCE FROM N.A.
RX MEDLINE=20289799; PubMed=10830953;
RA Hattori M., Fujiyama A., Taylor T.D., Watanabe H., Yada T.,
RA Park H.-S., Toyoda A., Ishii K., Totoki Y., Choi D.-K., Soeda E.,
RA Ohki M., Takagi T., Sakaki Y., Taudien S., Blechschmidt K., Polley A.,
RA Menzel D., Delabar J., Kumpf K., Lehmann R., Patterson D.,
RA Reichwald K., Rump A., Schillhabel M., Schudy A., Zimmermann W.,
RA Rosenthal A., Kudoh J., Shibuya K., Kawasaki K., Asakawa S.,
RA Shintani A., Sasaki T., Nagamine K., Mitsuyama S., Antonarakis S.E.,
RA Minoshima S., Shimizu N., Nordsiek G., Hornischer K., Brandt P.,
RA Scharfe M., Schoen O., Desario A., Reichelt J., Kauer G., Bloecker H.,
RA Ramser J., Beck A., Klages S., Hennig S., Riesselmann L., Dagand E.,
RA Wehrmeyer S., Borzym K., Gardiner K., Nizetic D., Francis F.,
RA Lehrach H., Reinhardt R., Yaspo M.-L.;
RT "The DNA sequence of human chromosome 21.";
RL Nature 405:311-319(2000).
RN [3]
RP FUNCTION.
RA Agarwala K.L., Nakamura S., Tsutsumi Y., Yamakawa K.;
RT "Down syndrome cell adhesion molecule DSCAM mediates homophilic
RT intercellular adhesion.";
RL Brain Res. Mol. Brain Res. 79:118-126(2000).
CC -!- FUNCTION: CELL ADHESION MOLECULE THAT CAN MEDIATE CATION-
CC INDEPENDENT HOMOPHILIC BINDING ACTIVITY. COULD BE INVOLVED IN
CC NERVOUS SYSTEM DEVELOPMENT.
CC -!- SUBCELLULAR LOCATION: TYPE I MEMBRANE PROTEIN (PROBABLE). THE
CC SHORT ISOFORM MAY BE SECRETED.
CC -!- ALTERNATIVE PRODUCTS: 2 ISOFORMS; A LONG FORM/CHD2-52 (SHOWN HERE)
CC AND A SHORT FORM/CHD2-42; ARE PRODUCED BY ALTERNATIVE SPLICING.
CC -!- TISSUE SPECIFICITY: PRIMARILY EXPRESSED IN BRAIN.
CC -!- SIMILARITY: CONTAINS 10 IMMUNOGLOBULIN-LIKE C2-TYPE DOMAINS.
CC -!- SIMILARITY: CONTAINS 6 FIBRONECTIN TYPE III-LIKE DOMAINS.

CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).

DR EMBL; AF023450; AAC17967.1; -.
DR EMBL; AF023449; AAC17966.1; -.
DR EMBL; AL163283; CAB90464.1; -.
DR EMBL; AL163282; CAB90436.1; -.
DR EMBL; AL163281; CAB90444.1; -.
DR MIM; 602523; -.
DR INTERPRO; IPR001777; -.
DR INTERPRO; IPR003006; -.
DR PFAM; PF00041; fn3; 6.
DR PFAM; PF00047; ig; 9.
DR PRINTS; PR00014; FNTYPEIII.
KW Immunoglobulin domain; Glycoprotein; Signal; Cell adhesion; Repeat;
KW Transmembrane; Alternative splicing.



FT   SIGNAL        1     17       POTENTIAL.
FT   CHAIN        18   2012       DOWN SYNDROME CELL ADHESION MOLECULE.
FT   DOMAIN       18   1595       EXTRACELLULAR (POTENTIAL).
FT   TRANSMEM   1596   1616       POTENTIAL.
FT   DOMAIN     1617   2012       CYTOPLASMIC (POTENTIAL).
FT   DOMAIN       39    109       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      138    204       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      239    300       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      328    392       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      421    491       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      518    582       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      610    675       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      704    773       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      802    872       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      885    972       FIBRONECTIN TYPE-III.
FT   DOMAIN      984   1076       FIBRONECTIN TYPE-III.
FT   DOMAIN     1088   1177       FIBRONECTIN TYPE-III.
FT   DOMAIN     1189   1273       FIBRONECTIN TYPE-III.
FT   DOMAIN     1300   1366       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN     1380   1463       FIBRONECTIN TYPE-III.
FT   DOMAIN     1477   1562       FIBRONECTIN TYPE-III.
FT   DISULFID     46    102       BY SIMILARITY.
FT   DISULFID    145    197       BY SIMILARITY.
FT   DISULFID    246    293       BY SIMILARITY.
FT   DISULFID    335    385       BY SIMILARITY.
FT   DISULFID    428    484       BY SIMILARITY.
FT   DISULFID    525    575       BY SIMILARITY.
FT   DISULFID    617    669       BY SIMILARITY.
FT   DISULFID    711    766       BY SIMILARITY.
FT   DISULFID    809    865       BY SIMILARITY.
FT   DISULFID   1307   1359       BY SIMILARITY.
FT   CARBOHYD     28     28       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD     78     78       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    470    470       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    487    487       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    512    512       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    556    556       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    658    658       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    666    666       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    710    710       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    748    748       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    795    795       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    924    924       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD   1142   1142       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD   1160   1160       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD   1250   1250       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD   1271   1271       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD   1341   1341       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD   1488   1488       N-LINKED (GLCNAC...) (POTENTIAL).
FT   VARSPLIC          1571       NFATLNYDGS -> KEAARCKEFS (IN SHORT
FT                                ISOFORM).
FT   VARSPLIC   1572   2012       MISSING (IN SHORT ISOFORM).
FT   CONFLICT   1893   2012       HRPGDLIHLPPYLRMDFLLNRGGPGTSRDLSLGQACLEPQK
FT                                SRTLKRPTVLEPIPMEAASSASSTREGQSWQPGAVATLPQR
FT                                EGAELGQAAKMSSSQESLLDSRGHLKGNNPYAKSYTLV ->
FT                                IGQVTSYICLHTLEWTFC (IN REF. 1).
SQ   SEQUENCE   2012 AA;  222259 MW;  0E33CFB781A08334 CRC64;



Query Match 7.4%; Score 538; DB 1; Length 2012; 

Best Local Similarity 21.7%; Pred. No. 2.9e-21; 

Matches 286; Conservative 166; Mismatches 458; Indels 410; Gaps 55; 

Qy 25 CQAEGNPTPTIQWFKD GRELKTDTGSHRIMLPAGGLFFLKVIHSRRESDAGTY 77 

!:! hi I :! I! II II II :| : I l|:|:| 

Db 246 CKALGHPEPDYRWLKDNMPLELSGRFQKTVTG LLIENIRPSDSGSY 291 

Qy 78 WCEAKNEFGVARSRNATLQVAVLRDEFRLEPANTRVAQGEVALKECGAPRGSPEPQISWR 137 

II I :| I: I: : I : : I : I I: : ::|| 

Db 292 VCEVSNRYGTAKVIGRLYVKQPLK-ATISPRKVKSSVGSQVSLSCSV-TGTEDQELSWY 348 



Qy 138 KNGQTLNLVGNKRIRIVDGGNLAIQEARQSDDGRYQCWK" 



:lh H ill :: II : :ll I III I: 
349 RNGEILNPGKNVRITGINHENLIMDHMVKSDGGAYQCFVRKDKLSAQDYVQWLEDGTPK 408 



178 ■ 



--NWGT-* 

II II 



Db 409 IISAFSEKWSPAEPVSLMCNVKGTPLPTITWTLDDDPILKGGSHRISQMITSEGNWSY 468 

Qy 183 f RESATAFL— KVHVR-PFLIRGPQNQTAWGSSWFQCRIG 220 

II I :::ll I II :| II: I II: 
Db 469 LNISSSQVRDGGVYRCTANNSAGWLYQARINVRGPASIRPMKNITAIAGRDTYIHCRVI 528 

Qy 221 GDPLPDVLWRRTAS 234 

II : I'! :: 

Db 529 GYPYYSIKWYKNSNLLPFNHRQVAFENNGTLKLSDVQKEVDEGEYTCNVLVQPQLSTSQS 588 

Qy 235 GGNMPLRKFSWLHSA— SGRVHVLEDR 259 

V |::|: :| I : I I 

Db 589 VHVTVKVPPFIQPFEFPRFSIGQRVFIPCVWSGDLPI-TITWQKDGRPIPGSLGVTIDN 647 

Qy 260 SLKLDDVTLEDMGEYTCEADNAVGGITATGILTVHAPPKFVIRPKNQLVEIGDEV 314 

|::':::| Mil I I : II l!lll::|::| I I 
Db 648 IDFTSSLRIS1ILSLMHNGNYTCIARNEAAAVEHQSQLIVRVPPKFWQPRDQDGIYGKAV 707 

Qy 315 LFECQANGHPRPTLYWSVEGNSSL--LLPGYRDGRMEVTLTPEGRSVLSIARFAREDSGK 372 

: I I |:|> II: I : : I :||::| I I Mil 

Db 708 ILNCSAEGYPVPTIVWKFSKGAGVPQFQPIALNGRIQVL — SNGSLLIKHWEEDSGY 763 

Qy 373 WTCNALNAVGS-VSSRTWSVDTQFELPPPIIEQGPVNQTLPVK-SIWLPCRTLGTPV 430 

: I I r r : II ::l I :| M II : : I I 

Db 764 YL-CKVSNDVGADVSKSMYLTVKI PAMITSYP-NTTLATQGQKKEMSCTAHGEKP 816 

Qy 431 PQVSWYLDGIPIDVQEHERR NLSDAGALTI SDLQ — RHEDEGLYTC V 475 

II Ml : : I Mil II I ::| 

Db 817 IIVRW * — EKEDRIINPEMARYLVSTKEVGEEVISTLQILPTVREDSGFFSCH 866 

Qy 476 ASNRN(3KSSWSGYLRLDTPTNPNIKFFRAPELSTYPGPPGKPQMVEKG"ENSVTLSWTR 533 

II |: .1 ::| 'I. II I:: I II 
Db 867 AINSYGED-RGIIQL TVQEPPDPPEIEIKDVKARTITLRWTM 907 

Qy 534 SNKVGGSSLVGYVIEMFGKNETDGWVA VGTRVQNTTFTQTGLLPGVNYFFLIRA 587 

I I : II II lh:| I : I :: M : I I M 
Db 908 GFD-GNSPITGYDIEC-KNKSDSWDSAQRTKDVSPQLNSATIID-IHPSSTYSIRMYA 962 

Qy 588 ENSHGLSLPSPMSEPITVGTRYFNSGLDLSEARASLLSGDWELSNASWDSTSMKLTWQ 647 

: Mil :|: I : I |: : : I |:::||: 

Db 963 KNRIGRSEP--SNELTI TADEAAPDGPPQEV-HLEPISSQSIRVTWK 1006 

Qy 648 IINGKYVEGFYVYARQLP— NPIVNNPAPVTS NTNP LLGS 685 

: II ': I: : I: I I : II II I: 

Db 1007 APKKHLQNG-IIRGYQIGYREYSTGGNFQFNIISVDTSGDSEVYTLDNLNKFTQYGLWQ 1065 

Qy 686 TSTSASASASASALISTKPNIAAAGKRDGETNQSGGGAPTPLNTKYRMLT - - ILNG — 739 

I I: :|:l : I I: :| :: : h III 

Db 1066 ACNRAGTGPSSQEIITT--TLEDVPSYPPENVQAIATSPESISISWSTLSKEALNGILQG 1123 

Qy 740 GGASSCTIT GLVQYTLYEFFIVPFYKSVEGKPSNSRIAR 778 

' I M I II Ml I ::| :: :l I I 
Db 1124 FRVIYWANLMDGELGEIKNITTTQPSLELDGLEKYTNYSIQVLAFTRAGDGVRSEQIFTR 1183 

Qy 779 TLEDVPSEAPYGMEALLLNSSAVFLKWKAPELKDRHGVLLNYHVIVRGIDTAHNFSRILT 838 

I MM M::| ::| 11= I I II :|:: II =1 : 
Db 1184 TKEDVPG • PPAGVKAAAASASMVFVSW - LPPLK - LNGI IRKYTVF CSHPYPTVIS 1235 

Qy 839 NVTIDAASPTLVLANLTEGVMYTVGVAAGNNAGVGPYCVPATLRLDPITK— RLDPFIN 895 

I : MM hi I I Ml : ::|: I M I 

Db 1236 EFEASPDSFSYRIPNLSRNRQYSVWWAVTSAGRGN-SSEIITVEPLAKAPARILTF-- 1291 

Qy 896 QRDHVNDVLTQPWFIILLGAILAVLMLSFGAMVFVKRKHMMMKQSALNTMRGNHTSDVLK 955 

: MM : ::| I: III II:: 

Db 1292 SGTVTTPW MKDIVLPCKAVGDPSPAVKWMKDS NGTPSLVT 1331 



Qy 956 MPSLSARNGNGYWL----DSSTGGMVWRPSPGGDS--LEMQKDHIADYAPVCGAPGS 1006

: :||:: II : : I I I : I I

Db 1332 IDGRRSIFSNGSFIIRTVKAEDSGYYSCIANNNWGSDEIILNLQVQVPPD QPRL 1385

Qy 1007 PAGGGTSSGGSGGAGSGASGGDDIHGGHGSERNQQRYVGEYSNIPTDYAEVSSFGKAPSE 1066 

II! : I :|| I I I: :|| :: : II :||| 

Db 1386 TVSKTTSSSITLSWLPGDNGGSSIRG YILQYSEDNSE--QWGSFPISPSE 1433 



RESULT 12 
CONT_HUMAN
ID CONT_HUMAN STANDARD; PRT; 1018 AA.
AC Q12860; Q12861; O14030;
DT 01-NOV-1997 (Rel. 35, Created)
DT 01-NOV-1997 (Rel. 35, Last sequence update)
DT 01-OCT-2000 (Rel. 40, Last annotation update)
DE CONTACTIN PRECURSOR (GLYCOPROTEIN GP135).
GN CNTN1.
OS Homo sapiens (Human).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
RN [1]
RP SEQUENCE FROM N.A., AND PARTIAL SEQUENCE.
RC TISSUE=BRAIN;
RX MEDLINE=95048335; PubMed=7959734;
RA Berglund E.O., Ranscht B.;
RT "Molecular cloning and in situ localization of the human contactin
RT gene (CNTN1) on chromosome 12q11-q12.";
RL Genomics 21:571-582(1994).
RN [2]
RP SEQUENCE FROM N.A., AND CHARACTERIZATION.
RX MEDLINE=94217459; PubMed=8164510;
RA Reid R.A., Hemperly J.J.;
RT "Identification and characterization of the human cell adhesion
RT molecule contactin.";
RL Brain Res. Mol. Brain Res. 21:1-8(1994).
CC -!- FUNCTION: MEDIATES CELL SURFACE INTERACTIONS DURING NERVOUS
CC SYSTEM DEVELOPMENT.
CC -!- SUBCELLULAR LOCATION: ATTACHED TO THE MEMBRANE BY A GPI-ANCHOR.
CC -!- ALTERNATIVE PRODUCTS: 2 ISOFORMS; 1 (SHOWN HERE) AND 2; ARE
CC PRODUCED BY ALTERNATIVE SPLICING.
CC -!- SIMILARITY: CONTAINS 6 IMMUNOGLOBULIN-LIKE C2-TYPE DOMAINS.
CC -!- SIMILARITY: CONTAINS 4 FIBRONECTIN TYPE III-LIKE DOMAINS.

CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).

DR EMBL; U07819; AAA67920
DR EMBL; U07820; AAA67921
DR EMBL; Z21488; CAA79696
DR HSSP; P40189; 1BQU.
DR MIM; 600016; -.
DR INTERPRO; IPR001777; -.
DR INTERPRO; IPR003006; -.
DR PFAM; PF00041; fn3; 4.
DR PFAM; PF00047; ig; 6.
KW Immunoglobulin domain; Glycoprotein; Signal; GPI-anchor;
KW Cell adhesion; Repeat; Alternative splicing.



FT   SIGNAL        1     20
FT   CHAIN        21      ?       CONTACTIN.
FT   PROPEP        ?   1018       REMOVED IN MATURE FORM.
FT   DOMAIN       58    121       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      151    218       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      256    317       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      345    398       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      429    491       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      519    590       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      602    609       GLY/PRO-RICH.
FT   DOMAIN      609    710       FIBRONECTIN TYPE-III.
FT   DOMAIN      711    812       FIBRONECTIN TYPE-III.
FT   DOMAIN      813    908       FIBRONECTIN TYPE-III.
FT   DOMAIN      909   1004       FIBRONECTIN TYPE-III.
FT   DISULFID     65    114       BY SIMILARITY.
FT   DISULFID    158    211       BY SIMILARITY.
FT   DISULFID    263    310       BY SIMILARITY.
FT   DISULFID    352    391       BY SIMILARITY.
FT   DISULFID    436    484       BY SIMILARITY.
FT   DISULFID           583       BY SIMILARITY.
FT   CARBOHYD    208    208       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    258    258       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    338    338       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    457    457       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    473    473       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    494    494       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    521    521       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    591    591       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    933    933       N-LINKED (GLCNAC...) (POTENTIAL).
FT   VARSPLIC     21     31       MISSING (IN ISOFORM 2).
FT   CONFLICT    798    798       V -> L (IN REF. 2).
SQ   SEQUENCE   1018 AA;  113320 MW;  4B8FDC5BFD434ED5 CRC64;



Query Match 7.4%; Score 536.5; DB 1; Length 1018;

Best Local Similarity 22.8%; Pred. No. 1.4e-21;

Matches 239; Conservative 132; Mismatches 371; Indels 307; Gaps

Qy 1 GENPRIIEHPMDTTVPKND---PFTFNCQAEGNPTPTIQWFKDGRELKTDTGSHRIMLPA 57 

I I I l::l I: : Ihl :| I :| : :: I I I : 

Db 38 GFGPIFEEQPINTIYPEESLEGKVSLNCRARASPFPVYKWRMNNGDV-DLTSDRYSMVG 95 

Qy 58 GGLFFLKVIHSRRESDAGTYWCEAKNEFGVARSRNATLQVAVLRDEFRLEP-ANTRVAQG 116 

II : : :: III hi I :|: II III I I I I II :l 

Db 96 GNLV— INNPDKQKDAGIYYCLASNNYGMVRSTEATLSFGYL-DPFPPEERPEVRVKEG 151 

Qy 117 EVALMECGAPRGSPEP-QISWRKNGQTLNLVGNKRIRIVD - -GGNLAIQEARQSDDGRYQ 173 

: :: I ! ' I: I I : : :|| I I III I II I I 
Db 152 KGMVLLCDPPYHFPDDLSYRWLLNEFPVFITMDKR-RFVSQTNGNLYIANVEASDKGNYS 210 

Qy 174 CWKNWGTRESATAFLKVHVRPFLIRGPQNQT AWGSSWFQCR 218 

II: h : |: II |: I |::| :| :| 

Db 211 CFVSSPSITK3VFSKFIP LIPIPERTTKPYPADIWQFKDVYALMGQNVTLECF 264 

Qy 219 IGGDPLPDVLWRR TASGG 236 

:|:M: I: : || 

Db 265 ALGNPVPDIRWRKVLEPMPSTAEISTSGAVLKIFNIQLEDEGIYECEAENIRGKDKHQAR 324 

Qy 237 NMPLRKFSWLHSASGRVHVLEDRSLKLDD 265 

I: II : : hi I 
Db 325 IYVQAFPEWVEHINDTEVDIGSDLYWPCVATGKPIPTIRWLKNG YAYHKGELRLYD 380 

Qy 266 VTLEDMGEYTCEADNAVGGITATG ILTVHA- PPKFVI RP - • KNQLVEIGDEVLFECQANG 322 

II I: I I M:| I I I I : I I I : I I I I h lh 
Db 381 VTFENAGMYQCIAENTYGAIYANAELKILALAPTFEMNPMKKKILAAKGGRVIIECKPKA 440 

Qy 323 HPRPTLYWSVEG NSSLLLPGYRDGRMEVTLTPEGRSVLSIARFAREDSGKWTCN 377 

1:1 i! :l III :l : II :!: Ill: II 

Db 441 APKPKFSWS-KGTEWLVNSSRILI-WEDGSLEIN NITRNDGG-IYTCF 485 

Qy 378 ALNAVGSVSSR-TWSVDTQFELPPPIIEQGPVNQTLPVKSIWLPCRTLGTPVPQVS" 434 

I I I :| - hi I I I hi : I : I I : : 
Db 486 AENNRGKANSTGTLVITD PTRIILAPINADITVGENATMQCAASFDPALDLTFV 539 

Qy 435 WYLDGIPIDVQE* ■ -HERRN- -LSDAGALTISDLQ-RHEDEGLYTCVASNRNGKSSWSGY 488 

:| II : Ml 111:1: III III 
Db 540 WSFNGYVIDFNKENIHYQRNFMLDSNGELLIRNAQLKH--AGRYTCTAQTIVDNSSASAD 597 

Qy 489 ■ LRLDTPTNPNI'KFFRAPELSTYPGPPGKPQMVEKGENSVTLSWTRSNKVGGSSLVGYVIE 548 

Ml ; Mil : II hhl : M I h 

Db 598 LWRGP — ■ PGPPGGLRIEDIRATSVALTWSRGSD-NHSPISKYTIQ 640 

Qy 549 M FGKNETDGWVAVGTRVQNTTFTQTGLLPGVNYFFLIRAENSHGLSLPSPMSE 601 

: ; M : I hM I M I h I III 

Db 641 T KT I LS DDWKDAKTDPP I IEGNM - - -EAARAVDLIPWMEYEFRWATNTLGRGEPSIPSN 697 
602 PITVGTRYFNSGLDLSEARASLLSGDVVELSNASVVDSTSMRLTWQIINGKYVEGFYVYA 661

I I II : :":|| :| I 
698 RIKT DGAAPNVAPSDV — GGGGGRNRELTITWAPLSREYHYG 737 

662 RQLPNPIVNNPAPVTSNTNPLLGSTSTSASASASASM.ISTKPNIAAAGKRDGETNQSGG 721 

II ::: II III 
738 NN FGYIVAFKP FDGE 752 

722 GAPTPLNTKYRMLTILNGGGASSCTITG LVQYTLYEFFIVPFYRSVEGKPSN 773 

::: :|: I II : I :: : I :| I 

753 EWKKVTVTNPD TGRYVHRDETMSPSTAFQVKVKAFNNKGDGPYSL 797 

774 SRIARTLEDVPSEAPYGMEALLLNSSAVPLKMKAPELKDRHGVLLNYHVIVRGIDT — 829 

: : :l Mill : :|:|| : : |: II: : ::: 

798 VAVINSAQDAPSEAPTEVGVKVLSSSEISVHWE HVLEKIVESYQIR 843 

830 • ■ •AHNFSRILTNVTIDAASPTLVLANLTEGVMYTVGVAAGNNAGVGPYCVPATLRLDPI 886 

II: I : : : I II I : I I Ml II I: : :: 
844 YWAAHDKEEAANRVQVTSQEYSARLENLLPDTQYFIEVGACNSAGCGP— PSDM-IEAF 899 

887 TKRLDP FINQRDHV 900 

II: I :| III 

900 TRKAPPSQPPRIISSVRSGSRYIITWDHV 928 



RESULT 13 
NRG_DROME
ID NRG_DROME STANDARD; PRT; 1239 AA.
AC P20241; Q24414;
DT 01-FEB-1991 (Rel. 17, Created)
DT 01-FEB-1991 (Rel. 17, Last sequence update)
DT 01-OCT-2000 (Rel. 40, Last annotation update)
DE NEUROGLIAN PRECURSOR.
GN NRG.
OS Drosophila melanogaster (Fruit fly).
OC Eukaryota; Metazoa; Arthropoda; Tracheata; Hexapoda; Insecta;
OC Pterygota; Neoptera; Endopterygota; Diptera; Brachycera; Muscomorpha;
OC Ephydroidea; Drosophilidae; Drosophila.
RN [1]
RP SEQUENCE FROM N.A., AND SEQUENCE OF 24-41 AND 737-751.
RX MEDLINE=90030418; PubMed=2805067;
RA Bieber A.J., Snow P.M., Hortsch M., Patel N.H., Jacobs J.R.,
RA Traquina Z.R., Schilling J., Goodman C.S.;
RT "Drosophila neuroglian: a member of the immunoglobulin superfamily
RT with extensive homology to the vertebrate neural adhesion molecule
RT L1.";
RL Cell 59:447-460(1989).
RN [2]
RP SEQUENCE OF 1182-1239 FROM N.A.
RX MEDLINE=90262720; PubMed=1693086;
RA Hortsch M., Bieber A.J., Patel N.H., Goodman C.S.;
RT "Differential splicing generates a nervous system-specific form of
RT Drosophila neuroglian.";
RL Neuron 4:697-709(1990).
RN [3]
RP X-RAY CRYSTALLOGRAPHY (2.0 ANGSTROMS) OF 610-814.
RX MEDLINE=94213741; PubMed=7512815;
RA Huber A.H., Wang Y.-M.E., Bieber A.J., Bjorkman P.J.;
RT "Crystal structure of tandem type III fibronectin domains from
RT Drosophila neuroglian at 2.0 A.";
RL Neuron 12:717-731(1994).
CC -!- FUNCTION: THIS PROTEIN MAY PLAY A ROLE IN NEURAL AND GLIAL CELL
CC ADHESION IN THE DEVELOPING DROSOPHILA EMBRYO.
CC -!- SUBCELLULAR LOCATION: TYPE I MEMBRANE PROTEIN.
CC -!- ALTERNATIVE PRODUCTS: 2 ISOFORMS; A LONG FORM (SHOWN HERE) AND A
CC SHORT FORM; ARE PRODUCED BY ALTERNATIVE SPLICING.
CC -!- TISSUE SPECIFICITY: NEURONS AND GLIA IN THE DEVELOPING NERVOUS
CC SYSTEM AND ON SOME OTHER NONNEURONAL TISSUES.
CC -!- SIMILARITY: CONTAINS 6 IMMUNOGLOBULIN-LIKE C2-TYPE DOMAINS.
CC -!- SIMILARITY: CONTAINS 5 FIBRONECTIN TYPE III-LIKE DOMAINS.
CC
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).

DR EMBL; M28231; AAA28728.1; ALT_SEQ.
DR EMBL; X76243; CAA53822.1; -.
DR PIR; A32579; A32579.
DR PDB; 1CFB; 30-NOV-94.
DR FLYBASE; FBgn0002968; Nrg.
DR INTERPRO; IPR001777; -.
DR INTERPRO; IPR003006; -.
DR PFAM; PF00041; fn3; 5.
DR PFAM; PF00047; ig; 6.
KW Cell adhesion; Glycoprotein; Transmembrane; Repeat; 3D-structure;
KW Immunoglobulin domain; Signal; Embryo; Alternative splicing.



FT   SIGNAL        1     23
FT   CHAIN        24   1239       NEUROGLIAN.
FT   DOMAIN       24   1138       EXTRACELLULAR (POTENTIAL).
FT   TRANSMEM   1139   1154       POTENTIAL.
FT   DOMAIN     1155   1239       CYTOPLASMIC (POTENTIAL).
FT   DOMAIN       53    123       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      149    224       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      262    329       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      354    422       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      447    512       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      536    606       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      629    690       FIBRONECTIN TYPE-III.
FT   DOMAIN      729    792       FIBRONECTIN TYPE-III.
FT   DOMAIN      832    896       FIBRONECTIN TYPE-III.
FT   DOMAIN      932    997       FIBRONECTIN TYPE-III.
FT   DOMAIN     1024   1098       FIBRONECTIN TYPE-III.
FT   DISULFID     59    111       POTENTIAL.
FT   DISULFID    625    706
FT   CARBOHYD    182    182       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    198    198       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    411    411       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    448    448       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    652    652       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    683    683       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    821    821       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD   1125   1125       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CONFLICT   1234   1234       T -> Y (IN REF. 2).
FT   CONFLICT   1237   1237       L -> R (IN REF. 2).
SQ   SEQUENCE   1239 AA;  138284 MW;  49E12692D0DD027D CRC64;



Query Match 7.3%; Score 534; DB 1; Length 1239;
Best Local Similarity 22.7%; Pred. No. 2.5e-21;
Matches 239; Conservative 135; Mismatches 438; Indels 242; Gaps 42;


Qy 


4 


Db 


29 


Qy 


55 


Db 


89 


Qy 


113 


Db 


145 


Qy 


169 


Db 


205 


Qy 


212 


Db 


263 



III : I 



I I 
RQPGRGTL- 



■ -MDTTVPRNDPFTFNCQAEGNPTPTIQWFRDGRELRTDTGSHRIM 54 

::|| 1:1:1 I I I hi:: :|:: 



II :: I I I I I Mil II: :: I I : |: I I I 
:VITIPRDEDRGHYQCFASNEFGTATSNSVYVRKAEL-NAFRDEAARTLE 144 



:|| :::! II I I I ::l 



- - RKNGQTLNLVGNRRIRIVDGGNLAIQEARQSD ■ 168 

:: : I I: : III : I 



-DGRYQCWKNV" 
II: 



I II III :| 



■-ESATAFLKVHVRPFLIRGPQNQTAWGS 211 
II: II I : hi 



:: I 11:1 :|| : 
- -QRIQW- - - -SDRITQGHY- -GKSLVIRQTNF 309 
Qy 269 EDMGEYTCEADNAVGGITATG-ILTVHAPPKFVIRPKNQLVEIGDEVLFECQANGHPRPT 327

: I III: | || : || |:: I I |: :||:|||:| | | 
Db 310 DDAGIYTCDVSNGVGNAQSFSIILNVNSVPYFTKEPEIATAAEDEEWFECRAAGVPEPK 369 

Qy 328 LYWSVEGNSSLLLPGYRDGRMEVTLTPEGR SVLSIARFAREDSGKWTCNALNAV 382 

M :|: III : : I M:| III M 

Db 370 ISA IHNGKPIEQSTPNPRRTVTDNTIRIINLVKGDTGN-YGCNATNSL 416 

Qy 383 GSVSSRTWSVDTQFELPPPIIEQGPVNQTLPVKSIWLPCRTLGTPVPQVSW YL 437 

II ::l : II I : 1 : I :| 

Db 417 GYVYKDVYLNVQAE PPTISEAPAAVSTVDGRNVTIKCRVNGSPKPLVKWLRASNWL 472 

Qy 438 DGIPIDVQEHERRNLSDAGALTISDLQRHEDEGLYTCVASNRNGKSSWSGYLRLDTPT" 495 

I I h MM: I I III I I: I: I I : I 

Db 473 TG GRYWQANGDLEIQDV-TPSDAGKYTCYAQNKFGEIQfiDGSLWKEHTRI 523 

Qy 496 --NPNIKFFRAPELSTY PGP — PGKPQMVERGENSVTLSW 531 

>l I : :|: I :|: I: :||:|:: 

Db 524 TQEPQNYEVAAGQSATFRCNEAHDDTLEIEIDWWKDGQSIDFEAQPRFVKTNDNSLTIAK 583

Qy 532 T RSN KVGG SS 541 

I hi I: I I 

Db 584 IMELDSGEYTCVARTRLDEATARANLIVQDVPNAPKLTGITCQADKAEIHWEQQGDNRSP 643

Qy 542 LVGYVIEMFGRiTDGWVAVGTRVQNTTFT-QTGLLPGVNYFFLIRAENSHGLSLPSPMS 600 

:: I I: II :| II : : I 1 1 I : I I I I 1 1 I 

Db 644 ILHYTIQFNTSFTPASWDAAYEKVPNTDSSFWQMSPWANYTFRVIAFNKIGASPPSAHS 703 

Qy 601 EPITVGTRYFNSGLDLSEARASLLSGDWELSNASWDSTSMKLTWQIINGKYVEGFYVY 660 

: I I: : I I :| : : :: I I : : 

Db 704 DSCTTQP DVPFKNPDNVVGQGTEPNNLVISWTPMPEIEHNAPNFHY—YVSW 753 

Qy 661 ARQLPNPIVNNPAPVTSNTNPLLGSTSTSASASASASALISTKPNIAAAGKRDGETN— 717 

I : llll: : I : Ml 1 1 : 1 
Db 754 KRDI PAAAWENNN- ■ IFDWRQNNIVIADQPTFVKYLIKVVAINDR-GESNVAA 803 

Qy 718 QSGGGAP--TPLNTKYRMLTILNGGGASSCTITGLVQYILYEFFIVPFYKSVEG- 769 

II I II M I : :: :: : : : || 
Db 804 EEWGYSGEDRPLDAPTNFTMRQITSSTSGYMAWTPVSEESVRGHFRGYRIQTWTENEGE 863 

Qy 770 RPSNSRIARTLE DVPSEAP — YGM 791 

II : II I ' I I I: 

Db 864 EGLREIHVKGDTHNALVTQFKPDSRNYARILAYNGRFNGPPSAVIDFDTPEGVPSPVQGL 923 

Qy 792 EALLLNSSAVFLKWRAPELKDRHGVLLNYHVIVRGIDTAHNFSRILTNVTI-DAASPTLV 850 
t :| llll III I :| I I : : :: I : II : 

Db 924 DAYPLGSSAFMLHWKKPLYP--NGRLTGYRIYYEEVKESYVGERREYDPHITDPRVTRMK 981

Qy 851 LANLTEGVMYTVGVAAGNNAGVGP--YCVPATLR 882

:M ' MM III II: 
Db 982 MAGLKPNSKYRI S ITATTKMGEGSEHY IEKTTLK 1015 



RESULT 14
CONT_MOUSE

ID CONT_MOUSE STANDARD; PRT; 1020 AA.

AC P12960; 

DT 01-JAN-1990 (Rel. 13, Created) 

DT 01-JAN-1990 (Rel. 13, Last sequence update)

DT 15-JUL-1999 (Rel. 38, Last annotation update)

DE CONTACTIN PRECURSOR (NEURAL CELL SURFACE PROTEIN F3).

GN CNTN1.

OS Mus musculus (Mouse). 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus. 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=C57BL/6; TISSUE=BRAIN;

RX MEDLINE=89340657; PubMed=2474555;

RA Gennarini G., Cibelli G., Rougon G., Mattei M.-G., Goridis C.;

RT "The mouse neuronal cell surface protein F3: a phosphatidylinositol-
RT anchored member of the immunoglobulin superfamily related to chicken
RT contactin.";

RL J. Cell Biol. 109:775-788(1989).

CC -!- FUNCTION: MEDIATES CELL SURFACE INTERACTIONS DURING NERVOUS
CC SYSTEM DEVELOPMENT.
CC -!- SUBCELLULAR LOCATION: ATTACHED TO THE MEMBRANE BY A GPI-ANCHOR.
CC -!- MISCELLANEOUS: F3 SHARES WITH L1, N-CAM, MAG, AND OTHER CELL
CC ADHESION MOLECULES FROM NERVOUS TISSUE THE L2/HNK-1 CARBOHYDRATE
CC EPITOPE.
CC -!- SIMILARITY: CONTAINS 6 IMMUNOGLOBULIN-LIKE C2-TYPE DOMAINS.
CC -!- SIMILARITY: CONTAINS 4 FIBRONECTIN TYPE III-LIKE DOMAINS.

CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).

DR EMBL; X14943; CAA33075.1; -.
DR PIR; S05944; S05944.
DR HSSP; P40189; 1BQU.
DR MGD; MGI:105980; CNTN1.
DR INTERPRO; IPR001777; -.
DR INTERPRO; IPR003006; -.
DR PFAM; PF00041; fn3; 4.
DR PFAM; PF00047; ig; 6.

KW Immunoglobulin domain; Glycoprotein; Signal; GPI-anchor;
KW Cell adhesion; Repeat.



FT   SIGNAL        1     20
FT   CHAIN        21      ?       CONTACTIN.
FT   PROPEP        ?   1020       REMOVED IN MATURE FORM.
FT   DOMAIN       58    121       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      151    218       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      256    317       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      345    398       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      429    491       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      519    592       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      604    611       GLY/PRO-RICH.
FT   DOMAIN      611    712       FIBRONECTIN TYPE-III.
FT   DOMAIN      713    814       FIBRONECTIN TYPE-III.
FT   DOMAIN      815    910       FIBRONECTIN TYPE-III.
FT   DOMAIN      911   1006       FIBRONECTIN TYPE-III.
FT   DISULFID     65    114       BY SIMILARITY.
FT   DISULFID    158    211       BY SIMILARITY.
FT   DISULFID    263    310       BY SIMILARITY.
FT   DISULFID    352    391       BY SIMILARITY.
FT   DISULFID    436    484       BY SIMILARITY.
FT   DISULFID    526    585       BY SIMILARITY.
FT   CARBOHYD    208    208       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    258    258       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    338    338       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    457    457       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    473    473       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    494    494       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    521    521       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    593    593       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    935    935       N-LINKED (GLCNAC...) (POTENTIAL).
SQ   SEQUENCE 1020 AA; 113388 MW; 9DCDAA40EAA4CBC7 CRC64;



Query Match 7.3%; Score 531.5; DB 1; Length 1020;

Best Local Similarity 22.4%; Pred. No. 2.6e-21; 

Matches 234; Conservative 152; Mismatches 361; Indels 299; Gaps 45; 

Qy 1 GENPRIIEHPMDTTVPKND - - - PFT FNCQAEGNPTPT IQWFKDGRELKTDTGS HRIMLPA 57 

I I I Ml |: : ||:| :|| :| : :: I M : 

Db 38 GFGPIFEEQPINTIYPEESLEGKVSLNCRARASPFPVYKWRMNNGDV--DLTNDRYSMVG 95 

Qy 58 GGLFFLKVIHSRRESDAGTYWCEAKNEFGVARSRNATLQVAVLRDEFRLEP-ANTRVAQG 115

II : : :: III hi I I M II III MM :l :l 

Db 96 GNLV— INNPDKQKDAGVYYCLASNNYGMVRSTEATLSFGYL-DPFPPEERPEVKVKEG 151 

Qy 117 EVALMECGAP.SGSPEP-QISWRKNGQTLNLVGNRRIRIVD--GGNLAIQEARQSDDGRYQ 173 
: :: I I I: I I : : :|| I I Ml I II I I
Db 152 KGMVLLCDPPYHFPDDLSYRWLLNEFPVFITMDKR-RFVSQTNGNLYIANVESSDRGNYS 210 

Qy 174 CWRNVVGTRESATAFLRVHVRPFLIRGPQNQT AWGSSWFQCR 218 

I I : I: : I: II |: I "I :| :| 

Db 211 CFVSSPSITKSVFSKFIP LIPIPERTTKPYPADIWQFKDIYTMMGQNVTLECF 264 

Qy 219 IGGDPLPDVLWRETASGGNMPLRRPSWLHSASGRVHVLEDRSLKLDDVTLEDMGEYTCEA 278 

|:|:||: II: I: '.Mill II: III I I III 
Db 265 ALGNPVPDIRWRKVLE — PMPSTAEI - STSGAV LKIFNIQLEDEGLYECEA 312 

Qy 279 DNAVGGITATGILTVMPPKFVIRPKNQLVEIGDEVLFECQANGHPRPTLYWSVEGNSSL 338 

:| I : I I |::| : hi :: : I I I I ||: I II 
Db 313 ENIRGKDKHQARIYVQAFPEWVEHINDTEVDIGSDLYWPCIATGKPIPTIRWLKNGYS-- 370 

Qy 339 LLPGYRDGRM— EVTLTPEGRSVLSIARFAREDSGKWTCNALNAVGSVSSRT—WS 392 

II: :ll I::l : I I II lh : ::: 

Db 371 — YHKGELRLYDVTF ENAG-MYQCIAENAYGSIYANAELKILA 410 

Qy 393 VDTQFELPPPIIEQGPVNQTLPVK-SIWLPCRTLGTPVPQVSW YLDGIPIDVQE 446 
: II: I : I I h: h I M II :| 

Db 411 LAPTFEMNP MKKKILAAKGGRVI IECRPKAAPRPKFSWSKGTEWL VN 457

Qy 447 HERRNLSDAGALTISDLQRHEDEGLYTCVASNRNGKSSWSGYLRLDTPTNPNIKFFRAPE 506

I : : hi I::: h I hill I I lh: :| I : II : II 
Db 458 SSRILI WEDGSLEINKITRN-DGG IYTCFAENNRGKANSTGTLVITNPT — RI ILAPI 512 

Qy 507 LSTYPGPPGKPQMVEKGENS VTLSWTRSNKVGGSSLVGYVIEMFGKN 553 

, : : III: :| I I 1111= I I 

Db 513 NAD ITVGENATMQCAASFDPALDLTFVW SFNGYVID-FNKE 552 

Qy 554 ETDGWVAVGTRVQNTTFTQTGLLPGVNYFFLIRAENSHGLSLPSPMSE 601 

: :| : ::| I : : I I : I 

Db 553 ITHIHYQRNFMLDANGELL — IRNAQLRHAGRYTCTAQTIVDNSSASADLWRGPPGP 608 

Qy 602 PITVGTRYFNSGLDLSEARASLLSGDWELSNASWDSTSMKLTWQIINGKYVEGFYVYA 661 

I II : : II II: III I 

Db 609 P GGLRIEDIRA TSVALTWS--RGS 630 

Qy 662 RQLPNPIVNNPAPVTSNT NPLLGSTSTSASA 692 

: :h: I |:: II I 

Db 631 DNHSPISRYTIQTRTILSDDWKDAKTDPPIIEGNMESAKAVDLIPWMEYEFR 682 

Qy 693 SASASALISTKPNIAAAG -KRDGET NQSGGGAPT PLNTKYR 732 

: : I : :hl : I II : III I lh :| 

Db 683 WATNTLGTGEPSIPSNRIKTDGAAPNVAPSDVGGGGGTNRELTITWAPLSREYHYGNNF 742 

Qy 733 MLTILNGGGASSCTITG LVQYTLYEFFIVPFYKSVEGRPSNSRI 776 

:| hi : I :: : I :| I : 

Db 743 GYIVAFRPFDGEEWRRVTVTNPDTGRYVHRDETMTPSTAFQVKVRAFNNKGDGPYSLVAV 802

Qy 777 ARTLEDVPSEAPYGMEALLLNSSAVFLKWKAPELKDRHGVLLNYHVIVRGIDT 829

: :| Mill : :|:|| : : II lh : ::: 

Db 803 INSAQDAPSEAPTEVGVKVLSSSEISVHWK HVLEKIVESYQIRYWA 848 

Qy 830 AHNFSRILTNVTIDAASPTLVLANITEGVMYTVGVAAGNNAGVGPYCVPATLRLDPITRR 889 

h I : : : I II I : I I 1 : 1 1 I h: :: I" 
Db 849 GHDREAAAHRVQVTSQEYSARLENLLPDTQYFIEVGACNSAG CGPSSDVIETFTRR 904 

Qy 890 LDP — - FINQRDHV 900 

I :l III 

Db 905 APPSQPPRIISSVRSGSRYIITWDHV 930 



RESULT 15
LAR_DROME

ID LAR_DROME STANDARD; PRT; 2029 AA.
AC P16621;

DT 01-AUG-1990 (Rel. 15, Created)

DT 01-AUG-1990 (Rel. 15, Last sequence update)

DT 01-OCT-2000 (Rel. 40, Last annotation update)

DE PROTEIN-TYROSINE PHOSPHATASE DLAR PRECURSOR (EC 3.1.3.48) (PROTEIN-

DE TYROSINE-PHOSPHATE PHOSPHOHYDROLASE).



GN LAR.
OS Drosophila melanogaster (Fruit fly).
OC Eukaryota; Metazoa; Arthropoda; Tracheata; Hexapoda; Insecta;
OC Pterygota; Neoptera; Endopterygota; Diptera; Brachycera; Muscomorpha;
OC Ephydroidea; Drosophilidae; Drosophila.
RN [1]
RP SEQUENCE FROM N.A.
RX MEDLINE=90046860; PubMed=2554325;
RA Streuli M., Krueger N.X., Tsai A.Y.M., Saito H.;
RT "A family of receptor-linked protein tyrosine phosphatases in humans
RT and Drosophila.";
RL Proc. Natl. Acad. Sci. U.S.A. 86:8698-8702(1989).
RN [2]
RP SEQUENCE FROM N.A.
RC STRAIN=CANTON-S;
RX MEDLINE=96178473; PubMed=8598047;
RA Krueger N.X., van Vactor D., Wan H.I., Gelbart W.M., Goodman C.S.,
RA Saito H.;
RT "The transmembrane tyrosine phosphatase DLAR controls motor axon
RT guidance in Drosophila.";
RL Cell 84:611-622(1996).

CC -!- FUNCTION: IT IS POSSIBLE THAT DLAR IS A CELL ADHESION RECEPTOR.
CC IT POSSESSES AN INTRINSIC PROTEIN TYROSINE PHOSPHATASE ACTIVITY
CC (PTPASE). IT CONTROLS MOTOR AXON GUIDANCE.
CC -!- CATALYTIC ACTIVITY: PROTEIN TYROSINE PHOSPHATE + H(2)O =
CC PROTEIN TYROSINE + ORTHOPHOSPHATE.
CC -!- SUBCELLULAR LOCATION: TYPE I MEMBRANE PROTEIN.
CC -!- TISSUE SPECIFICITY: SELECTIVELY EXPRESSED IN A SUBSET OF AXONS AND
CC PIONEER NEURONS IN THE EMBRYO.
CC -!- SIMILARITY: CONTAINS 3 IMMUNOGLOBULIN-LIKE C2-TYPE DOMAINS.
CC -!- SIMILARITY: CONTAINS 16 FIBRONECTIN TYPE III-LIKE DOMAINS.
CC -!- SIMILARITY: CONTAINS 2 PROTEIN-TYROSINE PHOSPHATASE DOMAINS.

CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).

DR EMBL; M27700; AAA28668.1; -.
DR EMBL; U36857; AAC47002.1; -.
DR EMBL; U36849; AAC47002.1; JOINED.
DR EMBL; U36850; AAC47002.1; JOINED.
DR EMBL; U36851; AAC47002.1; JOINED.
DR EMBL; U36852; AAC47002.1; JOINED.
DR EMBL; U36853; AAC47002.1; JOINED.
DR EMBL; U36854; AAC47002.1; JOINED.
DR EMBL; U36855; AAC47002.1; JOINED.
DR EMBL; U36856; AAC47002.1; JOINED.
DR PIR; A36182; TDFFLK.
DR HSSP; P28827; 1RPM.
DR FLYBASE; FBgn0000464; Lar.
DR INTERPRO; IPR000242; -.
DR INTERPRO; IPR000387; -.
DR INTERPRO; IPR001777; -.
DR INTERPRO; IPR003006; -.
DR PFAM; PF00102; Y_phosphatase; 2.
DR PFAM; PF00041; fn3; 9.
DR PFAM; PF00047; ig; 3.
DR PRINTS; PR00014; FNTYPEIII.
DR PRINTS; PR00700; PRTYPHPHTASE.
DR PROSITE; PS00383; TYR_PHOSPHATASE_1; 2.
DR PROSITE; PS50056; TYR_PHOSPHATASE_2; 2.
DR PROSITE; PS50055; TYR_PHOSPHATASE_PTP; 2.

KW Hydrolase; Receptor; Glycoprotein; Signal; Transmembrane;
KW Cell adhesion; Immunoglobulin domain; Duplication.



FT   SIGNAL        1     32
FT   CHAIN        33   2029       PROTEIN-TYROSINE PHOSPHATASE D
FT   DOMAIN       33   1377       EXTRACELLULAR (POTENTIAL).
FT   TRANSMEM   1378   1402       POTENTIAL.
FT   DOMAIN     1403   2029       CYTOPLASMIC (POTENTIAL).
FT   DOMAIN       50    118       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      154    216       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      249    308       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      320    417       FIBRONECTIN TYPE-III.
FT   DOMAIN      418    512       FIBRONECTIN TYPE-III.
FT   DOMAIN      513    607       FIBRONECTIN TYPE-III.
FT   DOMAIN      608    706       FIBRONECTIN TYPE-III.
FT   DOMAIN      707    809       FIBRONECTIN TYPE-III.
FT   DOMAIN      810    906       FIBRONECTIN TYPE-III.
FT   DOMAIN      907   1007       FIBRONECTIN TYPE-III.
FT   DOMAIN     1008   1102       FIBRONECTIN TYPE-III.
FT   DOMAIN     1103   1207       FIBRONECTIN TYPE-III.
FT   DOMAIN     1492   1738       PROTEIN-TYROSINE PHOSPHATASE.
FT   DOMAIN     1781   2029       PROTEIN-TYROSINE PHOSPHATASE.
FT   ACT_SITE   1670   1670       BY SIMILARITY.
FT   ACT_SITE   1961   1961       BY SIMILARITY.
FT   DISULFID     57    111       POTENTIAL.
FT   DISULFID    161    209       POTENTIAL.
FT   DISULFID    256    301       POTENTIAL.
FT   CARBOHYD    176    176       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    253    253       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    298    298
FT   CARBOHYD    553    553       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    616    616       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    666    666       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    721    721       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    774    774       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    915    915       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    962    962       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD   1183   1183       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD   1304   1304       N-LINKED (GLCNAC...) (POTENTIAL).
SQ   SEQUENCE 2029 AA; 229027 MW; 536A0C794D3DC800 CRC64;



Query Match 7.2%; Score 521; DB 1; Length 2029; 

Best Local Similarity 24.0%; Pred. No. 2.4e-20; 

Matches 221; Conservative 131; Mismatches 359; Indels 210; Gaps 

Qy 4 PRIIEHPMDTTVPKNDPFTFNCQAEGNPTPTIQWFKDGRELKTDTGSHRIMLPAGGLFFL 63 

Ml I : I :| I I I: |:| I |:|::: : :: ||: | 
Db 36 PEIIRKPQNQGVRVGGVASFYCAARGDPPPSIVWRKNGKKVSGTQSRYTVLEQPGGISIL 95 

Qy 64 KVIHSRRESDAGTYWCEAKNEFGVARSRNATLQVAVLRDEFRLEPA NTRVA 114 

" I I I I 1:1 I I I :||| : I I I 
Db 96 RIEPVRAGRDDAPYECVAENGVGDAVSADATLTIY — EGDKTPAGFPVITQGPGTRVI 151 

Qy 115 Q-GEVALMECGAPRGSPEPQISWRKNGQTLNLVGNKRIRIVDGGNLAIQEARQSDDGRYQ 173
F : I II M hi I I I II II : I I : I I I |: :|: | |:|: 

Db 152 EVGHTVLMTCKA-IGNPTPNIYWIKN-QTKVDMSNPRYSLKD-GFLQIENSREEDQGKYE 208 

Qy 174 CWKNWGTRESATAFLKVHVR— PFLIRGPQNQTAV-VGSSVVFQCRIGGDPLPDVLW 229 

II :l :il I I I II I I |: : I :||:: I I |:| I I 
Db 209 CVAENSMGTEHSKATNLYVKVRRVPPTFSRPPETISEVMLGSNLNLSCIAVGSPMPHVKW 268 

Qy 230 RRTASG — GNMPLRKFSWLHSASGRVHVLEDRSLKLDDVTLEDMGEYTCEADNAVGGI 285 

: : II: ■ II :||: : ::: III I : :| I 

Db 269 MKGSEDLTPENEMPI GR-NVLQ LINIQESANYTCIAASTLGQI 310 

Qy 286 TATGILTVHAPPKFVIRPKN-QLVEI-GDEVLFECQANGHPRPTLYWSVE 333 

: :: I : I I : I: |: II II |: :: 
Db 311 DSVSWKVQSLP— TAPTDVQI SEVTATSVRLEWSYKG - PEDLQYYVIQYKPKNANQAF 366 

Qy 334 GNSSLLLPGYRDGRMEVTLTPEGRSVLSIARFAREDSGKWTCNA LNAVGSVSSR 388 

I :: I I I h:: I II :| :| I 

Db 367 SEISGIITMYYWRALSPYTEYEFYVIAVNNIGRGPPSAPATCTTGETKMESAPRNVQVR 426

Qy 389 TWSVDTQFELPPPIIEQGPVNQTLPVKSIWLPCRTLGTPVPQVSWYLDGIPIDVQEHE 448 

I: I II : I I I I : I: II 
Db 427 TLSSSTMVITWEPP— ETPNGQVTGYKVYY TTNSNQPEASW H 466 

Qy 449 RRNLSDAGALTISDLQRHEDEGLYTCVASNRNGKSSWSGYLRLDTPTNPNIKFFRAP-EL 507 

: : :: |:||: I :|| :|: |: | :: 

Db 467 SQMVDNSELTTVSDVTPH- • -AIYT VRVQAYTSMGAGPMST PVQV 508 



Qy 508 STYPGPPGKP---QMVEKGENSVTLSWTRSNKVGGSSLVGYVIEMFGKNETDGWVAVGTR 564

I I :h : : II :|ll lh ::| I |:: |:| I I

Db 509 KAQQGVPSQPSNFRATDIGETAVTLQWTKPTH-SSENIVHY--ELYW-NDTYANQAHHKR 564

Qy 565 VQNT-TFTQTGLLPGVNYFFLIRAENSHGLSLPSPMSEPITVGTRYFNSGLDLSEARASL 623

: I: :J II I I: : I : I :1 II I |: : I I
Db 565 ISNSEAYTLDGLYPDTLYYIWLAARSQRGEGATTP---PIPVRTKQYVPGAPPRNITAIA 621

Qy 624 LSGDVVELSNASVVDSTSMKLTWQIINGKYVEGFYVYARQLPNPIVNNPAPVTSNTNPLL 683

I . II:: hi II I:

Db 622 TS -----STTISLSW LPPPV 636

Qy 684 GSTSTSASASASASALISTKPNIAAAGKRDGETNQSGGGAPTPLNTKYRMLTILNGGGAS 743

.1 :| I I: I I :|: :
Db 637 ERSNGRI I Y YKVFFVEVGREDDE AT TMTL NMT 668

Qy 744 SCTITGLVQYTLYEFFIVPFYKSVEGKPSNSRIARTLEDVPSEAPYGMEALLLNSSAVFL 803

I = I Ml: ::: :| I: I II I II I : I ::| III::: :
Db 669 SIVLDELKRWTEYKIWVLAGTSVGDGPRSHPIILRTQEDVPGD-PQDVKATPLNSTSIHV 727

Qy 804 KWKAPELKDRHGVLLNYHVIVRGI -DTAHNF SRILTNVTIDAASPTLVLA 852

II I llhl:: II: : : I I : III

Db 728 SWKPPLEKDRNGIIRGYHIHAQELRDEGKGFLNEPFKFDWDTLEFNVT 776

Qy 853 NLTEGVMYTVGVAAGNNAGVG 873

I |::MII I I
Db 777 GLQPDTKYSIQVAALTRKGDG 797



Search completed: January 22, 2001, 12:27:59 
Job time: 1180 sec 
GenCore version 4.5
Copyright (c) 1993 - 2000 Compugen Ltd. 



OM protein - protein search, using sw model



Run on: January 22, 2001, 12:51:13 ; Search time 559.88 Seconds
(without alignments)
289.105 Million cell updates/sec

Title: US-09-540-245A-16
Perfect score: 7272
Sequence: 1 GENPRIIEHPMDTTVPKNDP RSLLSNSGSGTSSQPAGHNV 1381



Scoring table: BLOSUM62

Gapop 10.0 , Gapext 0.5 



Searched: 374700 seqs, 117207915 residues



Total number of hits satisfying chosen parameters: 

Minimum DB seq length: 0 
Maximum DB seq length: 2000000000 

Post-processing: Minimum Match 0%
Maximum Match 100% 
Listing first 45 summaries 



Database : SPTREMBL_15:*
1: sp_archea:*
2: sp_bacteria:*
3: sp_fungi:*
4: sp_human:*
5: sp_invertebrate:*
6: sp_mammal:*
7: sp_mhc:*
8: sp_organelle:*
9: sp_phage:*
10: sp_plant:*
11: sp_rodent:*
12: sp_virus:*
13: sp_vertebrate:*
14: sp_unclassified:*



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution.



SUMMARIES

                 Query
 No.   Score     Match  Length  DB  ID       Description
   1   3895.5     53.6     823   5  Q9VQ10   Q9vq10 drosophila
   2   2462       33.9     859   5  Q9VPZ6   Q9vpz6 drosophila
   3   1790.5     24.6    1395   5  Q9W213   Q9w213 drosophila
   4   1786.5     24.6    1395   5  O44924   O44924 drosophila
   5   1498       20.6    1651   4  Q9Y6N7   Q9y6n7 homo sapien
   6   1489.5     20.5    1612  11  O89026   O89026 mus musculu
   7   1455.5     20.0    1651  11  O55005   O55005 rattus norv
   8   1361.5     18.7    1273   5  O44928   O44928 caenorhabdi
   9   1318.5     18.1    1060  11  Q9QZI3   Q9qzi3 rattus norv
  10   1256       17.3    1344  11  Q9Z2I4   Q9z2i4 mus musculu
  11   1230       16.9     232   5  Q9VQ07   Q9vq07 drosophila
  12    783.5     10.8     166   5  Q9VQ09   Q9vq09 drosophila
  13    717        9.9     423   5  P91572   P91572 caenorhabdi
  14    658        9.0     874   5  O01632   O01632 caenorhabdi
  15    603        8.3    1026  11  Q62845   Q62845 rattus norv
  16    602        8.3    1028  11  Q62682   Q62682 rattus norv
  17    597.5      8.2    1493  11  P97798   P97798 mus musculu
  18    595.5      8.2    1427  13  Q91562   Q91562 xenopus lae
  19    593        8.2    1232  13  Q90284   Q90284 carassius a
  20    592        8.1    1151  11  Q9QVN5   Q9qvn5 rattus sp.
  21    589.5      8.1    1377  11  P97603   P97603 rattus norv
  22    581        8.0    1028  11  Q07409   Q07409 mus musculu
  23    579.5      8.0    1277  13  Q98902   Q98902 fugu rubrip
  24    578        7.9    1264   5  P91767   P91767 manduca sex
  25    576.5      7.9    2016   5  Q9NBA1   Q9nba1 drosophila
  26    575.5      7.9    1461   4  Q92859   Q92859 homo sapien
  27    575.5      7.9    1461   4  O00340   O00340 homo sapien
  28    575.5      7.9    2016   5  Q9V4J9   Q9v4j9 drosophila
  29    572        7.9    1217  11  P97685   P97685 rattus norv
  30    565.5      7.8    1272  13  Q90924   Q90924 gallus gall
  31    565.5      7.8    1445  11  Q63155   Q63155 rattus norv
  32    560        7.7    1443  13  Q90610   Q90610 gallus gall
  33    553.5      7.5    1180   4  O15051   O15051 homo sapien
  34    552.5      7.6    1280  13  Q90933   Q90933 gallus gall
  35    551.5      7.5    1040  13  Q9W675   Q9w675 brachydanio
  36    551.5      7.6    1302   5  O61542   O61542 drosophila
  37    547.5      7.5    1166  11  Q9QVN4   Q9qvn4 rattus sp.
  38    545.5      7.5    1154  11  Q9QVN3   Q9qvn3 rattus sp.
  39    543.5      7.5    1018   6  Q28106   Q28106 bos taurus
  40    543.5      7.5    1028   4  Q9UQ52   Q9uq52 homo sapien
  41    543        7.5    1197  13  Q90478   Q90478 brachydanio
  42    540.5      7.4    1028  11  P97528   P97528 rattus norv
  43    538        7.4    1215  11  P97686   P97686 rattus norv
  44    536.5      7.4    1822   4  Q9ULT7   Q9ult7 homo sapien
  45    535.5      7.4    1028  11  Q9JMB8   Q9jmb8 mus musculu



RESULT 1 
Q9VQ10 

ID Q9VQ10 PRELIMINARY; PRT; 823 AA. 

AC Q9VQ10; 

DT 01-MAY-2000 (TrEMBLrel. 13, Created) 

DT 01-MAY-2000 (TrEMBLrel. 13, Last sequence update) 

DT 01-OCT-2000 (TrEMBLrel. 15, Last annotation update) 

DE CG5481 PROTEIN.

GN CG5481.

OS Drosophila melanogaster (Fruit fly).

OC Eukaryota; Metazoa; Arthropoda; Tracheata; Hexapoda; Insecta;

OC Pterygota; Neoptera; Endopterygota; Diptera; Brachycera; Muscomorpha;

OC Ephydroidea; Drosophilidae; Drosophila.

OX NCBI_TaxID=7227;

RN [1]

RP SEQUENCE FROM N.A.

RC STRAIN=BERKELEY;

RX MEDLINE=20196006; PubMed=10731132;

RA Adams M.D., Celniker S.E., Holt R.A., Evans C.A., Gocayne J.D.,
RA Amanatides P.G., Scherer S.E., Li P.W., Hoskins R.A., Galle R.F.,
RA George R.A., Lewis S.E., Richards S., Ashburner M., Henderson S.N.,
RA Sutton G.G., Wortman J.R., Yandell M.D., Zhang Q., Chen L.X.,
RA Brandon R.C., Rogers Y.-H.C., Blazej R.G., Champe M., Pfeiffer B.D.,
RA Wan K.H., Doyle C., Baxter E.G., Helt G., Nelson C.R., Miklos G.L.G.,
RA Abril J.F., Agbayani A., An H.-J., Andrews-Pfannkoch C., Baldwin D.,
RA Ballew R.M., Basu A., Baxendale J., Bayraktaroglu L., Beasley E.M.,
RA Beeson K.Y., Benos P.V., Berman B.P., Bhandari D., Bolshakov S.,
RA Borkova D., Botchan M.R., Bouck J., Brokstein P., Brottier P.,
RA Burtis K.C., Busam D.A., Butler H., Cadieu E., Center A., Chandra I.,
RA Cherry J.M., Cawley S., Dahlke C., Davenport L.B., Davies P.,
RA de Pablos B., Delcher A., Deng Z., Mays A.D., Dew I., Dietz S.M.,
RA Dodson K., Doup L.E., Downes M., Dugan-Rocha S., Dunkov B.C., Dunn P.,
RA Durbin K.J., Evangelista C.C., Ferraz C., Ferriera S., Fleischmann W.,
RA Fosler C., Gabrielian A.E., Garg N.S., Gelbart W.M., Glasser K.,
RA Glodek A., Gong F., Gorrell J.H., Gu Z., Guan P., Harris M.,
RA Harris N.L., Harvey D., Heiman T.J., Hernandez J.R., Houck J.,
RA Hostin D., Houston K.A., Howland T.J., Wei M.-H., Ibegwam C.,
RA Jalali M., Kalush F., Karpen G.H., Ke Z., Kennison J.A., Ketchum K.A.,
RA Kimmel B.E., Kodira C.D., Kraft C., Kravitz S., Kulp D., Lai Z.,
RA Lasko P., Lei Y., Levitsky A.A., Li J., Li Z., Liang Y., Lin X.,
RA Liu X., Mattei B., McIntosh T.C., McLeod M.P., McPherson D.,
RA Merkulov G., Milshina N.V., Mobarry C., Morris J., Moshrefi A.,
RA Mount S.M., Moy M., Murphy B., Murphy L., Muzny D.M., Nelson D.L.,
RA Nelson D.R., Nelson K.A., Nixon K., Nusskern D.R., Pacleb J.M.,
RA Palazzolo M., Pittman G.S., Pan S., Pollard J., Puri V., Reese M.G.,
RA Reinert K., Remington K., Saunders R.D.C., Scheeler F., Shen H.,
RA Shue B.C., Siden-Kiamos I., Simpson M., Skupski M.P., Smith T.,
RA Spier E., Spradling A.C., Stapleton M., Strong R., Sun E.,
RA Svirskas R., Tector C., Turner R., Venter E., Wang A.H., Wang X.,
RA Wang Z.-Y., Wassarman D.A., Weinstock G.M., Weissenbach J.,
RA Williams S.M., Woodage T., Worley K.C., Wu D., Yang S., Yao Q.A.,
RA Ye J., Yeh R.-F., Zaveri J.S., Zhan M., Zhang G., Zhao Q., Zheng L.,
RA Zheng X.H., Zhong F.N., Zhong W., Zhou X., Zhu S., Zhu X., Smith H.O.,
RA Gibbs R.A., Myers E.W., Rubin G.M., Venter J.C.;
RT "The genome sequence of Drosophila melanogaster."; 
RL Science 287:2185-2195(2000). 
DR EMBL; AE003586; AAF51373.1; -. 
DR HSSP; P56276; 1TLK. 
DR FLYBASE; FBgn0031341; CG5481.
DR INTERPRO; IPR001412; -.
DR INTERPRO; IPR001777; -.
DR INTERPRO; IPR003006; -.
DR PFAM; PF00041; fn3; 1.
DR PFAM; PF00047; ig; 5.
DR PRINTS; PR00014; FNTYPEIII.
DR PROSITE; PS00178; AA_TRNA_LIGASE_I; UNKNOWN_1.
SQ SEQUENCE 823 AA; 89715 MW; 36FC0B91F36F2F19 CRC64;



Query Match 53.6%; Score 3895.5; DB 5; Length 823; 

Best Local Similarity 91.7%; Pred. No. 1.4e-266;

Matches 759; Conservative 0; Mismatches 2; Indels 67; Gaps 3; 

Qy 11 MDTTVPKNDPFTFNCQAEGNPTPTIQWFKDGRELKTDTGSHRIMLPAGGLFFLKVIHSRR 70 

IIIIIIMIIIIIIIIIIIIIIIIIIllllllllllllllllllllllllllllllllll 

Db 1 MDTTVPRNDPFTFNCQAEGNPTPTIQWFKDGRELKTDTGSHRIMLPAGGLFFLKVIHSRR 60 
Qy 71 ESDAGTYWCEAKNEFGVARSRNATLQVAVLRDEFRLEPANTRVAQGEVALMECGAPRGSP 130 

IMIMIIIIIIIMIIIIIIIIIIIII I ! 1 1 1 1 1 1 ! 1 1 1 M 1 1 1 1 1 ! 1 1 1 1 M I < I M 

Db 61 ESDAGTYWCEAKNEFGVARSRNATLQyAFLRDEFRLEPANTRVAQGEVALMECGAPRGSP 120 



Qy 131 EPQISWRKNGQTLNLVGNKRIRIVDGGNLAIQEARQSDDGRYQCVVKNVVGTRESATAFL 190

IIMIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIMIII 
Db 121 EPQISWRKNGQTLNLVGNKRIRIVDGGNLAIQEARQSDDGRYQCVVKNVVGTRESATAFL 180 

Qy 191 KVHVRPFLIRGPQNQTAWGSSWFQCRIGGDPLPDVLWRRTASGGNMPLRKFSWLHSAS 250 

IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIMIIIIIIIII 

Db 181 KVHVRPFLIRGPQNQTAVVGSSWFQCRIGGDPLPDVLWRRTASGGNMP 229 

Qy 251 GRVHVLEDRSLKLDDVTLEDMGEYTCEADNAVGGITATGILTVHAPPRFVIRPKNQLVEI 310 

lllllllllllllllllllllllllllllllllllllllllllllllllllllllllll 

Db 230 RRVHVLEDRSLKLDDVTLEDMGEYTCEADNAVGGITATGILTVHAPPKFVIRPKNQLVEI 289

Qy 311 GDEVLFECQANGHPRPTLYWSVEGNSSLLLPGYRDGRMEVTLTPEGRSVLSIARFAREDS 370

IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIMIIIIIIIIIIIIIIIIIIIIII 

Db 290 GDF^FECQANGHPRPTLYWSVEGNSSLLLPGYRDGRMEVTLTPEGRSVLSIARFAREDS 349 

Qy 371 GKWTCNALNAVGSVSSRTWSVDTQFELPPPIIEQGPVNQTLPVKSIWLPCRTLGTPV 430 

IIIIMMIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIMIIIIIIIIIIIIIIIIIII 

Db 350 GKWTCNALNAVGSVSSRTWSVDTQFELPPPIIEQGPVNQTLPVKSIWLPCRTLGTPV 409 
Qy 431 PQVSWYLDGIPIDVQEHERRNLSDAGALTISDLQRHEDEGLYTCVASNRNGKSSWSGYLR 490 

IIIIIIIIIIMINIIIIIIIIIIIIIIMIIIMIIIIIIIIIIIIIlllllllllll 

Db 410 PQVSWYLDGIPIDVQEHERRNLSDAGALTISDLQRHEDEGLYTCVASNRNGKSSWSGYLR 469 

Qy 491 LDTPTNPNIKFFRAPELSTYPGPPGKPQMVEKGENSVTLSWTRSNPGGSSLVGYVIEMF 550 

IIIIIIIMIIIIIIIIIIIIIIIMIIIIMIIIIIIIIIIIMIIIIIIIIIIIIIM 
Db 470 LDTPTNPNIKFFRAPELSTYPGPPGKPQMVEKGENSVTLSWTRSNKVGGSSLVGYVIEMF 529 

Qy 551 GKNETDGWVAVGTRVQNTTFTQTGLLPGVNYFFLIRAENSHGLSLPSPMSEPITVGT— 607 

HiMiiiiiiiiiiimiiiiiiiiiiiiiiiimmiimiiiiiimii 

Db 530 GKNETDGWVAVGTRVQNTTFTQTGLLPGVNYFFLIRAENSHGLSLPSPMSEPITVGTVSS 589 



Qy 608 -----------------------RYFNSGLDLSEARASLLSGDVVELSNASVVDSTSMKL 644

IIIIIIIIIIIMIIIIIIIIIIIIIIIIIIIIIIII 



Db 590 ENHSFTLMFPSLIHYFDSLFHPQRYFNSGLDLSEARASLLSGDWELSNASWDSTSMKL 649 
Qy 645 TW : QIINGKYVEGFYVYARQLPNPIVNNPAP 674 

ii ' miiiiimiimiiimiiiiii 

Db 650 TWQVCNRLTDGSIAAPHSIAHRHLIRSASFLMQIINGRYVEGFYVYARQLPNPIVNNPAP 709 

Qy 675 VTSNTNPLLGSTSTSASASASASALISTKPNIAAAGKRDGETNQSGGGAPTPLNTKYRML 734

IIIIIIIIIIMIIIIIIIIIIIIIIIIIIIMIIIIIIIIIIIIIIIIIIIIIIIIIII
Db 710 VTSNTNPLLGSTSTSASASASASALISTKPNIAAAGKRDGETNQSGGGAPTPLNTKYRML 769

Qy 735 TILNGGGASSCTITGLVQYTLYEFFIVPFYKSVEGKPSNSRIARTLED 782 

miimmiiiniiiiiiiiiiiiimiiiiimiiiiiii! 

Db 770 TILNGGGASSCTITGLVQYILYEFFIVPFYKSVEGKPSNSRIARTLED 817 



RESULT 2 
Q9VPZ6 

ID Q9VPZ6 PRELIMINARY; PRT; 859 AA. 

AC Q9VPZ6; 

DT 01-MAY-2000 (TrEMBLrel. 13, Created) 

DT 01-MAY-2000 (TrEMBLrel. 13, Last sequence update) 

DT 01-OCT-2000 (TrEMBLrel. 15, Last annotation update) 

DE CG5423 PROTEIN (FRAGMENT). 

GN CG5423. 

OS Drosophila melanogaster (Fruit fly). 

OC Eukaryota; Metazoa; Arthropoda; Tracheata; Hexapoda; Insecta;

OC Pterygota; Neoptera; Endopterygota; Diptera; Brachycera; Muscomorpha;

OC Ephydroidea; Drosophilidae; Drosophila.

OX NCBI_TaxID=7227;

RN [1]

RP SEQUENCE FROM N.A.

RC STRAIN=BERKELEY;

RX MEDLINE=20196006; PubMed=10731132;

RA Adams M.D., Celniker S.E., Holt R.A., Evans C.A., Gocayne J.D.,
RA Amanatides P.G., Scherer S.E., Li P.W., Hoskins R.A., Galle R.F.,
RA George R.A., Lewis S.E., Richards S., Ashburner M., Henderson S.N.,
RA Sutton G.G., Wortman J.R., Yandell M.D., Zhang Q., Chen L.X.,
RA Brandon R.C., Rogers Y.-H.C., Blazej R.G., Champe M., Pfeiffer B.D.,
RA Wan K.H., Doyle C., Baxter E.G., Helt G., Nelson C.R., Miklos G.L.G.,
RA Abril J.F., Agbayani A., An H.-J., Andrews-Pfannkoch C., Baldwin D.,
RA Ballew R.M., Basu A., Baxendale J., Bayraktaroglu L., Beasley E.M.,
RA Beeson K.Y., Benos P.V., Berman B.P., Bhandari D., Bolshakov S.,
RA Borkova D., Botchan M.R., Bouck J., Brokstein P., Brottier P.,
RA Burtis K.C., Busam D.A., Butler H., Cadieu E., Center A., Chandra I.,
RA Cherry J.M., Cawley S., Dahlke C., Davenport L.B., Davies P.,
RA de Pablos B., Delcher A., Deng Z., Mays A.D., Dew I., Dietz S.M.,
RA Dodson K., Doup L.E., Downes M., Dugan-Rocha S., Dunkov B.C., Dunn P.,
RA Durbin K.J., Evangelista C.C., Ferraz C., Ferriera S., Fleischmann W.,
RA Fosler C., Gabrielian A.E., Garg N.S., Gelbart W.M., Glasser K.,
RA Glodek A., Gong F., Gorrell J.H., Gu Z., Guan P., Harris M.,
RA Harris N.L., Harvey D., Heiman T.J., Hernandez J.R., Houck J.,
RA Hostin D., Houston K.A., Howland T.J., Wei M.-H., Ibegwam C.,
RA Jalali M., Kalush F., Karpen G.H., Ke Z., Kennison J.A., Ketchum K.A.,
RA Kimmel B.E., Kodira C.D., Kraft C., Kravitz S., Kulp D., Lai Z.,
RA Lasko P., Lei Y., Levitsky A.A., Li J., Li Z., Liang Y., Lin X.,
RA Liu X., Mattei B., McIntosh T.C., McLeod M.P., McPherson D.,
RA Merkulov G., Milshina N.V., Mobarry C., Morris J., Moshrefi A.,
RA Mount S.M., Moy M., Murphy B., Murphy L., Muzny D.M., Nelson D.L.,
RA Nelson D.R., Nelson K.A., Nixon K., Nusskern D.R., Pacleb J.M.,
RA Palazzolo M., Pittman G.S., Pan S., Pollard J., Puri V., Reese M.G.,
RA Reinert K., Remington K., Saunders R.D.C., Scheeler F., Shen H.,
RA Shue B.C., Siden-Kiamos I., Simpson M., Skupski M.P., Smith T.,
RA Spier E., Spradling A.C., Stapleton M., Strong R., Sun E.,
RA Svirskas R., Tector C., Turner R., Venter E., Wang A.H., Wang X.,
RA Wang Z.-Y., Wassarman D.A., Weinstock G.M., Weissenbach J.,
RA Williams S.M., Woodage T., Worley K.C., Wu D., Yang S., Yao Q.A.,
RA Ye J., Yeh R.-F., Zaveri J.S., Zhan M., Zhang G., Zhao Q., Zheng L.,
RA Zheng X.H., Zhong F.N., Zhong W., Zhou X., Zhu S., Zhu X., Smith H.O.,
RA Gibbs R.A., Myers E.W., Rubin G.M., Venter J.C.;

RT "The genome sequence of Drosophila melanogaster."; 

RL Science 287:2185-2195(2000). 

DR EMBL; AE003586; AAF51388.1; -.
DR HSSP; P56276; 1TLK.
DR FLYBASE; FBgn0031328; CG5423.
DR INTERPRO; IPR001777; -.
DR INTERPRO; IPR003006; -.
DR PFAM; PF00041; fn3; 3.
DR PFAM; PF00047; ig; 5.
FT NON_TER 1 1
SQ SEQUENCE 859 AA; 93916 MW; 5CFD69D984101BF8 CRC64;



Query Match 33.9%; Score 2462; DB 5; Length 859;
Best Local Similarity 49.8%; Pred. No. 2.8e-165;
Matches 478; Conservative 158; Mismatches 206; Indels 118; Gaps 12;

Qy 4 PRIIEHPMDTTVPKNDPFTFNCQAEGNPTPTIQWFKDGRELKTDTGSHRIMLPAGGLFFL 63

Ihllhlllll::: I 1 1 : 1 1 1 : 1 1 1 ! 1 1 1 : 1 1 1 [ llll! Illllllll
Db 1 PRIVEHPIDTTVPRHEPATLNCKAEGSPTPTIQWYKDGVPLKILPGSHRITLPAGGLFFL 60

Qy 64 KVIHSRRESDAGTYWCEAKNEFGVARSRNATLQVAVLRDEFRLEPANTRVAQGEVALMEC 123

ii ii milium nrnm nm

Db 61 KVSDGRR CAVLRDEFRLEPQNTRIAQGDTALLEC 94

Qy 124 GAPRGSPEPQISWRKNGQTLNLVGNKRIRIVDGGNLAIQEARQSDDGRYQCVVKNVVGTR 183

llll Ml ::hl II Ml 1 : 1 1 : III III 1 1 1 1 1 : 1 1 1 : 1 : 1 : 1 1 1 : II I I
Db 95 AAPRGIPEPTVTWKKGGQKLDLEGSKRVRIVDGGNLAIQDARQTDEGQYQCIAKNPVGVR 154



Qy 184 ESATAFLKVHVRPFLIRGPQNQTAWGSSVVFQCRIGGDPLPDVLWRRTASGGNMPLRKF 243 

m i iiiiimmii m : mm i imimimi iniimii 

Db 155 ESSLATLKVHVKPniRGPHDQTVLEGASVTFPCRVGGDPMPDVLWLRTASGGNMPL— 211 

Qy 244 SWLHSASGRVHVLEDRSLKLDDVTLEDMGEnCEADNAVGGITATGILTVHAPPKFVIRP 303 
, II llllllhl: II: I llhlllll II III I 1 1 1 : 1 1 1 ! I : || 

Db 212 DRVSVLEDRSLRLERVTIADEGEYSCEADNWGAITAMGTLTVYAPPKFIQRP 264 

Qy 304 KNQLVEIGDEVLFECQANGHPRPTLYWSVEGNSSLLLPGYRD - GRMEVTLT PEGRSVLSI 362 

Ihl : III: |:|:||::|::: l|:|: I I Ml Ml:: 

Db 265 ASKSVELGADTSFECRAIGNPKPTIFWTIKNNSTLIFPGAPPLDRFHSLNTEEGHSILTL 324 

Qy 363 ARFAREDSGKWTCNALNAVGSVSSRTWSVDTQFELPPPIIEQGPVNQTLPVKSIWLP 422 

llll M llhl I l::M :|:|:l : lllll IMIIIIhll: I 

Db 325 TRFQRTDKDLVILCNAMNEVASITSRVQLSLDSQEDRPPPIIISGPVNQTLPIKSLATLQ 384 

Qy 423 CRTLGTPVPQVSWYLDGIPIDVQEHERRNLSDAGALTISDLQRHEDEGLYTCVASNRNGK 482 

l — lll :lll llll II : h: :| I llll I :|:||||||||:| 

Db 385 CKAIGLPSPTISWYRDGIP--VQPSSKLNITTSGDLIISDLDRQQDQGLYTCVASSRAGK 442 



Qy 483 SSWSGYLRLDTPTNPNIKFFRAPELSTYPGPPGKPQMVEKGENSVTLSWTRSNKVGGSSL 542

MINI:: I M 1 1 1 1 1 :! 1 1 1 : I ll:|::: :::|: | |:| | ||
Db 443 STWSGFLRIELPTNPNIKFYRAPEQTKCPSAPGQPKILNATASALTIVWPTSDKAGASSF 502

Qy 543 VGYVIEMFGKNETDGWVAVGTRVQNTTFTQTGLLPGVNYFFLIRAENSHGLSLPSPMSEP 602

:H MM I:: I: : :|: lllll |::||||| I I 1 1 1 : 1 1 !
Db 503 LGYSVEMYCTNQSRTWIPIASRLSEPIFTVESLTQGAAYMFIVRAENSLGFSPPSPISEP 562

Qy 603 IT VGTR YFNSGLDLSEARASLLSGDWELSNASWDSTSMKLTWQIINGKY 653

II IN llll: I : lllll I: III: :|:| I :|:|

Db 563 ITAGKLVGVRDGSESTGTSQLLLSDVETLLQANDWELLEANASDSTTARLSWDIDSGQY 622

Qy 654 VEGFYVYARQLPNPIVNNPAPVTSNTNPLLGSTSTSASASASASALISTKPNIAAAGKRD 713

:||||:|||:|

Db 623 IEGFYLYARELH 634

Qy 714 GETNQSGGGAPTPLNTKYRMLTILN-GGGASSCTITGLVQYTLYEFFIVPFYKSVEGKPS 772

:::|:|:|:|| I I Mil: II : : IIIMMIMM llll
Db 635 SSEYKMVTLLNKGQGLSSCTVPGLAKASTYEFFLVPFYKSIVGKPS 680



Qy 773 NSRIARTLEDVPSEAPYGMEALLLNSSAVFLKWKAPE-LKDRHGVLLNYHVIVRGIDTAH 831 

III lllllll Mlllll: I MINI I: : MMI :|:|:|:|:| I 
Db 681 NSRRMRTLEDVPEAPPYGMEAIQFNRTSVFLKWLPPQPNRTRNGILTSYNVYVKGLD-VB 739 

Qy 832 NFSRILTNVTIDAASPTLVLANLTEGVMYTVGVAAGNNAGVGPYCVPATLRLDPITKRLD 891 

I :M hllllhllimill II I : III MM: || ||:| |: || 
Db 740 NTTRIFKNMTIDAAAPTLLLANLTTGVTYYIAVAAATRVGVGPFSKPAVLRIDARTQSLD 799 



892 P FINQRDHVNDVLTQPWFIILLGAILAVLMLSFGAMVFVKRKHMMMKQSALNTMRG 947 

: II :l III llhllhhl::: IMI II : :||::| :: I 
800 TGYTRYPISRDIADDFLTQTWFIVLLGSIIAIIVFLLGALVLFKR-YQFIKQTSLGSLHG 858 



RESULT 3 
Q9W213 

ID Q9W213 PRELIMINARY; PRT; 1395 AA. 

AC Q9W213; 

DT 01-MAY-2000 (TrEMBLrel. 13, Created) 

DT 01-MAY-2000 (TrEMBLrel. 13, Last sequence update) 

DT 01-OCT-2000 (TrEMBLrel. 15, Last annotation update) 

DE ROBO PROTEIN. 

GN ROBO. 

OS Drosophila melanogaster (Fruit fly). 

OC Eukaryota; Metazoa; Arthropoda; Tracheata; Hexapoda; Insecta; 

OC Pterygota; Neoptera; Endopterygota; Diptera; Brachycera; Muscomorpha; 

OC Ephydroidea; Drosophilidae; Drosophila. 

OX NCBI_TaxID=7227; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=BERKELEY; 

RX MEDLINE=20196006; PubMed=10731132; 

RA Adams M.D., Celniker S.E., Holt R.A., Evans C.A., Gocayne J.D., 

RA Amanatides P.G., Scherer S.E., Li P.W., Hoskins R.A., Galle R.F., 

RA George R.A., Lewis S.E., Richards S., Ashburner M., Henderson S.N., 

RA Sutton G.G., Wortman J.R., Yandell M.D., Zhang Q., Chen L.X., 

RA Brandon R.C., Rogers Y.-H.C., Blazej R.G., Champe M., Pfeiffer B.D., 

RA Wan K.H., Doyle C., Baxter E.G., Helt G., Nelson C.R., Miklos G.L.G., 

RA Abril J.F., Agbayani A., An H.-J., Andrews-Pfannkoch C., Baldwin D., 

RA Ballew R.M., Basu A., Baxendale J., Bayraktaroglu L., Beasley E.M., 

RA Beeson K.Y., Benos P.V., Berman B.P., Bhandari D., Bolshakov S., 

RA Borkova D., Botchan M.R., Bouck J., Brokstein P., Brottier P., 

RA Burtis K.C., Busam D.A., Butler H., Cadieu E., Center A., Chandra I., 

RA Cherry J.M., Cawley S., Dahlke C., Davenport L.B., Davies P., 

RA de Pablos B., Delcher A., Deng Z., Mays A.D., Dew I., Dietz S.M., 

RA Dodson K., Doup L.E., Downes M., Dugan-Rocha S., Dunkov B.C., Dunn P., 

RA Durbin K.J., Evangelista C.C., Ferraz C., Ferriera S., Fleischmann W., 

RA Fosler C., Gabrielian A.E., Garg N.S., Gelbart W.M., Glasser K., 

RA Glodek A., Gong F., Gorrell J.H., Gu Z., Guan P., Harris M., 

RA Harris N.L., Harvey D., Heiman T.J., Hernandez J.R., Houck J., 

RA Hostin D., Houston K.A., Howland T.J., Wei M.-H., Ibegwam C., 

RA Jalali M., Kalush F., Karpen G.H., Ke Z., Kennison J.A., Ketchum K.A., 

RA Kimmel B.E., Kodira C.D., Kraft C., Kravitz S., Kulp D., Lai Z., 

RA Lasko P., Lei Y., Levitsky A.A., Li J., Li Z., Liang Y., Lin X., 

RA Liu X., Mattei B., McIntosh T.C., McLeod M.P., McPherson D., 

RA Merkulov G., Milshina N.V., Mobarry C., Morris J., Moshrefi A., 

RA Mount S.M., Moy M., Murphy B., Murphy L., Muzny D.M., Nelson D.L., 

RA Nelson D.R., Nelson K.A., Nixon K., Nusskern D.R., Pacleb J.M., 

RA Palazzolo M., Pittman G.S., Pan S., Pollard J., Puri V., Reese M.G., 

RA Reinert K., Remington K., Saunders R.D.C., Scheeler F., Shen H., 

RA Shue B.C., Siden-Kiamos I., Simpson M., Skupski M.P., Smith T., 

RA Spier E., Spradling A.C., Stapleton M., Strong R., Sun E., 

RA Svirskas R., Tector C., Turner R., Venter E., Wang A.H., Wang X., 

RA Wang Z.-Y., Wassarman D.A., Weinstock G.M., Weissenbach J., 

RA Williams S.M., Woodage T., Worley K.C., Wu D., Yang S., Yao Q.A., 

RA Ye J., Yeh R.-F., Zaveri J.S., Zhan M., Zhang G., Zhao Q., Zheng L., 

RA Zheng X.H., Zhong F.N., Zhong W., Zhou X., Zhu S., Zhu X., Smith H.O., 

RA Gibbs R.A., Myers E.W., Rubin G.M., Venter J.C.; 

RT "The genome sequence of Drosophila melanogaster."; 

RL Science 287:2185-2195(2000). 

DR EMBL; AE003458; AAF46887.1; -. 

DR HSSP; P56276; 1TLK. 

DR FLYBASE; FBgn0005631; robo. 

DR INTERPRO; IPR001777; -. 

DR INTERPRO; IPR003006; -. 

DR PFAM; PF00041; fn3; 3. 

DR PFAM; PF00047; ig; 5. 

DR PRINTS; PR00014; FNTYPEIII. 

SQ SEQUENCE 1395 AA; 151759 MW; 25CED7DEB44F13F0 CRC64; 



Query Match 24.6%; Score 1790.5; DB 5; Length 1395; 

Best Local Similarity 30.1%; Pred. No. 1.6e-117; 

Matches 441; Conservative 242; Mismatches 507; Indels 275; Gaps 39; 

Qy 2 ENPRIIEHPMDTTVPKNDPFTFNCQAEGKPTPTIQWFKDGRELKT-DTGSHRIMLPAGGL 60 

I I II: I ||: I I |||:||||| : I : III: I I 
Db 54 QSPRIIEHPTDLWRKNEPATLNCKVEGKPEPTIEWFKDGEPVSTNEKKSHRVQFKDGAL 113 

Qy 61 FFLRVIHSRRESDAGTYWCEAKNEFGVARSRNATLQVAVLRDEFRLEPANTRVAQGEVAL 120 

II : : ::| II III III I I l|:|:| |:| !l l|:| hi I : ! 1 1| ,: 1 1 I 

Db 114 FFYRTMQGKKEQDGGEYWCVAKNRVGQAVSRHASLQIAVLRDDFRVEPKDTRVAKGETAL 173 

Qy 121 MECGAPRGSPEPQISWRRNG QTLNLVGNRRIRIVDGGNLAIQEARQSDDGRYQC 174 

:MI 1:1 III : I M : :: : MIIIIIM I |:| |:| 
Db 174 LECGPPKGIPEPTLIWIKDGVPLDDLKAMSFGASSRVRIVDGGNLLISNVEPIDEGNYKC 233 

Qy 175 WKNWGTRESATAFLKVHVRPFLIRGPQNQTAWGSSWFQCRIGGDPLPDVLWRRTAS 234 

: :|:IIIIM: I I I l:|: :: : I : I I Mill I III:: , 
Db 234 IAQNLVGTRESSYAKLIVOVKPYFMKEPKDQVMLYGQIATFHCSVGGDPPPKVLWKK--E 291 

Qy 235 GGNMPLRKFSWLHSASGRYHVLEDRSLKLDDVTLEDMGEYTCEADNAVGGITATGILTVH 294 

tll:|: : II :::lh: ::| III III I II hi I II 

292 EGNIPVSRARILH DERSLEISNITPTDEGTYVCEAHNNVGQISARASLIVH 342 
295 APPKFVIRPKNQLVEIGDEVLFECQANGHPRPILYWSVEGNSSLLLPGYRDGRMEVTLIP 354 

III I II I: I : I I 1:1:1 l:::|: II 'hh I, II I 

Db 343 APPNFTKRPSNKKVGLNGWQLPCMASGHPPPSVFWTKEGVSTLMFPNSSHGRQHVA--- 399 

Qy 355 EGRSVLSIARFAREDSGKWTCNALNAVGSVSSRTWSVDTQFELPPPIIEQGPVNQTLP 414 

II :|| I I 1:1 : I I : I : I : I 1 1 1 1 1 : II Mill 
Db 400 -ADGTLQITDVRQEDEGYYV'CSAFSVVDSSTVRVFLQVSSLDERPPPIIQIGPANQTLP 457 

Qy 415 VKSIWLPCRTLGTPVPQVSWYLDGIPIDVQEHERRNLSDAGALTISDLQRHEDEGLYTC 474 

I: llll I I h: h II II I :: :| : III I I III 
Db 458 KGSVATLPCRATGNPSPRIKWFHDGHA-VQAGNRYSIIQGSSLRVDDLQL-SDSGTYTC 514 

Qy 475 VASNRKGRSSWSGYLRLDTPTNPNIKFFRAPELSTYPGPPGKPQMVEXGENSVTLSWTRS 534 

II |::lh I :: I : :: II : llll III |::: |::| I :| 
Db 515 TASGERGETSMAATLTVERPGSTSL--HRAADPSTYPAPPGTPRVLNVSRTSISLRWARS 572 

Qy 535 NKVGGS-SLVGYVIEMFGKNETDGWVAVGTRVQNTTFTQTGLLPGVNYFFLIRAENSHG 592 

: I: ::|l :| I : lh II :l I =11 II :l Ihlllh I 
Db 573 QEKPGAVGPIIGYTVEYFSPDLQTGWIVAAQRVGDTQVTISGLTPGISYVFLVRAENTQG 632 

Qy 593 LSLPSPMSEPITVGTRYFN—SGLDLSEARASLLSGDWELSNASWDSTSMRLTWQI" 648 

:|:|| :| I 1:1 III II :||:| III :|l :m::::| I : 
Db 633 ISVPSGLSNTIKTIEADFDAASANDLSAAR-TLLTGRSVELIDASAIMSAVRLEWMLHV 691 



649 -INGRYVEGFYVYARQLPNPIVNNPAPVTSNTNPLLGSTSTSASASASASALISTRPNIA 707 

: Mill :: : I j 
692 S ADEKYVEG LRI HYKDASVP -1 711 

708 MGKRDGETNQSGGGAPTPLNTRYRMLTIINGGGASSCTITGLVQYTLYEFFIVPFYRSV 767 

: =1 :|::: I I : I :|| llll: II:::: 
712 SAQYHSITVMD-ASAESFWGNLRRYTRYEFFLTPFFETI 750 

768 EGKPSHSRIARTLEDVPSEAPYGMEALLLNSSAVFLKWKAPELKDRHGVLLNYHVIVRGI 827 

Ihlllh I I IHI I :: : I :| :::| I : :| I I : I 
751 EGQPSNSKTALTYEDVPSAPPDNIQIGMYNQTAGWVRWTPPPSQHHNGNLYGYRIEV"- 807 

828 DTAHNFSRILTNVTIDAASPTLVLANLTEGVMYTVGVAAGNNAGVGPYCVPATLRLDPIT 887 

:| I ::| I : I : : I : :::| III I :|:| : : II III I :| :|| I 
808 -SAGNTMKVLANMTLNATTTSVLLNNLTTGAVYSVRLNSFTRAGDGPYSRPISLFMDP-T 865 

888 KRLDP FINQRDH- -VNDVLTQPHFIILLGAI 916 

■: I II I I I: II ::h : 

866 HHVHPPRAHPSGTHDGRHEGQDLTYHNNGNIPPGDINPITHKRTIDYLSGPWLMVLVCIV 925 

917 LAVLMLSFG-AMVFVKRRHMMMRQ-SALNTMRGNHTSDVLRMPSLSARNGNGYWLDSSTG 974 

I ll::| :lh llll I h h : I :|: : hi I 

926 LLVLVISAAISMVYFRRRHQMTRELGHLSWSDN EITALNINSRESLWIDHHRG 979 

975 GMVWRPSPGGDSLEMQRDHIADYAPVCGAPGSPAGGGTSSGGSGGAGSGASGGDDIHGGH 1034 

II : : II III:: 



980 -"MR — vTADTDRD— 



■-SGLSESRLLSHVN 1001 



Qy 1035 GSERNQQRYVGEYSNIP-TDYAEVSSFGRAPSEYGRHGNASPAPYATSSILSPHQQQQQ 1092 

I: I i hi llllll : I :| lllh h : 

Db 1002 SSQSN --YNNSDGGTDYAEVDTRNLTTFYNCRRSPDNPTPYATTMIIGTSSSETC 1054 

Qy 1093 QQPRYQQRPVPGYGLQRPMHPHYQQQQHQQQQAQQTHQQH - -QALQQHQQLPPSNIYQQM 1150 

: ill : I : : I: : : : III 
Db 1055 TRTTSISADRDS'GTHSPYSDAFAGQVPAVPWKSNYLQYPVEPINWSEFLPP 1106 

Qy 1151 STTSEIYPTNTGPSRSVYSEQYYYPKDKQRHIH ITENKLSNCHTYEAAPGARQS 1204 

llll I: :: | : | : :: | : 

Db 1107 PPEHPPPSSTYGYAQGSPESSRRSSKSAGSGISTNQSILNASIHSSSSGGFSA 1159 

Qy 1205 SPISSQFASVRRQQLPP NCSIGRESARFRVLNTDQGRNQQNLLDLDGSSMCY 1256 

:| hi ■II : I I::: hi I 

Db 1160 WGVSPQYAVA — CPPENVYSNPLSAVAGGTQNRYQITPTNQHPPQLPAY 1206 

Qy 1257 NGLADSGCGGSPSPMAMLMSHEDEHALYHTA DGDLDDMERLYVRVDEQQPP 1307 

I :! II: I : : : I : I : : I III 

Db 1207 -FATTGPGGAVPPNHLPFATQRHAASEYQAGLNAARCAQSRACNSCDALATPSPMQPPP 1264 

Qy 1308 QQQQQLIPLV- PQHPAEGHLQSWRNQSTRSSRKNGQECIKEPSELIYAP 1355 

:|:> I II : I :• :| I h :: 

Db 1265 P VPVPKGWYQPVHPNSHPMHPTSSNHQIY QCSSECSD-HSR 1304 

Qy 1356 GSVASERSLLSNSGSGTSSQPAGHN 1380 

I : :| h :: I lh 
Db 1305 SSQSHRRQLQLEEHGSSARQRGGHH 1329 



RESULT 4 
O44924 

ID O44924 PRELIMINARY; PRT; 1395 AA. 

AC O44924; 

DT 01-JUN-1998 (TrEMBLrel. 06, Created) 

DT 01-JUN-1998 (TrEMBLrel. 06, Last sequence update) 

DT 01-JUN-2000 (TrEMBLrel. 14, Last annotation update) 

DE ROUNDABOUT 1. 

GN ROBO1. 

OS Drosophila melanogaster (Fruit fly) . 

OC Eukaryota; Metazoa; Arthropoda; Tracheata; Hexapoda; Insecta; 

OC Pterygota; Neoptera; Endopterygota; Diptera; Brachycera; Muscomorpha; 

OC Ephydroidea; Drosophilidae; Drosophila. 

OX NCBI_TaxID=7227; 

RN [1] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=98117249; PubMed=9458045; 

RA Kidd T., Brose K., Mitchell K.J., Fetter R.D., Tessier-Lavigne M., 

RA Goodman C.S., Tear G.; 

RT "Roundabout controls axon crossing of the CNS midline and defines a 

RT novel subfamily of evolutionarily conserved guidance receptors."; 

RL Cell 92:205-215(1998). 

DR EMBL; AF040989; AAC38849.1; -. 

DR HSSP; P56276; 1TLK. 

DR FLYBASE; FBgn0005631; robo. 

DR INTERPRO; IPR001777; -. 

DR INTERPRO; IPR003006; -. 

DR PFAM; PF00041; fn3; 3. 

DR PFAM; PF00047; ig; 5. 

DR PRINTS; PR00014; FNTYPEIII. 

SQ SEQUENCE 1395 AA; 151778 MW; B820E234A5218983 CRC64; 



Query Match 24.6%; Score 1786.5; DB 5; Length 1395; 

Best Local Similarity 30.1%; Pred. No. 3e-117; 

Matches 441; Conservative 242; Mismatches 507; Indels 275; Gaps 



Qy 2 ENPRIIEHPMDTTVPRNDPFTFNCQAEGNPTPTIQMFRDGRELRT-DTGSHRIMLPAGGL 60 

::||||lll I lh! I lh II I IIIMIII : I : III: I I 
Db 54 QSPRIIEHPTLiLWKKNEPATLNCKVEGKPEPTIEWFKDGEPVSTNEKKSHRVQFKDGAL 113 

Qy 61 FFLKVIHSRRESDAGTYWCEAKNEFGVARSRNATLQVAVLRDEFRLEPANTRVAQGEVAL 120 
II : : ::| I I III III I I 1 1 : 1 : 1 1 : 1 1 1 1 1 : 1 1 : 1 1 :||||:|| || 
Db 114 FFYRTMQGKKEODGGEYWCVAKNRVGQAVSRHASLQIAVLRDDFRVEPKDTRVAKGETAL 173 



Qy 121 MECGAPEGSPEPQI6WRKNG-': 



:||| 



III 



I hi 



- -QTLNLVGNKRIRIVDGGNLAIQEARQSDDGRYQC 174 

IMIIIIIII I 1:1 1:1 



174 lecgppkgipepti twikdgvplddlkamsfgassrvmvdggnllisnvepidegnykc 233 

175 wknwgtresataIlkvhvrpflirgpqnqtawgsswfqcriggdplpdvlwrrtas 234 

: :|:llllll: III I I :jl : :: |::| : I : I I :|||| I III:: 

234 iaqnlvgtressyj&livqvkpyfmreprdqvmlygqtatfhcsvggdppprvlwkk--e 291 



295 APPKFVIRPKNQU 

III I II I 
343 APPNFTKRPS1 



235 GGNMPLRRFSWLHSftSGRVHVLEDRSLKLDDVTLEDMGEYTCEADNAVGGITATGILTVH 294 

11:1: : II \> j::Mh: -I III III I II hi I II 
292 EGNIPVSRARILH-!: hDEKSLEISNITPTDEGTYVCEAHNNVGQISARASLIVH 342 

U 

IIGDEVIjFECQANGHPRPTLYWSVEGNSSLLLPGYRDGRMEVTLTP 354 

I j I 1:1:1 |:::h II hh I II I 

ilngwqlpcmasgnpppsvfwtkegvstlmfpnsshgrqyva- - - 399 
355 egrsvlsiarfareJsgkwt'Inalnavgsvssrtwsvdtqfelpppiieogpvnotlp 414 

II :]| I I j:| : I I : I : I : I Mil: II ||||| 
Db 400 -ADGTLQITDVRQEf EGYYV-|SAFSWDSSTVRVFLQVSSVDERPPPIIQIGPANQTLP 457 

415 VKSIWLPCRTLGfVPQVStLDGIPIDVQEHERRNLSDAGALTISDLQRHEDEGLYTC 474 

h llll 11 h: l| II II I :: :| : III I I III 
458 KGSVATLPCRATGIfSPRIWHDGHA"VQAGNRYSIIQGSSLRVDDLQL-SDSGTYTC 514 



475 VASNRNGKSSWSGf RLDTPlJJPNIKFFRAPELSTYPGPPGKPQMVEKGENSVTLSWTRS 534 

II I::ll: 1 :: li :: II : llll III I::: |::| I :| 

515 TASGERGETSWAA*TVEKPG|TSL--HRAADPSTYPAPPGTPKVLNVSRTSISLRWAKS 572 

535 NKVGGS • • SLVGYmEMFGRIETDGWVAVGTRVQNTTFTOTGLLPGVNYFFLIRAENSHG 592 

: h Mill I J lh II :l I :|l II :| Ihllll: I 

573 QEKPGAVGPIIGYipYFSPDtQTGWIVAAHRVGDTQVTISGLTPGTSYVFIiVRAEIJTQG 632 

593 LSLPSPMSEPITVGJTRYFN- ^GLDLSEARASLLSGDWELSNASWDSTSMKLIWQI ' - 648 

:|:M :| I j M HI II :|hl III Ml :::::::| I : 

633 isvpsglsnvikti|adfdaa!sandlsaar-tlltgksvelidasainasavrlewmlhv 691 



Qy 649 -INGRYVEGFYVYfl 

: Mill 
Db 692 SADEKYVEGLRIH! 

Qy 708 AAGKRDGETNQSGGj 



JPAPVTSNTNPLLGSTSTSASASASASALISTRPNIA 707 



3ASVP-; 



■ 711 



I 

Db 

Qy 

Db 



712 • 



\PTPLlfKYRMLTILNGGGASSCTITGLVQYTLYEFFIVPFYKSV 767 
|:l :|::: I I = I Ml MM: II:::: 
AQYHSITVMD-ASAESFWGNLRKYTKYEFFLTPFFETI 750 



768 EGKPSNSRlARTLElVPSEAffiGMEALLLNSSAVFLKWKAPELKDRHGVLLNYHVIVRGI 827 

Ihllll: I I Jill 1 =: MM "M I : M I I M 
751 EGQPSNSKTALTYEbVPSAPPDNIQIGMYKQTAGWVRWTPPPSQHHNGNLYGYKIEV- - - 807 



jInvtId? 



828 DTAHNFSRILINVflDAASPTLVLAlILTEGVMYTVGVAAGNNAGVGPYCVPATLRLDPIT 887 

M I =M hhM : ::M III I MM : : II III I M Ml I 
808 -SAGNTMKVLANMTLNATTTSVLLNNLTTGAVYSVRLNSFTKAGDGPYSRPISLFMDP-T 865 

888 KRLDP FINQRDH- -VNDVLTQPWFIILLGAI 916 

M II I I |: || M: : 

866 HHVHPPRAHPSGTHDGRHEGQDLTYHNNGNIPPGDINPTTHKKTTDYLSGPWLMVLVCIV 925 

917 LAVLMLSFG - AMVFVKRKHMMMRQ • SALNTMRGNHTSDVLKMPSLSARNGNGYWLDSSTG 974 
I lh:) Ml: llll I h h : I :: M: : IM I 

926 LLVLVI S AA I SMVYFKRKHQMTKELG HLS WSDN EITALNINSKESLWIDHHRG 979 

I 

975 GMVWRPSPGGDSLEMQKDHIADYAPVCGAPGSPAGGGTSSGGSGGAGSGASGGDDIHGGH 1034 

II : II III:: 
980 —WR TADTDKD SGLSESRLLSHVN 1001 

1035 GSERNQQRYVGEYstlP- -TDYAEVSSFGKAPSEYGRHGNASPAPYATSSILSPHQQQQQ 1092 

h I IM Mill : I M Mh h : 

1002 SSQSN YNNSDGGTDYAEVDTRNLTTFYNCRKSPDNPTPYATTMIIGTSSSETC 1054 



Qy 1093 QQPRYQQRPVPGYGLQRPMHPHYQQQQHQQQQAQQTHQQH-QALQQHQQLPPSNIYQQM 1150 
: I I : I : : |: : : MM 



Db 1055 TRTTSISADKDS-GTHSPYSDAFAGQVPAVPWKSNYLQYPVEPINWSEFLPP-- 



Qy 1151 STTSEIYPTNTGPSRSVYSEQYYYPKDKQRHIH ITENKLSNCHTYEAAPGAKQS 1204 

I "'I I I h :: I M : :: I : 

Db 1107 PPEHPPPSSTYGYAQGSPESSRKSSRSAGSGISTNQSILNASIHSSSSGGFSA 1159 

Qy 1205 SPISSQFASVRRQQLPP NCS IGRESARFKVLNTDQGKNQQNLLDLDGSSMC Y 1256 

M IM i II : I I'::: IM I 
Db 1160 WGVSPQYAVA- - - -CPPENVYSNPLSAVAGGTQNSYQITPTNQHPPQLPAY 1206 

Qy 1257 NGLADSGCGGSPSPMAMLMSHEDEHALYHTA DGDLDDMERLYVKVDEQQPP 1307 

I Mil:' I : : : I M : M III 

Db 1207 "FATTGPGGAVPPNHLPFATQRHAASEYQAGLNAARCAQSRACNSCDALATPSPMQPPP 1264 

QY 1308 QQQQQLIPLV PQHPAEGHLQSWRNQSTRSSRKNGQECIKEPSELIYAP 1355 

:h • I II M : Mlh :: 

Db 1265 P VPVPEGWYQPVHPNSHPMHPTSSNHQIY QCSSECSD- -HSR 1304 

Qy 1356 GSVASERSLLSNSGSGTSSQPAGHN 1380 

I : M I ' :: I lh 
Db 1305 SSQSHKRQLQLEEHGSSAKQRGGHH 1329 



RESULT 5 
Q9Y6N7 

ID Q9Y6N7 PRELIMINARY; PRT; 1651 AA. 

AC Q9Y6N7; 

DT 01-NOV-1999 (TrEMBLrel. 12, Created) 

DT 01-NOV-1999 (TrEMBLrel. 12, Last sequence update) 

DT 01-OCT-2000 (TrEMBLrel. 15, Last annotation update) 

DE ROUNDABOUT 1. 

GN ROBO1. 

OS Homo sapiens (Human). 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606; 

RN [1] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=98117249; PubMed=9458045; 

RA Kidd T., Brose K., Mitchell K.J., Fetter R.D., Tessier-Lavigne M., 

RA Goodman C.S., Tear G.; 

RT "Roundabout controls axon crossing of the CNS midline and defines a 

RT novel subfamily of evolutionarily conserved guidance receptors."; 

RL Cell 92:205-215(1998). 

DR EMBL; AF040990; AAC39575.1; -. 

DR HSSP; P56276; 1TLK. 

DR INTERPRO; IPR001777; -. 

DR INTERPRO; IPR003006; -. 

DR PFAM; PF00041; fn3; 3. 

DR PFAM; PF00047; ig; 5. 

SQ SEQUENCE 1651 AA; 180928 MW; 9D98CD7CAB73074D CRC64; 



Query Match 20.6%; Score 1498; DB 4; Length 1651; 

Best Local Similarity 28.9%; Pred. No. 9.5e-97; 

Matches 410; Conservative 221; Mismatches 526; Indels 260; Gaps 

Qy 4 PRIIEHPMDT'TVPKNDPFTFNCQAEGNPTPTIQWFKDGRELKTDTG- * -SHRIMLPAGGL 60 

IIIMII MM M I MM IIIIIMM I M 1 1 1 : : 11 : 1 I 
Db 58 PRIVEHPSDLIVSKGEPATLNCRAEGRPTPTIEWYRGGERVETDKDDPRSHRMLLPSGSL 127 

Qy 61 FFLKVIHSRR-'ESDAGTYWCEARNEFGVARSRNATLQVAVLRDEFRLEPANTRVAQGEVA 119 

lll:::| h' I I M hi Ml IIMMIMIIMI h: II II I 
Db 128 FFLRIVHGRKSRPDEGVYVCVARNYLGEAVSHNASLEVAILRDDFRQNPSDVMVAVGEPA 187 

Qy 120 LMECGAPRGSP'EPQISWRRNGQTLNLVGNKRIRI-VDGGNLAIQEARQSDDGRYQCWKN 178 

-MM III >ll 1 1 1 : 1 : 1 h M II MM I IMI IM II I 

Db 188 VMECQPPRGHPEPTISWKKDGSPLD- • ■DKDERITIRGGRLMITYTRKSDAGKYVCVGTN 244 

Qy 179 WGTRESATAFLKVHVRPFLIRGPQNQTAWGSSWFQCRIGGDPLPDVLWRRTASGGNM 238 

Ml III I II II :M I II IM Ml III: M 
Db 245 MVGERESEVAELTVLERPSFVRRPSNLAVTVDDSAEFKCEARGDPVPTVRWRK-DDGEL 302 



Qy 239 PLRKFSWLHSASGRVHVLEDRSLKLDDVTLEDMGEYTCEADNAVGGITATGILTVHAPPK 298 

I I : :| :M: II III III hi II h III II 

Db 303 P KSRYEIRDDHTLKIRKVTAGDMGSYTCVAENMVGKAEASATLTVQEPPH 352 

Qy 299 FVIRPKNQLVEIGDEVLFECQANGHPRPTLYWSVEGNSSLLL— PGYRDGRMEVTLTPE 355 

H::|::|:| :| I hhl hhl ::l l|: :|| I I |: I : 
Db 353 FYVKPRDQVVALGRTVTFQCEATGNPQPAIFWRREGSQNLLFSYQPPQSSSRFSVSQTGD 412 

Qy 356 GRSVLSIARFAREDSGRVWCNALNAVGSVSSRTVVSV-DTQFELPPPIIEQGPVNQTLP 414 

hi I I I : I II II: :: : I I : |||:| HUM: 
Db 413 — LTITNVQRSDVGyYI-CQTLNVAGSIITKAYLEVTDVIADRPPPVIRQGPVNQTVA 467 

Qy 415 VKSIWLPCRTLGTPVPQVSWYLDGIPIDVQEHERRNLSDAGALTISDLQRHEDEGLYTC 474 

I II I hill : I II: : I: : I : I I I : I I III 

Db 468 VDGTFVLSCVATGSPVPTILWRKDGVLVSTQDSRIKQLEN-GVLQIR-YAKLGDTGRYTC 525 

Qy 475 VASNRNGKSSWSGYLRLD TPTNPNIRFFRAPELSTYPGPPGRPQMVERGEN 525 

:M :|:::M h : 1 1 : 1 1 : I I II:: : I 

Db 526 IASTPSGEATWSAYIEVQEFGVPVQPPRPTDPNL IPSAPSKPEVTDVSRN 575 

Qy 526 SVTLSWTRSNKVGGSSLVGYVIEMFGKNETDGWVAVGTRVQNTTFTQTGLLPGVNYFFLI 585 
Wk :IHII : I h: hll I I I h I II I I II: 

Db 576 TVTLSW-QPNLNSGATPTSYIIEAFSHASGSSWQTVAENVKTETSAIKGLKPNAIYLFLV 634 

Qy 586 RAENSHGLSLPSPMSEP I - TVGTRYFNSGLDLSEARASLLSGDWELSNASWDSTSMKL 644 

II h:hl II :hh I : 1:1 : : I I: | :|: |:|::: 

Db 635 RAANAYGISDPSQISDPVKTQDVLPTSQGVDHKQVQRE-LGNAVLHLHNPTVLSSSSIEV 693 

Qy 645 TWQI - INGRYVEGFYVYARQLPNPIVNNPAPVTSNTNPLLGSTSTSASASASASALISTR 703 
I : :h:h : I 

Db 694 HWTVDQQSQYIQGYKILYR 712 

Qy 704 PNIAMGKRDGETKQSGGGAPTPLNTKYRMLTILNGGGASSCTITGLVQYTLYEFFIVPF 763 

:! Ih: II :| I I : II II 

Db 713 — PSGANHGESDWLVFEVRTP AKNSWIPDLRRGVNYEIKARPF 754 

Qy 764 YKSVEGKPSNSRIARTLEDVPSEAPYGMEALLL-NSSAVFLKWKAPELKDRHGVLLNYH 821 

: :| I : hllh II I h I :|: : h I ::|:: I 
Db 755 FNEFQGADSEIRFMTLEEAPSAPPQGVTVSRNDGNGTAILVSWQPPPEDTQNGMVQEYR 814 

Qy 822 VIVRGIDTAHNFSRILTNVTIDAASKTLVLANLTEGVMYTVGVAAGNNAGVGPYCVPATL 881 

I I :| :: I hi :: t:|: I h hi III II I I : 
Db 815 VWCLGNETRYHI NKTVDGSTF5WIPFLVPGIRYSVEVAASTGAGSGVKSEPQFI 869 

Qy 882 RLDPITKRLDP - • FINQRDHVNDVLlipWFI ILLGAI LAVLMLSFGAMVFVKRKHMMMKQ 939 

:|| = I := : = lh \\ II :|| :::: I :: :| : 
Db 870 QLDAHGNPVSPEDQVSLAQQISDWKQPAFIAGIGAACWIILMVFSIWLY—RHRKKRN 926 

Qy 940 SALNTMRGNHTSDVLKMPSLS ■ ARNGNG YWLDSSTGGMVWRPSPGGDSLEMQKD 992 

»' :| I : hll : I I hll II I : 

927 GLTSTYAG— -^RKVPSFTFTPTVTYQRGGEAV—SSGG— RPGLLNISEPAAQP 974 

Qy 993 HIADYAPVCGAPGSPAGGGTSSGGSGGAGSG — ASGGDDIHGGHGSERNQQRYV — 1044 

:M I I : : hi : I : I I : |:| : 
Db 975 WLADTWPNTGNNHtnXSISCCTAGNGNSDSNLTTYSRPADCIAWrNNQLDNKQTNLMLPE 1034 

Qy 1045 — GEYSNIPTDYAEVSSFGKAPSEYGRHGNAS ■ -PAPYATSSILSPHQQQQQQQPRYQ 1098 

I: :: h :| MINI lllh :: : 
Db 1035 STVYGDV-DLSNKINEMRTFNSPNLRDGRFVNPSGQPTPYATTQLIQSNLSNNMNN — 1089 

Qy 1099 QRPVPGYGLQRPMH- -PHYQQQQH- - -QQQQAQQTHQQHQALQQHQQLPPSNIYQQMSTT 1153 

II II hi I ::::::: :||: I I 
Db 1090 GSGDSGEKHWKPLGQQKQEVAPVQYNIVEQNKLNKDYRANDTVPPTIPYNQS— 1141 

Qy 1154 SEIYPTNTGPSRSVYSEQYYYPRDRQRHIHITENKLSNCHTYEAAPGARQSSPISSQFAS 1213 

I III I II : II II 

Db 1142 •■•YDQNTGGS YNSSD RGSSTSGSQG" 1164 

Qy 1214 VRRQQLPPNCSIGRESARFKVLNTDQGRNQQNLLDLDGSSMCYNGLADSGCGGSPSPMAM 1273 

:: II : I I :| III 
Db 1165 HRRGARTPRVPRQGGMNWADLL PPPPAH 1192 

Qy 1274 LMSHEDEHALYHTADGDLDD MERLYVKVDEQQPPQQQQQLIPLVPQHPAEGHL 1326 



I : ; : I I hh: II : : :: I I : 

1193 PPPHSNSEEYNISVDESYDQEMPCPVPPARMYLQQDELEEEEDERGPTPPVRGAASSPAA 1252 



Qy 1327 QSWRNQSTRSSRRNGQE CIREPSELIYAP 1355 

I: :IM :; : II I '\ : : I 

Db 1253 VSYSHQSTATLTPSPQEELQPMLQDCPEETGHMQHQP 1289 



RESULT 6 
O89026 

ID O89026 PRELIMINARY; PRT; 1612 AA. 

AC O89026; 

DT 01-NOV-1998 (TrEMBLrel. , Created) 

DT 01-NOV-1998 (TrEMBLrel. , Last sequence update) 

DT 01-OCT-2000 (TrEMBLrel. 15, Last annotation update) 

DE DUTT1 PROTEIN. 

GN ROBO1 OR DUTT1. 

OS Mus musculus (Mouse). 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus. 

OX NCBI_TaxID=10090; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC TISSUE=BRAIN; 

RA Wu M.C., Lowe N., Fordham R., Rabbitts P.; 

RT "The mouse homologue of human DUTT1/H-robo1 gene: protein sequence and 

RT chromosomal location."; 

RL Submitted (JUL-1998) to the EMBL/GenBank/DDBJ databases. 

DR EMBL; Y17793; CAA76850.1; -. 

DR HSSP; P56276; 1TLK. 

DR MGD; MGI:1274781; Robo1. 

DR INTERPRO; IPR001777; -. 

DR INTERPRO; IPR003006; -. 

DR PFAM; PF00041; fn3; 3. 

DR PFAM; PF00047; ig; 5. 

SQ SEQUENCE 1612 AA; 176406 MW; 5F2988C544796B4B CRC64; 



Query Match 20.5%; Score 1489.5; DB 11; Length 1612; 

Best Local Similarity 28.9%; Pred. No. 3.6e-96; 

Matches 402; Conservative 231; Mismatches 516; Indels 243; Gaps 

Qy 4 PRI IEHPMDTIVPRNDPFTFNCQAEGNPTPTIQWFRDGRELRTDTG • ■ ■ SHRIMLPAGGL 60 

Ihlll I 'I I :| I hill 11111:1:1 I ::|| ll|::lhl I 
Db 29 PRIYEHPSDLIVSRGEPATLNCRAEGRPTPTIEWYKGGERVETDRDDPRSHRMLLPSGSL 88 

Qy 61 FFLKVIHSRR'ESDAGTYWCEARNEFGVARSRNATLQVAVLRDEFRLEPANTRVAQGEVA 119 

M:::| I I I I hi III 1 1 : 1 : 1 1 : 1 1 1 : 1 1 h: II II I 
Db 89 FFLRIVHGRKSRPDEGVYICVARNYLGEAVSHNASLEVAILRDDFRQNPSDVMVAVGEPA 148 

Qy 120 LMECGAPRGSPEPQISWRRNGQTLNLVGNKRIRI-VDGGNLAIQEARQSDDGRYQCWKN 178 

:||| III III 1 1 1 : 1 : 1 h :| II : II I I hll hi II I 
Db 149 VMECQPPRGHPEPTISWRRDGSPLD- - -DKDERITIRGGRLMITYTRKSDAGKYVCVGTN 205 

Qy 179 WGIRESAIAE'LKVHVRPFLIRGPQNQTAWGSSWFQCRIGGDPLPDVLWRRTASGGNM 238 

:|| III M I II :: I I I I hi Ih I Ih I : 
Db 206 MVGERESEVAELTVLERPSFVKRPSNLAVTVDDSAEFKCEARGDPVPTVRWRK-DDGEL 263 

Qy 239 PLRRFSWLHSASGRVHVLEDRSLRLDDVTLEDMGEYTCEADNAVGGITATGILTVHAPPR 298 

I ■ I : : :||: II III III hi II h III II 

Db 264 P "RSRYEIRDDHTLKIRKVTAGDMGSYTCVAENMVGKAEASATLTVQEPPH 313 

Qy 299 FVIRPRNQLVEIGDEVLFECQANGHPRPTLYWSVEGNSSLLL" -PGYRDGRMEVTLTPE 355 

!l::h:hl':| I hhl hhl ::| Ih :M I 

Db 314 FWKPRDQWALGRTVTFQCEATGNPQPAIFWRREGSQNLLFSYQPPQSSSRFSVSQTGD 373 

Qy 356 GRSVLSIARFAREDSGRWTCNALNAVG'SVSSRTWSV-DTQFELPPPIIEQGPVNQTLP 414 

hh 'I I I : I II Ih :: ': I I : llhl llllllh 
Db 374 LT ITNVQRSDVGYYI - CQTLNVAGS I ITKAYLEVTDVIADRPPPV IRQGPVNQTVA 428 

Qy 415 VKSIWLPCRTLGTPVPQVSWYLDGIPIDVQEHERRNLSDAGALTISDLQRHEDEGLYTC 474 

I ::| I h! i : ! Ih : !: : | ::| I I : I I III 
Db 429 VDGTLILSCVf.TGSPAPTILWRKDGVLVSTQDSRIKQL-ESGVLQIR-YAKLGDTGRYTC 486 
Qy 475 VASNRNGKSSWSGYLRLD TPTNPNIKFFRAPELSTYPGPPGKPQMVEKGEN 525 

II :h::|l h : M : 1 1 : I I II:: : :| 

Db 487 TASTPSGEATWSAYIEVOEPGVPVQPPRPTDPNL IPSAPSKPEVTDVSKN 536 

Qy 526 SVTLSWTRSNKVGGSSLVGYVIEMFGRNETDGWVAVGIRVQNTTFTQTGLLPGVNYFFLI 585 

:|HII : I h: Ml I I h II II I ||: 
Db 537 TVTLSW-QPNLNSGATPTSYIIEAFSHASGSSWQTAAEMVKTETPAIKGLKPNAIYLFLV 595 

Qy 586 RAENSHGLSLPSPMSEPI-TVGTRYFNSGLDLSEARASLLSGDW-ELSNASWDSTSMK 643 

II I::|:| IMM: I : |:| : : I |:|| I I ::: |:|:: 
Db 596 RAANAYGISDPSQISDPVKTQDVPPTSQGVDHKQVQREL—GNWLHLHNPTILSSSSVE 653 

Qy 644 LTWQI - INGRYVEGFYVYARQLPNPIVNNPAPVTSNTNPLLGSTSTSASASASASALIST 702 

: I : :|::|: : I 
Db 654 VHWTVDQQSQYIQGYKILYR 673 

Qy 703 KPNIAAAGKRDGETNQSGGGAPTPLNTKYRMLTILNGGGASSCTITGLVQYTLYEFFIVP 762 
:| II: II II :| I I : II I 

674 PSGASHGESEWLVFEVRTP- -TR NSWIPDLRRGVNYEIRARP 714 

Qy 763 FYRSVEGKPSNSRIARTLEDVPSEAPYGMEALLL-NSSAVFLKWRAPELKDRHGVLLNY 820 

I: :l I : 1 : 1 1 1 : III: I :|: : I: I ::|:: I 
Db 715 FFNEFQGADSEIKFAKTLEEAPSAPPRSVTVSKNDGNGTAILVTWQPPPEDTQNGMVQEY 774 

Qy 821 HVIVRG IDTAHNFSRILTKVT IDAASPTLVLANLTEGVMYTVGVAAGNNAGVGPYCVPAT 880 

I I :l :: I hi :: ::|: :l I: hi III II I I 
Db 775 KVWCLGNETRYHI NKTVDGSTFSWIPSLVPGIRYSVEVAASTGAGPGVKSEPQF 829 

Qy 881 LRLDPITKRLDP - -FINQRDHVNDVLTQPWIILLGAILAVLMLSFGAMVFVKRKHMMMR 938 

::|| : I :: ::||: II II :|| :::: I :: :| : 
Db 830 IQLDSHGNPVSPEDQVSLAQQISDWRQPAFIAGIGAACWIILMVFSIWLY— RHRKKR 886 

Qy 939 QSALNTMRGNHTSDVLRMPSLS ARNGNGYWLDSSTGGMVWRPSPGGDSLEMQR 991 

:| I : Ml : I I Ml II I : 

Db 887 NGLTSTYAG IRKVPSFTFT PTVTYQRGGEAV — SSGG — RPGLLNI SEPATQ 934 

Qy 992 DHIADYAPVCGAPGSPAGGGTSSGGSGGAGSG — ASGGDDIHGGHGSERNQQRYV — 1044 

Mill: : hi : I : I I : |:| : 
Db 935 PWLADTWPNTGNNHNDCSINCCTAGNGNSDSNLTTYSRPADCIANYNNQLDNKQTNLMLP 994 

Qy 1045 GEYSNIPTDYAEVSSFGKAPSEYGRHGNAS-PAPYATSSILSPH 1087 

I: " h :| Ml II I MM: :: : 
Db 995 ESTWGDV-DLSNRINEMRTFNSPNLRDGRFVNPSGQPTPYATTQLIQANLSNNMNNGAG 1053 

Qy 1088 -QQQQQQQPRYQQRPVPGYGLQRPMHPHYQQQQHQQQQAQQTHQQHQALQQHQQLPPSNI 1146 

B :: M II: I I: :: M : :||: 

P) 1054 DSSERHWRPPGQQR PEVAPIQYNIMEQNRLNKDYRA— NDTIPPTIP 1098 



Qy 1147 YQQMSTTSEIYPTNTGPSRSVYSEQYYYPKDKQRHIHITENKLSKCHTYEAAPGAKQSSP 1206 

II I III I II: :: Ml II 

Db 1099 YNQS YDQNTGGS YNSSDRGSSTSGSQGHRKGARTPKA— PRQGGM 1141 

Qy 1207 ISSQFASVRRQQLPPNCSIGRESMFRVLNTDQGRNQQNLLDLDGSSMCY 1256 

: ||: : | : :: |: :|: : : | 

Db 1142 NWADLLPPPPAHPPPHSN — SEEYN-MSVDESYDQEMPCPVPPAPMYLQQDELQEEED 1196 

Qy 1257 -NGIiADSGCGGSPSPMAMLMSHEDEHALYHTADGDLDDMERLYVRVDEQQPPQQQQQLIP 1315 

I M II I: II: I I M I 

Db 1197 ERGPTPPVRGAASSPAAVSYSHQSTATL TPSPQEELQP 1234 



Qy 1316 LVPQHPAE-GHL 1326 

:: I : ||: 
Db 1235 MLQDCPEDLGHM 1246 



RESULT 7 
O55005 

ID O55005 PRELIMINARY; PRT; 1651 AA. 

AC O55005; 

DT 01-JUN-1998 (TrEMBLrel. 06, Created) 

DT 01-JUN-1998 (TrEMBLrel. 06, Last sequence update) 

DT 01-OCT-2000 (TrEMBLrel. 15, Last annotation update) 



DE TRANSMEMBRANE R 

OS Rattus norvegicus (Rat). 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Rattus. 

OX NCBI_TaxID=10115; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC TISSUE=SPINAL CORD; 

RX MEDLINE=98117249; PubMed=9458045; 

RA Kidd T., Brose K., Mitchell K.J., Fetter R.D., Tessier-Lavigne M., 

RA Goodman C.S., Tear G.; 

RT "Roundabout controls axon crossing of the CNS midline and defines a 

RT novel subfamily of evolutionarily conserved guidance receptors."; 

RL Cell 92:205-215(1998). 

DR EMBL; AF041082; AAC39960.1; -. 

DR HSSP; P56276; 1TLK. 

DR INTERPRO; IPR001777; -. 

DR INTERPRO; IPR003006; -. 

DR PFAM; PF00041; fn3; 3. 

DR PFAM; PF00047; ig; 5. 

KW Transmembrane. 

SQ SEQUENCE 1651 AA; 180746 MW; FA2452DD46E186B7 CRC64; 



Query Match 20.0%; Score 1455.5; DB 11; Length 1651; 

Best Local Similarity 28.8%; Pred. No. 9.5e-94; 

Matches 406; Conservative 229; Mismatches 525; Indels 251; Gaps 45; 

Qy 4 PRIIEHPMDTTVPRNDPFTFNCQAEGNPTPTIQWFRDGRELRTDTG- ■ -SHRIMLPAGGL 60 

IIIMM I .1 I M I IIMII IMIIMM I M 1 1 1 : : 1 1 : 1 I 
Db 68 PRIVEHPSDLIVSRGEPATLNCRAEGRPTPTIEWYRGGERVETDRDDPRSHRMLLPSGSL 127 

Qy 61 FFLKVIHSRR-ESDAGTYWCEARNEFGVARSRNATLQVAVLRDEFRLEPANTRVAQGEVA 119 

II!::: !: I I I I |: Ml MMMIMIIMI I- II II I 
Db 128 FFLRIVHGRKSRPDEGVYICVARNYLGEAVSHNASLEVAILRDDFRQNPSDVMVAVGEPA 187 

Qy 120 LMECGAPRGSPEPQISWRRNGQTLNLVGNRRIRI-VDGGNLAIQEARQSDDGRYQCWRN 178 

MM III III IIIMM I: M II : II I I IMI IM II I 
Db 188 VMECQPPRGHPEPTISWKKDGSPLD- - -DKDERITIRGGKLMITYTRKSDAGKYVCVGTN 244 

Qy 179 WGTRESATAFLKVHVRPFLIRGPQNQTAWGSSWFQCRIGGDPLPDVLWRRTASGGNM 238 

Ml III I:: III :: II II IM MM II: I : 
Db 245 MVGERESRVADVTVLERPSFVKRPSNLAVTVDDSAEFRCEARGDPVPTFGWRR • • DDGEL 302 

Qy 239 PLRKFSWLHSASGRVHVLEDRSLRLDDVTLEDMGEYTCEADNAVGGITATGILTVHAPPK 298 

I ' I : M Ml: II III III IM II I: III II 

Db 303 P -RSRYEIRDDHTLRIRKVTAGDMGSYTCVAENMVGKAEASATLTVQEPPH 352 

Qy 299 FVIRPRNQLVSIGDEVLFECQANGHPRPTLYWSVEGNSSLLL— P6TRDGRMEVTLTPE 355 

lh:|::|:M I IMM IMM ::| II: Ml I MM: 
Db 353 FWRPRDQWALGRTVTFQCEATGNPQPAIFWRREGSQNLLFSYQPPQSSSRFSVSQTGD Vil 

Qy 356 GRSVLSIARFAREDSGRWTCNALNAVGSVSSRTWSV-DTQFELPPPIIEQGPVNQTL? i'A 

I:: '\ I I M II II: :: : I I : MM MUM 
Db 413 — -LTVTNVQRSDVGYYI-CQTLNVAGSIITRAYLEVTDVIADRPPPVIRQGPVNQTVA 467 

Qy 415 VRSIWLPCRTLGTPVPQVSWYLDGIPIDVQEHERRNLSDAGALTISDLQRHEDEGLYTC 474 

I Ml" IMM M M : |: : I ::| I I : I I III 

Db 468 VDGTLTLSCVATGSPVPTILWRKDGVLVSTQDSRIRQL-ESGVLQIR-YARLGDTGRYTC 525 

Qy 475 VASNRNGKSSWSGYLRLD TPTNPNIRFFRAPELSTYPGPPGRPQMVEKGEN 525 

II M::: : ll I: : IIMh I I II" : M 

Db 526 TASTPSGEATWSAYIEVQEFGVPVQPPRPTDPNL IPSAPSRPEVTDVSKN 575 

Qy 526 SVTLSWTRSNSVGGSSLVGYVIEMFGRNETDGWVAVGTRVQNTTFTQTGLLPGVNYFFLI 585 

Mil I : I- I:: IMI I II IMI III I II: 
Db 576 TVTLLW-QPNLNSGATPTSYIIEAFSHASGSSWQTVAENVKTETFAIKGLRPNAIYLFLV 634 

Qy 586 RAENSHGLSLPSPMSEPI-TVGTRYFNSGLDLSEARASLLSGDW-ELSNASWDSTSMR 643 

II IMM .11 MM: | IM : : I IMI I I ::: IM" 

Db 635 RAANAYGISDPSQISDPVKTQDVPPTTQGVDHRQVQREL-GNWLHLHNPTILSSSSVE 692 

Qy 644 LTWQI -INGRYVEGFYVYARQLPNPIVNNPAPVTSNTNPLLGSTSTSASASASASALIST 702 
: I : :|::|: : I 
693 VHWTVDQQSQYIQGYKILYR-- 



Qy 703 KPNIAMGKRDGETNQSGGGAPTPLNTKYRMLTILNGGGASSCTITGLVQYTLYEFFIVP 762 

:| II: II II :[ I I : I! I 

Db 713 PSGASHGESEWLVFEVRTP-TK NSWIPDLRRGVNYEIRARP 753 

Qy 763 FYKSVEGKPSNSRIARTLEDVPSEAPYGMEALLL--NSSAVFLKWKAPELKDRHGVLLNY 820 

I: :l I : hllh II I : I :|: : h I "h: I 
Db 754 FFNEFQGADSEIKFAKILEERPSAPPRSVTVSKNDGNGTAILVTWQPPPEDTQNGMVQEY 813 

Qy 821 HVIVRGIDTAHNFSRILTNVTIDAASPTLVLANLTEGVMYTVGVAAGNNAGVGPYCVPAT 880 

I hi:: | |:| ::|: | : |:| I I I I 

Db 814 KVWCLGNETRYHI NRTVDGSTFSWIPFLVPGIRYSVEVAASTGAGPGVKSEPQF 868 

Qy 881 LRLDPITKRLDP^-FINQRDHVNDVLIQPWFIILKAILAVL^SFGAMVFVKRKHMMK 938 

::|| : I :: ::lh II II :|| I " II I 

Db 869 IQLDSHGNPVSPEDQVSLAQQISDWKQPAFIAGIGAACWIILMVFSIWLYRHRK — R 924 

Qy 939 QSALNTMRGNHTSDVLRMPSLS ARNGNGYWLDSSTGGMVWRPSPGGDSLEMQR 991 

:: h: : : hll : I I hll II I : 

Ib 925 RNGLSST— -YAGIRKVPSFTFTPTVTYQRGGEAV— -SSGG— RPGLLNISEPATQ 973 
y 992 DHIADYAPVCGAPGSPAGGGTSSGGSGGAGSG-— ASGGDDIHGGHGSERNQQRYV— 1044 

:|| I I : : :| : I : I I : hi : 
Db 974 PWLADTWPNTGNSHNDCSINCCTASNGNSDSNLTTYSRPADCIANYNNQLDNKQTNLMLP 1033 

Qy 1045 GEYSMPTDYAEVSSFGKAPSEYGRHGNAS--PAPYATSSILSPHQQQQQQQPRY 1097 

I: :: h :| : II I I I MM: :: : 
Db 1034 ESTVYGDV-DLSNRINMTFNSPNLRDGRFVNPSGQPTPYATTQLIQANLINNMNN-" 1089 

Qy 1098 QQRPVPGYGLQRPMHPHYQQQQHQQQQAQQTHQQHQALQQHQQLPPSNIYQQMSTTSEIY 1157 

II ::| : II :| I h' : : : : I 

Db 1090 GGG DSSEKHWRPPGQQ--RQEVAPIQYNIMEQNKLNRDYRANDTIL 1133 

Qy 1158 PTNTGPSRSVYSEQYYYPRDKQRHIHITENRLSNCHTYEAAPGAKQSSPISSQFASVRRQ 1217 

II ,11 hh I :| I : :| 

Db 1134 PT IPYN HS YDQNTGGS YNS - ■ SDRGSSTSGS 1162 

Qy 1218 QLPPNCSIG-RESARFKVLNTDQGKNQQNLLDLDGSSMCYNGLADSGCGGSPSPMAMLMS 1276 

I I :: II I I HI III 

Db 1163 Q GHRKGARTPRAPRQGGMNWADLL PPPPAHPPP 1195 

Qy 1277 HEDEHALYHTADGDLDD MERLYVRVDEQQPPQQQQQLIPLVPQHPAEGHLQSW 1329 

I : : I I hh: || : : :: | | : |: 

Db 1196 HSNSEEYSMSVDESYDQEMPCPVPPARMYLQQDELEEEEAERGPTPPVRGAASSPAAVSY 1255 

Qy 1330 RNQSTRSSRKNGQECIKE PSELIYAP 1355 

:MI : : II :: I :l : I 
Db 1256 SHQSTATLT PS PQEELQPMLQDCPEDLGHMP 1286 



RESULT 8 
044928 

ID 044928 PRELIMINARY; PRT; 1273 AA. 

AC 044928; 

DT 01-JUN-1998 (TrEMBLrel. 06, Created) 

DT 01-JUN-1998 (TrEMBLrel, 06, Last sequence update) 

DT 01-JUN-2000 (TrEMBLrel, 14, Last annotation update) 

DE SAX-3, 

GN SAX-3. 

OS Caenorhabditis elegans, 

OC Eukaryota; Metazoa; Nematoda; Chromadorea; Rhabditida; Rhabditoidea; 

OC Rhabditidae; Peloderinae; Caenorhabditis. 

OX NCBI_TaxID=6239;

RN [1] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=98117250; PubMed=9458046;

RA Zallen J.A., Yi B.A., Bargmann C.I.;

RT "The conserved immunoglobulin superfamily member SAX-3/Robo directs 

RT multiple aspects of axon guidance in C. elegans."; 

RL Cell 92:217-227(1998). 

DR EMBL; AF041053; AAC38848.1; -. 



DR HSSP; P56276; 1TLK.
DR INTERPRO; IPR001777; -.
DR INTERPRO; IPR003006; -.
DR PFAM; PF00041; fn3; 3.
DR PFAM; PF00047; ig; 5.
DR PRINTS; PR00014; FNTYPEIII.
SQ SEQUENCE 1273 AA; 139427 MW; 013E766B51A7BAD7 CRC64;



Query Match 18.7%; Score 1361.5; DB 5; Length 1273; 

Best Local Similarity 28.4%; Pred. No. 2.8e-87; 

Matches 357; Conservative 202; Mismatches 437; Indels 259; Gaps 37; 

Qy 4 PRIIEHPMDTTVPKNDPFTFNCQAEGNPTPTIQWFKDGREL— KTDTGSHRIMLPAGGL 60 

Mlllhl J I : I I II h : I I hllh : I MM: 
Db 31 PVIIEHPIDVWSRGSPATLNCGAKPS-TAKITWYKDGQPVITNKEQVNSHRIVLDTGSL 89 

Qy 61 FFLKVIHSR- - RESDAGT YWCEAKNEFGVARSRNATLQVAVLRDEFRLEPANTRVAQGEV 118 

I III : i::|||l hi I II I M :h:|:||::||: I : lh 
Db 90 FLLKVNSGKNGKDSDAGAYYCVASNEHGEVKSNEGSLKLAMLREDFRVRPRTVQALGGEM 149 

Qy 119 ALMECGAPRGSPEPQISWRKNGQTLNLVGNKRIRIVDGGNLAIQEARQSDDGRYQCWKN 178 

|::|| I ! I" M I Mill: : I : I : III I Ml I Mil I 
Db 150 AVLECSPPRGFPEPWSWRKDDKELRIQDMPRYTLHSDGNLIIDPVDRSDSGTYQCVANN 209 

Qy 179 WGTRESATAFLKVHVRPFLIRGPQNQTAWGSSWFQCRIGGDPLPDVLWRRTASGGNM 238 

Ml I MM I M : h: I 1 1 : : I : I lh III I : hi I 
Db 210 MVGERVSNPARLSVFEKPKFEQEPKDMTVDVGAAVLFDCRVTGDPQPQITWKR--KNEPM 267 

Qy 239 PLRKFSWLHSASGRVHVLED-RSLKLDDVTLEDMGEYTCEADNAVGGITATGILTVHAPP 297 

h " I :: Ml I::: -I I II I I I I I : h I I 1 1 1 

Db 268 PVT RAYIAKDNRGLRIERVQPSDEGEYVCYARNPAGTLEASAHLRVQAPP 317 

Qy 298 KFVIRPKNQLVEIGDEVLFECQANGHPRPTLYWSVEGNSSLLLPGY-RDGRMEVTLTPE 355 

I M M I I III I I I Ml II II I I Ml M: I 
Db 318 SFQTRPADQSVPAGGTATFECTLVGQPSPAYFWSKEGQQDLLFPSYVSADGRTKVSPT- 375 

Qy 356 GRSVLSIARFAREDSGKWTCNALNAVGSVSSRTWSVDTQF ELPPPIIEQGPV 409 

hi •: I I II M: II h : I h III II I 

Db 376 --GTLTIEEVRQVDEGAYV-CAGMNSAGSSLSKAALKVTTKAVTGNTPAKPPPTIEHGHQ 432 

Qy 410 NQTLPVKSIWLPCRTLGTPVPQVSWYLDGIPIDVQEHERRNLSDAGALTISDLQRHEDE 469 

Mil I I MM: | | | Ml IIMIh : I : IM IMh: I 
Db 433 NQTLMVGSSAILPCQASGKPTPGISWLRDGLPIDITD-SRISQHSTGSLHIADLKK-PDT 490 

Qy 470 GLYTCVASNRNGKSSWSGYLRLDTPTNPNIKFFRAPELSTYPGPPGKPQMVEKGENSVTL 529 

IMIIM I :hlMI I :: h I M I h I M I M M : I I 
Db 491 GVYTCIAKNEDGESTWSASLTVEDHTS-NAQFVRMPDPSNFPSSPTQPIIVNVTDTEVEL 549 

Qy 530 SWTRSNKVGGSSLVGYVIEMFGKNETDGWVAVGTRVQNTTFTQTGLLPGVNYFFLIRAEN 589 

I : I , : MM: : : : | :| ; || I : IMIIM 
Db 550 HWNAPSTSGAGPITGYIIQYYSPDLGQTWFNIPDYVASTEYRIRGLKPSHSYMFVIRAEN 609 

Qy 590 SHGLSLPSPMSEPITVGTRYFNSGL DLSEARASLLSGDWELSNASWDSTSMK 643 

I: II I. M I I:: I I I ::M MM" 

Db 610 EKGIGTPSVSSALVTTSKPAAQVALSDKNKMDMAIAEKRLTSEQLIKLEEVKTINSTAVR 669

Qy 644 LTWQIIN-GKiTEGFYVYARQLPNPIVNNPAPVTSNTNPLLGSTSTSASASASASALIST 702 

I I: :::hh I I Ml : IM 
Db 670 LFWKKRKLEEIiIDGYYIKWR GPPRTNDNQYVNVTSPS 705 

Qy 703 KPNIAAAGKRDGETNQSGGGAPTPLNTKYRMLTILNGGGASSCTITGLVQYTLYEFFIVP 762 

■ ! : :: h M IMhM 

Db " 707 TENYWSNLMPFTNYEFFVIP 727 

Qy 763 FYK---SVEGKPSNSRIARTLEDVPSEAPYGMEALLLNSSAVFLRWKAPELKDRHGVLLN 819

:: h Mill Mill: Ml : : : lllh MM 
Db 728 YHSGVHSIHGAPSNSMDVLTAEAPPSLPPEDVRIRMLNLTTLRISWKAPKADGINGILKG 787 

Qy 820 YHVIVRGIDTAHNFSRILTNVTIDAASPTLVLANLTEGVMYTVGVAAGNNAGVGPY-CV 877 

: ::: I MM IM : : :: I M h I : III M III 
Db 788 FQIVIVG-QAPNNNR— NITTNERAASVTLFHLVTGMTYRIRVAARSNGGVGVSHGTS 842 
Qy 878 PATLRLDPITKRL DPFIN-- •QRDHVNDVLTQPWFIILLGAILAVLMLSFGAMV 928

: I : I I : I: : II :|:: III : :: I 

Db 843 EVIMNQDTLEKHLAAQQENESFLYGLINKSHVP VIVIVAILIIFWIIIAYC 894 

Qy 929 FVK RKHMMMKQSALNTMRGNHTSDVLKMPS LSARNGNGYWL 969 

I : : ::: I I: II : |: :: I II I 

Db 895 YWRNSRNSDGKDRSFIKINDGSVH-MSNNLWDVAQNPNQNPMYNTAGRMTMNNRNGQAL 953 

Qy 970 DSST GGMVWRPSPGGDSLEMQKDH I ADYAPVCGAPGSPAGGGT SSGG 1016 

I I hll :l II : I II: 
Db 954 YSLTPNAQDFFNNCDDYSGTMHRPG SEHHYHYAQLTGGPGN 994 

Qy 1017 SGGAGSGASGGDDIHGGHGSERNQQRYVGEYSNIPTDYAEVSSFGKAPSEYGRHGNASPA 1076 

:|:| II : I: 
Db 995 AMSTF YGNQYHDDPS 1009 

Qy 1077 PYATSSILSPHQQ QQQQQPRYQQRPVPGYGLQRPMHP HYQQQQHQQQQA 1125 

m Mil:::: :|l : | ||| | | | :: : : 
Db 1010 PYATTTLVLSNQQPAWLNDKMLRAPAMPTNPVP PEPPARYADHTAGRRSRSSRA 1063



Qy 1126 QQ -THQQHQALQQHQQLPPSNI-YQQMSTTSEIYPTNTGPSRSVYSEQ 1171

I: I: I::: II::: III: h

Db 1064 SDGRGTLNGGLHHRTSGSQRSDSPPHTDVSYVQLHSSD GIGSSKERTGER 1113



RESULT 9
Q9QZI3

ID Q9QZI3 PRELIMINARY; PRT; 1060 AA.
AC Q9QZI3;
DT 01-MAY-2000 (TrEMBLrel. 13, Created)
DT 01-MAY-2000 (TrEMBLrel. 13, Last sequence update)
DT 01-OCT-2000 (TrEMBLrel. 15, Last annotation update)
DE REPULSIVE GUIDANCE RECEPTOR (FRAGMENT).
OS Rattus norvegicus (Rat).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Rattus.
OX NCBI_TaxID=10116;
RN [1]
RP SEQUENCE FROM N.A.
RX MEDLINE=99200391; PubMed=10102268;
RA Brose K., Bland K.S., Wang K.H., Arnott D., Henzel W., Goodman C.S.,
RA Tessier-Lavigne M., Kidd T.;
RT "Slit proteins bind Robo receptors and have an evolutionarily
RT conserved role in repulsive axon guidance.";
RL Cell 96:795-806(1999).
DR EMBL; AF182037; AAF04558.1; -.
DR HSSP; P56276; 1TLK.
DR INTERPRO; IPR001547; -.
DR INTERPRO; IPR001777; -.
DR INTERPRO; IPR003006; -.
DR PFAM; PF00041; fn3; 3.
DR PFAM; PF00047; ig; 5.
DR PRINTS; PR00014; FNTYPEIII.
DR PROSITE; PS00659; GLYCOSYL_HYDROL_F5; UNKNOWN_1.
KW Receptor.
FT NON_TER 1060 1060
SQ SEQUENCE 1060 AA; 116790 MW; C4BC8C11E8542DA4 CRC64;



Query Match 18.1%; Score 1318.5; DB 11; Length 1060; 

Best Local Similarity 30.8%; Pred. No. 2.3e-84;

Matches 356; Conservative 180; Mismatches 427; Indels 191; Gaps 39; 

Qy 4 PRI I EHPMDTTVPKNDPFTFNCQAEGNPTPT I QWFKDGRELKTDTGSHRIM 54 

h M I : I I I I I : M I III II : I I : 

Db 31 PKXVEQPSEVIVSKGKPNTPNWKQKGRPFPTIGKVQRMVKPGWDKTKDDSKVTQG— CL 87 

Qy 55 LPAGGLFFLKVIHSRR-ESDAGTYWCEAKNEFGVARSRNATLQVAVLRDEFRLEPANTRV 113 

11:1 IMI:::I II : I III I hi I I 1 1 1 1 : 1 : 1 1 : 1 1 1 : 1 1 I : I 
Db 88 LPSGSLFFLRIVHGRRSKPDEGTYVCVARNYLGEAVSRNASLEVALLRDDFRQNPTDVW 147 

Qy 114 AQGEVALMECGAPRGSPEPQISWRKNGQTLNLVGNKRIRI-VDGGNLAIQEARQSDDGRY 172 

I II h:|l III III I 1:1: :: I lh: II I I |:|| I I 



Db 148 AAGEPAILECQPPRGHPEPTIYWKKDKVRID---EKEERISIRGGKLMISNTRKSDAGMY 204 

Qy 173 QCWKNWGTRESATAFLKVWRPFLIRGPQNQTAWGSSWFQCRIGGDPLPDVLWRRT 232 

II hll 1:1 III II :l I II : |:|:: III I I !:: 

Db 205 TCVGTNMVGERDSDPAELTVFERPTFLRRPINQWLEDEPAEFRCQVQGDPQPTVRWKK- 263 

Qy 233 ASGGNMPLRKFSWLHSASGRVHVLEDRSULDDVTLEDMGEYTCEADNAVGGITATGILT 292 

"I . II : :l :!:: I I I I |:| II : I: II 
Db 264 -DDADLP— - RGRYDIKDDYTLRIKKAISADEGTYVCIAENRVGKVEASATLT 312 

Qy 293 VHAPPKFVIRPKNQLVEIGDEVLFECQANGHPRPTLYWSVEGNSSLLLPGY-RDGRMEVT 351 

Db 313 vdipPQllvRPR^ CS 37" 

Qy 352 LTPEGRSVLSIARFAREDSGKWTCNALNAVGSVSSRTWSV- DTQFELPPPI IEQGPVN 410

::! I hi I |:| : I II II: :: : I I : Mill IIIM 
Db 373 VSPTGD--LTITNIQRSDAGYYI-CQALTVAGSILAKAQLEVTDVLTDRPPPIILQGPIN 429 

Qy 411 QTLPVKSIWLPCRTLGTPVPQVSWYLDGIPIDVQEHERRNLSDAGALTISDLQRHEDEG 470 

III I :| I: I |:| :|| :| : I : I I I I :| I I I 
Db 430 QTLAVDGTALLKCKATG - PLPVI SWLKEGFTF - LGRDPRATIQDQGTLQIKNL - RI SDTG 486 

Qy 471 LYTCVASNRNGKSSWSGYLRLDTPTNPNIKFFRAPELSTYPGPPGKPQMVEKGENSVTLS 530 

Mill:: :|::||| I : I : ; ; |||| |||: : MMMI 
Db 487 TYTCVATSSS3ETSWSAVLDV- - -TESGATISKNYDTNDLPGPPSKPQVTDVTKNSVTLS 543 

Qy 531 WTRSNKVGGSSLVGYVIEMFGKNETDGWVAVGTRVQNTTFTQTGLLPGVNYFFLIRAENS 590 

I :: I MM I :: :: I I I: I :| II I I I : : 1 1 I 
Db 544 W-QTGTPGVLPASAYIIEAFSQSVSNSWQTVANHVKTTLYTVRGLRPNTIYLFMVRAINP 602 

Qy 591 HGLSLPSPMSEPI-TVGTRYFNSGLDLSEARASLLSGDV-VELSNASWDSTSMKLTWQI 648 

III llllhh I hi :: I III I I II: |::::M : 

Db 603 QGLSDPSPMSDPVRTQDISPPAQGVDHRQVQKEL-GDVTVRLHNPWLTPTTVQVTWTV 660 

Qy 649 -INGKYVEGFYVYARQLPNPIVNNPAPVTSNTNPLLGSTSTSASASASASALISTKPNIA 707 

::::|:M II hill : I : |: : 

Db 661 DRQPQFIQGYRVMYRQ TSGLQAST VWQNLDAKVPTERSAV 700 

Qy 708 AAGKRDGETNQSGGGAPTPLNTKYRMLTILNGGGASSCTITGLVQYTLYEFFIVPFYKSV 767 

: I I • II : h: 

Db 701 LVNLKKGVT-' YEIKVRPYFNEF 721 

Qy 768 EGKPSNSRIARTLEDVPSEAPYGMEALLL---NSSAVFLKWKAPELKDRHGVLLNYHVIV 824 

M I h II h II I : I : ||::: : I I ::|:: I : 
Db 722 QGMDSESKTIRTTEEAPSAPPQSVTVLTVGSHNSTSISVSWDPPPADHQNGIIQEYKIWC 781 

Qy 825 RGIDTAHNFSRILTNVTIDAASPTLVLANLTEGVMYTVGVAAGNNAGVGPYCVPATLRLD 884 

I I :l I hll ::h I h I I III Mill I : : 
Db 782 LG NETRFHINKTVDATIRSWIGGLFPGIQYRVEVAASTSAGVGVKSEPQPIIIG 836 

Qy 885 PITKRLDPFINQRDHVNDVLTQPWFIILLGAILAVLMLSFGAMVFVKRKHMM-MK 938 

lh ': I : lh II II M |::: I :: Ml : 
Db 837 GRNEWITENNNSITEQ— ITDWKQPAFIAGIGGACWVILMGFSIWLYWRRKKRKGLS 893 

Qy 939 QSALNTMRGNHTSDVLKMPSLSARNGNGYWLDSSTGGMVWRPS -PG — GDSLEMQKDH 993 

h lh. lh: I || || 

Db 894 NYAVTFQRGD GGLMSNGSRPGLLNTGDP— NYPW 925 

Qy 994 IADYAPVCGAPGSPAGGGTSSGGSGGAGSGASGGDDI- - -HGGHGSERNQQRYVGE'YSN 1049 

Ml I I : : I : h I I h I I : III: 
Db 926 LADSKPATSLPVNNSNSGPNEIGNFGRG DVLPPVPGQGDKTATMLSDGAIYSS 978 

Qy 1050 I PTDYASVSSFGKAPSEYGRHGNASPAPYATSSIl- - -SPHQ QQQQQQPR 1096 

I I I ; I M lllh MM: I : 

Db 979 IDFTTKTTYNSSSQITQA TPYATTQILHSNSIHELAVDLPDPQWKSSV 1026 

Qy 1097 YQQRPVPGYGLQRP 1110 

I : : h ' I 
Db 1027 QQKSDLMGFAYSLP 1040 

RESULT 10
Q9Z2I4

ID Q9Z2I4 PRELIMINARY; PRT; 1344 AA.
AC Q9Z2I4;
DT 01-MAY-1999 (TrEMBLrel. 10, Created)
DT 01-MAY-1999 (TrEMBLrel. 10, Last sequence update)
DT 01-OCT-2000 (TrEMBLrel. 15, Last annotation update)
DE RIG-1 PROTEIN.
GN RBIG1.
OS Mus musculus (Mouse).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
OX NCBI_TaxID=10090;
RN [1]
RP SEQUENCE FROM N.A.
RA Yuan S.-S.F., Cox .A., Dasika G.K., Lee E.Y.-H.P.;
RL Submitted (APR-199I) to the EMBL/GenBank/DDBJ databases.
DR EMBL; AF060570; M 111628.1; -.



DR HSSP; P56276; 1TLK.
DR MGD; MGI: 1343102
DR INTERPRO; IPR00177
DR INTERPRO; IPR00300
DR PFAM; PF00041; fn3
DR PFAM; PF00047; ig,
bigl.
SQ SEQUENCE 1344 AA; 143439 MW; 8B0060341C49CFEA CRC64;



Query Match 17.3%; Score 1256; DB 11; Length 1344;

Best Local Similarity 27.2%; Pred. No. 8.5e-80;

Matches 378; Conservative 194; Mismatches 462; Indels 356; Gaps 

Qy 4 PRIIEHPMDTTVPKNDPFTFNCQAEGNPTPTIQWFKDGRELKT---DTGSHRIMLPAGGL 60

111:1 II I : :| I Mill I I l:|:hl : I I MMMIM I
Db 42 PRIVEQPPDLWSRGEPATLPCRAEGRPRPNIEWYKNGARVATAREDPRAHRLLLPSGAL 101

Qy 61 FFLKVIHSRR-ESDAGTYWCEAKNEFGVARSRNATLQVAVLRDEFRLEPANTRVAQGEVA 119

II :::| II I II I |:| I I 1 1 1 1 M M 1 1 1 1 1 M I II Mill
Db 102 FFPRIVHGRRSRPDEGVYTCVARNYLGAAASRNASLEVAVLRDDFRQSPGNWVAVGEPA 161

Qy 120 LMECGAPRGSPEPQISWRKNGQTLNLVGNRRIRIVDGGNLAIQEARQSDDGRYQCWKNV 179

MM hi III ::M : I : : II I : :ll I I II M
Db 162 VMECVPPKGHPEPLVTWKKG-KIKLKEEEGRITIRGGKLMMSHTFKSDAGMYMCVASNM 219

Qy 180 VGTRESATAFLKVHVRPFLIRGPQNQTAWGSSVVFQCRIGGDPLPDVLWRRTASGGNMP 239

I III III II :l I II : : I I I : III M = IM I M
Db 220 AGERESGAAELWLERPSFLRRPINQWLADAPVNFLCEVQGDPQPNLHWRK • ■ DDGELP 277

Qy 240 LRKFSWLHSASGRVHVLEDRSLKLDDVTLEDMGEYTCEADNAVGGITATGILTVHAPPKF 299

:|| : III :| M II I III hhll hi Ml 11:1
Db 278 AGRYEIRSDHSLWIDQVSSEDEGTYTCVAENSVGRAEASGSLSVHVPPQF 327

Qy 300 VIRPKNQLVEIGDEVLFECQANGHPRPTLYWSVEGNSSLLLPGYRDGRMEVTLTPEGRSV 359

I MMI I I I 1:1: |:| I ::| II: III :l I II :

Db 328 VTKPQNQTVAPGANVSFQCETKGNPPPAIFWQKEGSQVLLFPSQ SLQPMGRLL 380



Qy 360 LSIARFAREDSGKWTCNALNAVGSVSSRTWSV-DTQFELPPPIIEQGPVNQT 412 

t |:| | | | I I:: ||: :: :: : : II III III 

Db 381 VSPRGQLNITF^SJGDGGYYV-CQAVSVAGSILAKALLEIKGASIDGLPPIILQGPANQT 439 

Qy 413 LPVKSIWLPCRTLGTPVPQVSW YLDGIPIDVQEHERRNLSDAGALTISDLQRHE 467 

I : I I Mil :l II : I :| I : : II II I M :| 
Db 440 LVLGSSVWLPCRVIGNPQPNIQWKKDERWLQG DDSQFNLMDNGTLHIASIQ-EM 492 

Qy 468 DEGLYTCVASNRNGKSSWSGYLRLDT PTNPNIKFFRAPELSTYPGPPGK 516 

I I Mill : M::M Ml MM Mil : 

Db 493 DMGFYSCVARSSIGEATWNSWLRKQEDWGASPGPATGPSNP PGPPSQ 539 

Qy 517 PQMVEKGENSVTLSWTRSNKVGGSSLVGYVIEMFGKNEIDGWVAVGTRVQNTTFTQTGLL 576 

I : I IMIMI : I M: MM I : : I .1 II Ml Ml 
Db 540 PIVTEVTANSITLTW-KPNPQSGATATSYVIEAFSQAAGNTWRTVADGVQLETYTISGLQ 598 

Qy 577 PGVNYFFLIRAENSHGLSLPSPMSEPITVGTRYFNSGLDLSEARASL LSGDV 628 

I I 11:1 : III IIMIIM MM: |: 

Db 599 PNTIYLFLVRAVGAWGLSEPSPVSEPVQT QDSSLSRPAEDPWKGQRGLAEVA 650 



Qy 629 VELSNASWDSTSMKLTWQIING - -KYVEGFYVYAR QLPNPIVNNPAP 674 

I : :M ::::M :M : MM II I M : : 

Db 651 VRMQEPTVLGPRTLQVSW-TVDGPVQLVQGFRVSWRIAGLDQGSWTMLDLQSP ■ -HKQST 707 

Qy 675 VTSNTNPLLGSTSTSASASASASALISTKPNIAAAGKRDGETNQSGGGAPTPLNTKYRML 734 

I I ' III: I I II M M 

Db 708 VLRGLPP— : GAQIQIKVQV QGQEGLGAESPFVTR — 739 

Qy 735 TILNGGGASSCTITGLVQYTLYEFFIVPFYKSVEGKPSNSRIARTLEDVPSEAPYGMEAL 794 

M M II I M 

Db 740 ; SIP EEAPSGPPQGVAVA 756 

Qy 795 L--LNSSAVFLKWKAPELKDRHGVLLNYHVIVRGIDTAHNFSRILTNVTIDAASPTLVLA 852 

I MM ; M I MM: | : | Mil: : :: : 
Db 757 LGGDRNSSVTVSWEPPLPSQRNGVITEYQIWCLG NESRFHLNRSAAGWARSVTFS 811 

Qy 853 NLTEGVMYTVGVAAGNNAGVGPYCVPATLRLD-PITKRLDPFINQ--RDHVNDVLTQPWF 909 

I I M ,111 Mill I :M I I =:: : : MM I 
Db 812 GLLPGQIYRALVAAATSAGVGVASAPVLVQLPFPPAAEPGPEVSEGLAERLAKVLRKPAF 871 

Qy 910 IILLGAILAVL^SFGAMVFVKRKHMMMKQSALNTMRGNHTSDVLKMPSLSARNGNGY-- 967 

: I Ml II :: :M :: M "h MM : I 
Db 872 LAGSSAACGALLLGFCAALYRRQK — QRKELS — HYTASFAYTPAVSFPHSEGLSG 923 

Qy 968 • — WLDSSTGGMVWRPSPGGDSLEMQKDHIADYAPVCGAPGSPAGGGT 1012 

II I MM: Ml 
Db 924 SSSRPPMGLGPAAYPWLADS WPHPPRSPSAQEPR GSCCPSNPDPDDRYY 972 



Qy 1013 SSGG SGGAGSGASG GDDIHGGHGSERNQQRYVGEYSNIPTDYA 1055 

: I 'I: III I::: II : I M 

Db 973 NEAGISLYLAQTARGANASGEGPVYSTIDPVGEELQTFHGG FPQHSSGDPSTWS 1026 

Qy 1056 EVSSFGKAPSEY GRHG NASPAPYATSSILSPHQQQQQ 1092 

: | H: I I I II : : I ::: 

Db 1027 QY APPEWSEGDSGARGGQGKLLGKPVQMPSLSWPEALPPPPPSCELSCPEGPEEE 1081 

Qy 1093 QQ - PRYQQRPVPGYGLQRPMHPHYQQQQH 1120 

: II I II 

Db 1082 LKGSSDLEEWCPPVPEKSHLVGSSSSGACMVAPAPRDTPSPTSSYG 1127 

Qy 1121 QQQQAQQTHQQHQALQQHQQLPPSNI • - YQQMSTTSEIYPTNTGPSRSVYSEQYYYPKDK 1178 

ill- I IMM II : Ml : I 

Db 1128 QQSTATLTPSPPDPPQ PPTDIPHLHQMPRRVPL GPSSPLSVSQPALSSHD 1177 

Qy 1179 QRHIHITENKLSNCH TYEAAPG AKQSSPISSQFASVRRQ 1217 

|: : : : I I MM : : M I M:: 

Db 1178 GRPVGLGAGPVLSYHASPSPVPSTASSAPGRTRQVTGEMTPPLHGHRARIRKKPKALPYR 1237 



Qy 1218 QLPP 1221 

MM 

Db 1238 REHSPGDLPP 1247 



RESULT 11 
Q9VQ07 



ID Q9VQ07 PRELIMINARY; PRT; 232 AA.
AC Q9VQ07;
DT 01-MAY-2000 (TrEMBLrel. 13, Created)
DT 01-MAY-2000 (TrEMBLrel. 13, Last sequence update)
DT 01-MAY-2000 (TrEMBLrel. 13, Last annotation update)
DE CG5574 PROTEIN.
GN CG5574.
OS Drosophila melanogaster (Fruit fly).
OC Eukaryota; Metazoa; Arthropoda; Tracheata; Hexapoda; Insecta;
OC Pterygota; Neoptera; Endopterygota; Diptera; Brachycera; Muscomorpha;
OC Ephydroidea; Drosophilidae; Drosophila.
OX NCBI_TaxID=7227;
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=BERKELEY;
RX MEDLINE=20196006; PubMed=10731132;
RA Adams M.D., Celniker S.E., Holt R.A., Evans C.A., Gocayne J.D.,
RA Amanatides P.G., Scherer S.E., Li P.W., Hoskins R.A., Galle R.F.,
RA George R.A., Lewis S.E., Richards S., Ashburner M., Henderson S.N.,
RA Sutton G.G., Wortman J.R., Yandell M.D., Zhang Q., Chen L.X.,
RA Brandon R.C., Rogers Y.-H.C., Blazej R.G., Champe M., Pfeiffer B.D.,
RA Wan K.H., Doyle C., Baxter E.G., Helt G., Nelson C.R., Miklos G.L.G.,
RA Abril J.F., Agbayani A., An H.-J., Andrews-Pfannkoch C., Baldwin D.,
RA Ballew R.M., Basu A., Baxendale J., Bayraktaroglu L., Beasley E.M.,
RA Beeson K.Y., Benos P.V., Berman B.P., Bhandari D., Bolshakov S.,
RA Borkova D., Botchan M.R., Bouck J., Brokstein P., Brottier P.,
RA Burtis K.C., Busam D.A., Butler H., Cadieu E., Center A., Chandra I.,
RA Cherry J.M., Cawley S., Dahlke C., Davenport L.B., Davies P.,
RA de Pablos B., Delcher A., Deng Z., Mays A.D., Dew I., Dietz S.M.,
RA Dodson K., Doup L.E., Downes M., Dugan-Rocha S., Dunkov B.C., Dunn P.,
RA Durbin K.J., Evangelista C.C., Ferraz C., Ferriera S., Fleischmann W.,
RA Fosler C., Gabrielian A.E., Garg N.S., Gelbart W.M., Glasser K.,
RA Glodek A., Gong F., Gorrell J.H., Gu Z., Guan P., Harris M.,
RA Harris N.L., Harvey D., Heiman T.J., Hernandez J.R., Houck J.,
RA Hostin D., Houston K.A., Howland T.J., Wei M.-H., Ibegwam C.,
RA Jalali M., Kalush F., Karpen G.H., Ke Z., Kennison J.A., Ketchum K.A.,
RA Kimmel B.E., Kodira C.D., Kraft C., Kravitz S., Kulp D., Lai Z.,
RA Lasko P., Lei Y., Levitsky A.A., Li J., Li Z., Liang Y., Lin X.,
RA Liu X., Mattei B., Mcintosh T.C., McLeod M.P., McPherson D.,
RA Merkulov G., Milshina N.V., Mobarry C., Morris J., Moshrefi A.,
RA Mount S.M., Moy M., Murphy B., Murphy L., Muzny D.M., Nelson D.L.,
RA Nelson D.R., Nelson K.A., Nixon K., Nusskern D.R., Pacleb J.M.,
RA Palazzolo M., Pittman G.S., Pan S., Pollard J., Puri V., Reese M.G.,
RA Reinert K., Remington K., Saunders R.D.C., Scheeler F., Shen H.,
RA Shue B.C., Siden-Kiamos I., Simpson M., Skupski M.P., Smith T.,
RA Spier E., Spradling A.C., Stapleton M., Strong R., Sun E.,
RA Svirskas R., Tector C., Turner R., Venter E., Wang A.H., Wang X.,
RA Wang Z.-Y., Wassarman D.A., Weinstock G.M., Weissenbach J.,
RA Williams S.M., Woodage T., Worley K.C., Wu D., Yang S., Yao Q.A.,
RA Ye J., Yeh R.-F., Zaveri J.S., Zhan M., Zhang G., Zhao Q., Zheng L.,
RA Zheng X.H., Zhong F.N., Zhong W., Zhou X., Zhu S., Zhu X., Smith H.O.,
RA Gibbs R.A., Myers E.W., Rubin G.M., Venter J.C.;
RT "The genome sequence of Drosophila melanogaster.";
RL Science 287:2185-2195(2000).
DR EMBL; AE003586; AAF51376.1; -.
DR FLYBASE; FBgn0031338; CG5574.

SQ SEQUENCE 232 AA; 25580 MW; 8EB530901DEC4EDA CRC64; 



Query Match 16.9%; Score 1230; DB 5; Length 232;

Best Local Similarity 100.0%; Pred. No. 3.8e-79;

Matches 232; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy 1150 MSTTSEIYPTNTGPSRSVYSEQYYYPKDKQRHIHITENKLSNCHTYEAAPGAKQSSPISS 1209
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1 MSTTSEIYPTNTGPSRSVYSEQYYYPKDKQRHIHITENKLSNCHTYEAAPGAKQSSPISS 60

Qy 1210 QFASVRRQQLPPNCSIGRESARFKVLNTDQGKNQQNLLDLDGSSMCYNGLADSGCGGSPS 1269
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   61 QFASVRRQQLPPNCSIGRESARFKVLNTDQGKNQQNLLDLDGSSMCYNGLADSGCGGSPS 120

Qy 1270 PMAMLMSHEDEHALYHTADGDLDDMERLYVKVDEQQPPQQQQQLIPLVPQHPAEGHLQSW 1329
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  121 PMAMLMSHEDEHALYHTADGDLDDMERLYVKVDEQQPPQQQQQLIPLVPQHPAEGHLQSW 180

Qy 1330 RNQSTRSSRKNGQECIKEPSELIYAPGSVASERSLLSNSGSGTSSQPAGHNV 1381
        ||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  181 RNQSTRSSRKNGQECIKEPSELIYAPGSVASERSLLSNSGSGTSSQPAGHNV 232



RESULT 12 
Q9VQ09 

ID Q9VQ09 PRELIMINARY; PRT; 166 AA. 
AC Q9VQ09; 

DT 01-MAY-2000 (TrEMBLrel. 13, Created) 

DT 01-MAY-2000 (TrEMBLrel. 13, Last sequence update) 

DT 01-JUN-2000 (TrEMBLrel. 14, Last annotation update) 

DE CG14348 PROTEIN. 

GN CG14348. 

OS Drosophila melanogaster (Fruit fly).

OC Eukaryota; Metazoa; Arthropoda; Tracheata; Hexapoda; Insecta;

OC Pterygota; Neoptera; Endopterygota; Diptera; Brachycera; Muscomorpha;

OC Ephydroidea; Drosophilidae; Drosophila.

OX NCBI_TaxID=7227;

RN [1]

RP SEQUENCE FROM N.A.

RC STRAIN=BERKELEY;

RX MEDLINE=20196006; PubMed=10731132;

RA Adams M.D., Celniker S.E., Holt R.A., Evans C.A., Gocayne J.D., 

RA Amanatides P.G., Scherer S.E., Li P.W., Hoskins R.A., Galle R.F.,
RA George R.A., Lewis S.E., Richards S., Ashburner M., Henderson S.N.,
RA Sutton G.G., Wortman J.R., Yandell M.D., Zhang Q., Chen L.X.,
RA Brandon R.C., Rogers Y.-H.C., Blazej R.G., Champe M., Pfeiffer B.D.,
RA Wan K.H., Doyle C., Baxter E.G., Helt G., Nelson C.R., Miklos G.L.G.,
RA Abril J.F., Agbayani A., An H.-J., Andrews-Pfannkoch C., Baldwin D.,
RA Ballew R.M., Basu A., Baxendale J., Bayraktaroglu L., Beasley E.M.,
RA Beeson K.Y., Benos P.V., Berman B.P., Bhandari D., Bolshakov S.,
RA Borkova D., Botchan M.R., Bouck J., Brokstein P., Brottier P.,
RA Burtis K.C., Busam D.A., Butler H., Cadieu E., Center A., Chandra I.,
RA Cherry J.M., Cawley S., Dahlke C., Davenport L.B., Davies P.,
RA de Pablos B., Delcher A., Deng Z., Mays A.D., Dew I., Dietz S.M.,
RA Dodson K., Doup L.E., Downes M., Dugan-Rocha S., Dunkov B.C., Dunn P.,
RA Durbin K.J., Evangelista C.C., Ferraz C., Ferriera S., Fleischmann W.,
RA Fosler C., Gabrielian A.E., Garg N.S., Gelbart W.M., Glasser K.,
RA Glodek A., Gong F., Gorrell J.H., Gu Z., Guan P., Harris M.,
RA Harris N.L., Harvey D., Heiman T.J., Hernandez J.R., Houck J.,
RA Hostin D., Houston K.A., Howland T.J., Wei M.-H., Ibegwam C.,
RA Jalali M., Kalush F., Karpen G.H., Ke Z., Kennison J.A., Ketchum K.A.,
RA Kimmel B.E., Kodira C.D., Kraft C., Kravitz S., Kulp D., Lai Z.,
RA Lasko P., Lei Y., Levitsky A.A., Li J., Li Z., Liang Y., Lin X.,
RA Liu X., Mattei B., Mcintosh T.C., McLeod M.P., McPherson D.,
RA Merkulov G., Milshina N.V., Mobarry C., Morris J., Moshrefi A.,
RA Mount S.M., Moy M., Murphy B., Murphy L., Muzny D.M., Nelson D.L.,
RA Nelson D.R., Nelson K.A., Nixon K., Nusskern D.R., Pacleb J.M.,
RA Palazzolo M., Pittman G.S., Pan S., Pollard J., Puri V., Reese M.G.,
RA Reinert K., Remington K., Saunders R.D.C., Scheeler F., Shen H.,
RA Shue B.C., Siden-Kiamos I., Simpson M., Skupski M.P., Smith T.,
RA Spier E., Spradling A.C., Stapleton M., Strong R., Sun E.,
RA Svirskas R., Tector C., Turner R., Venter E., Wang A.H., Wang X.,
RA Wang Z.-Y., Wassarman D.A., Weinstock G.M., Weissenbach J.,
RA Williams S.M., Woodage T., Worley K.C., Wu D., Yang S., Yao Q.A.,
RA Ye J., Yeh R.-F., Zaveri J.S., Zhan M., Zhang G., Zhao Q., Zheng L.,
RA Zheng X.H., Zhong F.N., Zhong W., Zhou X., Zhu S., Zhu X., Smith H.O.,

RA Gibbs R.A., Myers E.W., Rubin G.M., Venter J.C.; 

RT "The genome sequence of Drosophila melanogaster."; 

RL Science 287:2185-2195(2000). 

DR EMBL; AE003586; AAF51374.1; -.

DR FLYBASE; FBgn0031340; CG14348.

DR INTERPRO; IPR001777; -.

DR PFAM; PF00041; fn3; 1.

SQ SEQUENCE 166 AA; 18353 MW; 5FDFD7163A17C217 CRC64; 



Query Match 10.8%; Score 783.5; DB 5; Length 166;

Best Local Similarity 96.9%; Pred. No. 7.8e-48;

Matches 156; Conservative 0; Mismatches 0; Indels 5; Gaps 1;

Qy 791 MEALLLNSSAVFLKWKAPELKDRHGVLLNYHVIVRGIDTAHNFSRILTNVTIDAASPTLV 850
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1 MEALLLNSSAVFLKWKAPELKDRHGVLLNYHVIVRGIDTAHNFSRILTNVTIDAASPTLV 60

Qy 851 LANLTEGVMYTVGVAAGNNAGVGPYCVPATLRLDPITKRLDPFINQR-----DHVNDVLT 905
       |||||||||||||||||||||||||||||||||||||||||||||||     ||||||||
Db  61 LANLTEGVMYTVGVAAGNNAGVGPYCVPATLRLDPITKRLDPFINQRYPINQDHVNDVLT 120

Qy 906 QPWFIILLGAILAVLMLSFGAMVFVKRKHMMMKQSALNTMR 946
       |||||||||||||||||||||||||||||||||||||||||
Db 121 QPWFIILLGAILAVLMLSFGAMVFVKRKHMMMKQSALNTMR 161



RESULT 13 
P91572 

ID P91572 PRELIMINARY; PRT; 423 AA. 
AC P91572; 



Mon Jan 22 13:04:29 2001 



us-09-540-245a-16.rspt 



Page 



DT 01-MAY-1997 (TrEMBLrel. 03, Created)
DT 01-MAY-1997 (TrEMBLrel. 03, Last sequence update)
DT 01-MAY-2000 (TrEMBLrel. 13, Last annotation update)
DE SIMILAR TO THE IMMUNOGLOBULIN SUPERFAMILY.
GN ZK377.3.
OS Caenorhabditis elegans.
OC Eukaryota; Metazoa; Nematoda; Chromadorea; Rhabditida; Rhabditoidea;
OC Rhabditidae; Peloderinae; Caenorhabditis.
OX NCBI_TaxID=6239;
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=BRISTOL N2;
RX MEDLINE=94150718; PubMed=7906398;
RA Wilson R., Ainscough R., Anderson K., Baynes C., Berks M.,
RA Bonfield J., Burton J., Connell M., Copsey T., Cooper J., Coulson A.,
RA Craxton M., Dear S., Du Z., Durbin R., Favello A., Fulton L.,
RA Gardner A., Green P., Hawkins T., Hillier L., Jier M., Johnston L.,
RA Jones M., Kershaw J., Kirsten J., Laister N., Latreille P.,
RA Lightning J., Lloyd C., Mcmurray A., Mortimore B., O'Callaghan M.,
RA Parsons J., Percy C., Rifken L., Roopra A., Saunders D., Shownkeen R.,
RA Smaldon N., Smith A., Sonnhammer E., Staden R., Sulston J.,
RA Thierry-Mieg J., Thomas K., Vaudin M., Vaughan K., Waterston R.,
RA Watson A., Weinstock L., Wilkinson-Sproat J., Wohldman P.;
RT "2.2 Mb of contiguous nucleotide sequence from chromosome III of C.
RT elegans.";
RL Nature 368:32-38(1994).
RN [2]
RP SEQUENCE FROM N.A.
RC STRAIN=BRISTOL N2;
RA Nhan M., Hawkins J.;
RL Submitted (FEB-1997) to the EMBL/GenBank/DDBJ databases.
RN [3]
RP SEQUENCE FROM N.A.
RC STRAIN=BRISTOL N2;
RA Waterston R.;
RL Submitted (FEB-1997) to the EMBL/GenBank/DDBJ databases.
RN [4]
RP SEQUENCE FROM N.A.
RC STRAIN=BRISTOL N2;
RA Waterston R.;
RL Submitted (APR-1997) to the EMBL/GenBank/DDBJ databases.
DR EMBL; U88183; AAB52658.1; -.
DR HSSP; P56276; 1TLK.
DR INTERPRO; IPR003006; -.
DR PFAM; PF00047; ig; 4.
SQ SEQUENCE 423 AA; 46544 MW; feB4530DB6BD575E5 CRC64;



Query Match 9.9%; Score 717; DB 5; Length 423;

Best Local Similarity 39.1%; Pred. No. 1.7e-42;

Matches 155; Conservative ; Mismatches 154; Indels 26;

Qy 4 PRIIEHPMDTTVPKNDPFTFNCQAEGNPTPTIQWFKDGREL---KTDTGSHRIMLPAGGL 60

I Nil!:! : Mill::! I hill: : I MM I I
Db 30 PVIIEHPIDWVSRGSPATLNCGAKPS-TAKITWYKDGQPVITNKEQVNSHRIVLDTGSL 88

Qy 61 FFLKVIHSR--RESDAGTYWCEAKNEFGVARSRNATLQVAVLRDEFRLEPANTRVAQGEV 118

I III : ::MII h! I II I :| : I : : 1 : 1 1 : : 1 1 : I : ||:
Db 89 FLLKVNSGKNGRDSDAGAYYCVASNEHGEVKSNEGSLKLAMLREDFRVRPRTVQALGGEM 148

Qy 119 ALMECGAPRGSPEPQISWRKNGQTLNLVGNKRIRIVDGGNLAIQEARQSDDGRYQCWKN 178

::M III III : I : I : III I :|| I III I

Db 149 AVLECSPPRGFPEPWSWRKDDKELRIQDMPRYTLHSDGNLIIDPVDRSDSGTYQCVANN 208



Qy 179 WGTRESATAFLKVHVRPFLIRGPQNQTAWGSSWFQCRIGGDPLPDVLWRRTASGGNM 238 

:ll I I I I I :l : I" I ll"l:l II: III I : hi I 
Db 209 MVGERVSNPARLSVFEKPKFEQEPKDMTVDVGAAVLFDCRVTGDPQPQITWKR-RNEPM 266 

Qy 239 PLRKFSWLHSASGRVHVLED-RSLKLDDVTLEDMGEYTCEADNAVGGITATGILTVHAPP 297 

I: I :: :| I I::: I I 1 1 1 I I I I : I : I I 1 1 1 

Db 267 PVT RAYIAKDNRGLRIERVQPSDEGEYVCYARNPAGTLEASAHLRVQAPP 316 

Qy 298 KFVIRPKNQLVEIGDEVLFECQANGHPRPTLYWSVEGNSSLLLPGY-RDGRMEVTLTPE 355 



I :l =1 I I III I I I :ll II II I I III :|: I 
Db 317 SFQTKPADQSVPAGGTATFECTLVGQPSPAYFWSKEGQQDLLFPSYVSADGRTKVSPT- 374

Qy 356 GRSVLSIARFAREDSGKWTCNALNAVGSVSSRTW 391

1:1 I I I I :|: II h :
Db 375 --GTLTIEEVRQVDEGAYV-CAGMNSAGSSLSKAAL 407



RESULT 14 
O01632

ID O01632 PRELIMINARY; PRT; 874 AA.
AC O01632;
DT 01-JUL-1997 (TrEMBLrel. 04, Created)
DT 01-JUL-1997 (TrEMBLrel. 04, Last sequence update)
DT 01-JUN-2000 (TrEMBLrel. 14, Last annotation update)
DE CODED FOR BY C. ELEGANS CDNA CEESC12R.
GN ZK377.2.
OS Caenorhabditis elegans.
OC Eukaryota; Metazoa; Nematoda; Chromadorea; Rhabditida; Rhabditoidea;
OC Rhabditidae; Peloderinae; Caenorhabditis.
OX NCBI_TaxID=6239;
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=BRISTOL N2;
RX MEDLINE=94150718; PubMed=7906398;
RA Wilson R., Ainscough R., Anderson K., Baynes C., Berks M.,
RA Bonfield J., Burton J., Connell M., Copsey T., Cooper J., Coulson A.,
RA Craxton M., Dear S., Du Z., Durbin R., Favello A., Fulton L.,
RA Gardner A., Green P., Hawkins T., Hillier L., Jier M., Johnston L.,
RA Jones M., Kershaw J., Kirsten J., Laister N., Latreille P.,
RA Lightning J., Lloyd C., Mcmurray A., Mortimore B., O'Callaghan M.,
RA Parsons J., Percy C., Rifken L., Roopra A., Saunders D., Shownkeen R.,
RA Smaldon N., Smith A., Sonnhammer E., Staden R., Sulston J.,
RA Thierry-Mieg J., Thomas K., Vaudin M., Vaughan K., Waterston R.,
RA Watson A., Weinstock L., Wilkinson-Sproat J., Wohldman P.;
RT "2.2 Mb of contiguous nucleotide sequence from chromosome III of C.
RT elegans.";
RL Nature 368:32-38(1994).
RN [2]
RP SEQUENCE FROM N.A.
RC STRAIN=BRISTOL N2;
RA Nhan M., Hawkins J.;
RL Submitted (FEB-1997) to the EMBL/GenBank/DDBJ databases.
RN [3]
RP SEQUENCE FROM N.A.
RC STRAIN=BRISTOL N2;
RA Waterston R.;
RL Submitted (FEB-1997) to the EMBL/GenBank/DDBJ databases.
RN [4]
RP SEQUENCE FROM N.A.
RC STRAIN=BRISTOL N2;
RA Waterston R.;
RL Submitted (APR-1997) to the EMBL/GenBank/DDBJ databases.
DR EMBL; U88183; AAB52657.1; -.
DR HSSP; P56276; 1TLK.
DR INTERPRO; IPR001777; -.
DR INTERPRO; IPR003006; -.
DR PFAM; PF00041; fn3; 3.
DR PFAM; PF00047; ig; 1.

DR PRINTS; PR00014; FNTYPEIII. 

SQ SEQUENCE 874 AA; 95861 MW; BC72270818D734C9 CRC64; 



Query Match 9.0%; Score 658; DB 5; Length 874; 

Best Local Similarity 22.9%; Pred. No. 7.6e-38; 

Matches 229; Conservative 154; Mismatches 329; Indels 288; Gaps 

Qy 400 PPPIIEQGPVNQTLPVKSIWLPCRTLGTPVPQVSWYLDGIPIDVQEHERRNLSDAGALT 459

III II I Mil I I Mil: I I I :|| Ihlll: ; I : |:|
Db 28 PPPTIEHGHQNQTLMVGSSAILPCQASGKPTPGISWLRDGLPIDITD-SRISQHSTGSLK 86

Qy 460 ISDLQRHEDEGLYTCVASNRNGKSSWSGYLRLDTPTNPNIKFFRAPELSTYPGPPGKPQM 519

hll:: I [Hlhl I :|:|:ll I :: h I :| I |: I :| I :| : 



Db 87 IADLKK-PDTGVYTCIAKNEDGESTWSASLTVEDHTS-NAQFVRMPDPSNFPSSPIQPII 144

Qy 520 VEKGENSVTLSWTRSNKVGGSSLVGYVIEMFGKNETDGWVAVGTRVQNTTPTQTGLLPGV 579 

I : I I I : I : Ihh : : I : I :| : II I 
Db 145 VNVTDTEVELHWNAPSTSGAGPITGYIIQYYSPDLGQTWFNIPDYVASTEYRIKGLKPSH 204 

Qy 580 NYFFLIRAENSHGLSLPSPMSEPITVGTRYFNSGL DLSEARASLLSGDWELSN 633 

:| !:lllll I: III: I I:: I I I 

Db 205 SYMFVIRAENEKGIGTPSVSSALVTTSKPAAQVALSDKNKMDMAIAEKRLTSEQLIKLEE 264 

Qy 634 ASWDSTSMKLTWQIIN-GKYVEGFYVYARQLPNPIVNNPAPVTSNTNPLLGSTSTSASA 692 

::|[:::| h : ::|:|: I I :| I : II I 

Db 265 VKTINSTAVRLFWRKRRLEELIDGYYIRWR 6PPRIHDHQYVNVTSPS— 311 

! 

Qy 693 SASASALISTKPNIAAAGKRDGETNQSGGGAPTPLNTKYRMLTILNGGGASSCTITGLVQ 752

: :: |:

Db 312 TENYWSNLMP 322

Qy 753 YTLYEFFIVPFYK---SVEGKPSNSRIARTLEDVPSEAPYGMEALLLNSSAVFLKWKAPE 809
:| ; |: I Nil Mill: :|| : : : lllh

Db 323 FTNYEFFVIPYHSGVHSIHGAPSNSMDVLTAEAPPSLPPEDVRIRMLNLTTLRISWKAPK 382

Qy 810 LKDRHGVLLNYHVIVRGIDTAHNFSRILTNVTIDAASPTLVLANLTEGVMYTVGVAAGNN 869 

:|:| : ::: I M :| hi : : :: I :| I: I : III :| 
Db 383 ADGINGILKGFQIVIVG- -QrtPNNNR- - - NITTNERAASVTLFHLVTGMTYKIRVAARSN 437 

Qy 870 AG VG PY | - CVPAT LRLDPI ThRL DPFIN — QRDHVNDVLTQPWFIILLGAILA 918 

III 5 :i I : I I : I: : II :|:: III 

Db 438 GGVGVS^GTSEVIMNQDTLEIKLAAQQENESFLYGLINKSHVP VIVIVAILI 489 

Qy 919 VLMLSFfiAMVFVK*; - - RKHMMMKQSALNTMRGNHTSDVLXMPS L 959 

: :: \\ : : 1 I : : : : : | | : 1 1 : | : : 
Db 490 IPWI IIAYCYWRNSRNSDGKDRSP IKINDGSVH -MASNNLWDVAQNPNQNPMYNTAGRM 548 

Qy 960 SARNGNGYWLDSST GGMVWRPSPGGDSLEMQKDHIADYAPVCGAPGS 1006 

: iM III 1:11 ;| II : I II: 

Db 549 TMNNRNGQ ALY SLT PNAQDFFNNCDD Y SGTMHRPG SEHHYHYAQLTGGPGN 599 

Qy 1007 PAGGGTSSGGSGGAG SGASGGDDI HGGHGSERNQQRYVGEYS NIPTDYAEVSSFGKAPS E 1066 

:|:| 

Db 600 AMSTF 604 

( 

Qy 1067 YGRHGNASPAPYATSSILSPHQQ QQQQQPRYQQRPVPGYGLQRPMHP HY 1115 

II : J 1:1111:::: :|| : : I III III 
Db 605 YG NQYHDDPSPYATTTLVLSNQQPAWLNDKMLRAPAMPT NPVP PEPPARYADHT 658 

m 1116 QQQQHQQQQAQQ THQQHQALQQHQQLPPSNI ■ YQQMSTTSEI YPTNTGPSRS 1166 

W ■ -\ I : I: I ::: I |: :: || |: 
Db 659 AGRRSRSSRASDGRGTLNGGLHHRTSGSQRSDSPPHTDVSYVQLHSSD GIGSSKE 713 

Qy '1167 VYSEQYYYPKDKQRHIHITENKL SN CHTYEAAPGA- -RQSSPISS 1209 

1:1 II II 11:1 : hi 

Db 714 RTGERRTPP NRTLMDFIPPPPSNPPPPGGHVYDTATRRQLNRGSTPRED 762 

Qy 1210 QFAS VRRQQLPPNCSIGRESARFKVLNTDQGKNQQNLLDLDGSSMCYNGL 1259 

: I I : I ::| : I I : ::| II I :| 

Db 763 T YDSVSDGAFARVDVNARPTSRNRN1GGRPLKGK - * RDDDSQRSSLMMDDDGGSSEADGE 820 

Qy 1260 ADSG CGGSPSPMAMLMSHEDEHALYHT 1286 

I Ml:! I: I I 

Db 821 NSEGDVPRGGVRKAVPRMGISASTLA HSCYGT 852 



RESULT 15 
Q62845 
ID Q62845 
AC $62845; 
DT 01-NOV-1996 (TrEMJJLrel. 01, Created) 
DT 01-NOV-1996 (TrEMBLrel. 01, Last sequence update) 
DT 01-MAY-2000 (TrEMBLrel. 13, Last annotation update) 

DE NEURAL CELL ADHESION PROTEIN BIG-2 PRECURSOR. 
GN BIG-2. 

OS Rattus norvegicus (Rat) , 



PRELIMINARY; PRT; 1026 AA. 



OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Rattus. 

OX NCBI_TaxID-10116; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN-WISTAR; TISSUE-BRAIN; 

RX MEDLINE-96025317; PubMed-8586965 

RA Yoshihara Y., Kawasaki M. , Tamada A., Nagata S., Kagamiyama H,, 

RA MoriK.; 

RT "Overlapping and differential expression of BIG-2, BIG-1, TAG-1, and 

RT F3: four members of an axon -associated cell adhesion molecule subgroup 

RT of the immunoglobulin superfamily."; 

RL J. Neurobiol. 28:51-69(1995). 

DR EMBL; 035371; AAC52262.1; -. 

DR HSSP; P20241; 1CFB. 

DR INTERPRO; IPR001777; -.

DR INTERPRO; IPR003006; -.

DR PFAM; PF00041; fn3; 4. 

DR PFAM; PF00047; ig; 6. 

DR PRINTS; PR00014; FNTYPEIII, 

KW Signal. 

FT SIGNAL 1 18 POTENTIAL.

FT CHAIN 19 1026 NEURAL CELL ADHESION PROTEIN BIG-2.



SQ SEQUENCE 1026 AA; 113393 MW; B60C0EEFA17BE3CC CRC64; 



Query Match 8.3%; Score 603; DB 11; Length 1026;

Best Local Similarity 23.4%; Pred. No. 7.5e-34; 

Matches 275; Conservative 164; Mismatches 386; Indels 348; Gaps 45; 

Qy 2 ENPRI IEHPMDTTVPKNDPFTFNCQAEGNPT PT IQWFKDG RELKTDTG - SHR IMLPAGGL 50 

: I : 1:1: I :|: :||| I |:| :| :: I I I : I I 
Db 30 QEPSHVMFPLDSEERK- --VKLSCEVKGNPKPHIRWRLNGTDV- -DIGMDFRYSWEGSL 84 

Qy 61 FFLPIHSRRESDAGTYWCEAKNEFGVARSRNATLQVAVLRDEFRLEPANT - RVAQGEVA 119 

: : : hill Mill II I II I I : h :| Mh 
Db 85 L - - • INNPNKTQDSGT YQC I ATNSFGT IVSREAKLQFAYL - ENFKTRTRSTVS VRRGQGM 140 

Qy 120 LMECGAPRGSPEPQISWRKNGQTLNLVGNKRIRIVDGGNLAIQEARQSDDGRYQCWKNV 179
1 :: III M :| M : hi : III I ; ::| MINI 

Db 141 VLLCGPPPHSGELSYAWIFN-EHPSYQDNRRFVSQETGNLYIAKVEKADVGNYTCV7TNT 199 

Qy 180 VGTRESATAFLKVHVRPFLIRG PQNQTAWGSSWFQCRIGGDPL 224 

M : : I ::| hi Ihl :| hh 

Db 200 VTSHQ VLGPPTPLILSNDGVMGEYEPKIEVQFPETVPAEKGSTVKLECFALGNPV 254 

Qy 225 PDVLWRRTASGGNMPLRKFSWLHSASGRVHVLEDRSLKLDDVTLEDMGEYTCEADNAVGG 284 

I :IM I j I: : : I :ll h: : III I I hh I 

Db 255 PTILWRR-ADG- -KPIARKARRHKSSG ILEIPNFQQEDAGSYECVAENSRGR 303 

Qy 285 ITATGILTVHAPPRFVIRPRNQLVEIGDEVLFECQANGHPRPTLYWSVEGNSSLLLPGYR 344 

II :l :| I :| : I : : I :lhlll hll I h I 
Db 304 NIAKGQVTFYAQPNWVQIINDIHVAMEESVFWECRANGRPRPTYRWLKNGD PLLT 358 

Qy 345 DGRMEVTLTPPGRSVLSIARFAREDSGKWTCNALNAVGSVSSRTWSVDTQFELPPPII 404 

I::: ' : hi hi : I II M : :|| I 
Db 359 RERIQIE— --QGTLNITIVNLSDAG-MYQCVAENKHGVIYASAELSV IA 403 

Qy 405 EQGPVNQTL- PVKSIWLPCRTLGTPVPQVSWYLDGIPIDVQEHERRNLSDAGA 457 

I ::ll * I lh h :l I :| II ::|:|| :h I 
Db 404 ESPDFSRTLLRRVT LVRVGGEWIECKPRASPRPVYTW ■ RKGREI • LRENERIT ISEDGN 461 

Qy 458 LTISDLQRHEDEGLYTCVASNRNGKSSWSGYLRLDTPTNPNI 499 

I I :: : 1 I llhhl I :| :| s : II : 
Db 462 LRIINVTR-SPAGSYTCIATNHFGTASSTGNVWRDPTKVMVPPSSMDVTVGESIVLPCQ 520 

Qy 500 1 499 

Db 521 VTHDHSLDIVFTWTFNGHLIDFDRDGDHFERVGGQDSAGDLMIRNIQLRHAGRYVCMVQT 580 

Qy 500 - - - KFFRAPELST Y PGP PGKPQMV - - -ERGENSVTLSWTRSNRVGGSSLVGYVIEM- - -F 550 

I I : Ml IM I : : MM I : Mh I 
Db 581 SVDKLSAAADL-IVRGPPGPPEAVTIDEITDTTAQLSW-RPGPDNHSPITMYVIQARTPF 638 
Qy 551 GKNETDGWVAVGT---RVQNTTFTQT--GLLPGVNYFFLIRAENSHGLSLPSPMSEPITV 605

: II IN I III I II I I I I I I I: Mil 
Db 639 SVGWQAVSTVPELVDGKTFTATVVGLNPWVEYEFRT VAANVIGIGEPSRPSE 690

Qy 606 GTRYFNSG L - DLSEARASLLSGDVVELSNASVVDSTSMKLTWQI I NGK---YVEG 656

I I ::: I I I II :l|: : II: II 

Db 691 -KRRTEEALPEVTPANVSGGGGSKSEL VITWETVPEELQNGRGFGYWA 738

Qy 657 FMARQLPNPIVNNPAPVTSNTNPLLGSTSTSASASASASALI— -STKPNIAAAGKR 712

I : : : : III II : I :| 

Db 739 FRPHGKMI WMLTVLASADASRYVFRNESVRP 769



Qy 713 DGETNQSGGGAPTPLNTRYRMLIILNGGGASSCTITGLVQYTLYEFFIVPFYKSVEGKPS 772

:: :| : I III 
Db 770 FSPFEVKVGVFNNKGEGPFS 789

Qy 773 NSRIARTLEDVPSEAPYGMEALLLNSSAVFLKWKAPELKDRHGVLLNYHV-IVRGIDTAH 831

: : : I: I:: I =1 I::: : : I :| hi I : I I II 
Db 790 PTTLVYSAEEEPTKPPASIFARSLSATDIEVFWASPIGKNR-GRIQGYEVKYWRHDDREE 848

Qy 832 NFSRILT--NVTIDAASPTLVLANLIEGVMYTVGVAAGNNAGVGPYCVPATLRLDPITKR 889

I :| I I I : : II :| : I I hll I |:: I:: 
Db 849 NARKIRTVGNQT STKITNLKGNALYHLSVKAYNSAGTG — PSSAAVNVTTRK 898

Qy 890 LDPF I NQRDHVNDVLTQPWF I ILLGAILAVLMLSPGAMVPV KRKHMMMKQS 940

I :|| I: : : ::|" : : I : :|| 

Db 899 PPP SQPPGNIIWNSSDSKIILNWDQVKALDNESEVKGYKVLYRWNRQS 946

Qy 941 ALNTMRGNHTSDVLKMPSLSARNGNGYWLDSSTGGMVWRPSPGGDSLEMQKDHIADYAPV 1000

: : : I II I :| :|:| : I 

Db 947 STSVI ETNKTSVELSLP FDEDYIIEIKPF 975



Qy 1001 CGAPGSPAGGGTSS —

I I hll I Mill 

Db 976 SDGGDGSSSEQIRIPKISNSYARGSGAS 1003



Search completed: January 22, 2001, 12:52:C 
Job time: 1921 sec 
GenCore version 4.5
Copyright (c) 1993 - 2000



OM protein - protein search, using sw model 



January 22, 2001, 12:17:58 ; Search time 233.01 Seconds
(without alignments)
190.332 Million cell updates/sec

Title:          US-09-540-245A-17
Perfect score:  6860
Sequence:       1 MYYLGFYHTHTHTHTYINFD TAQRFRSIPRNNGIVTQEQT 1297

Scoring table:  BLOSUM62
                Gapop 10.0 , Gapext 0.5



Searched: 268485 seqs, 34193795 residues

Total number of hits satisfying chosen parameters: 

Minimum DB seq length: 0 
Maximum DB seq length: 2000000000 

Post-processing: Minimum Match 0% 
Maximum Match 100% 
Listing first 45 summaries 



/SIDS1/gcgdata/geneseq/geneseqp/AA1980.DAT:*
/SIDS1/gcgdata/geneseq/geneseqp/AA1981.DAT:*
/SIDS1/gcgdata/geneseq/geneseqp/AA1982.DAT:*
/SIDS1/gcgdata/geneseq/geneseqp/AA1983.DAT:*
/SIDS1/gcgdata/geneseq/geneseqp/AA1984.DAT:*
/SIDS1/gcgdata/geneseq/geneseqp/AA1985.DAT:*
/SIDS1/gcgdata/geneseq/geneseqp/AA1986.DAT:*
/SIDS1/gcgdata/geneseq/geneseqp/AA1987.DAT:*
/SIDS1/gcgdata/geneseq/geneseqp/AA1988.DAT:*
/SIDS1/gcgdata/geneseq/geneseqp/AA1989.DAT:*
/SIDS1/gcgdata/geneseq/geneseqp/AA1990.DAT:*
/SIDS1/gcgdata/geneseq/geneseqp/AA1991.DAT:*
/SIDS1/gcgdata/geneseq/geneseqp/AA1992.DAT:*
/SIDS1/gcgdata/geneseq/geneseqp/AA1993.DAT:*
/SIDS1/gcgdata/geneseq/geneseqp/AA1994.DAT:*
/SIDS1/gcgdata/geneseq/geneseqp/AA1995.DAT:*
/SIDS1/gcgdata/geneseq/geneseqp/AA1996.DAT:*
/SIDS1/gcgdata/geneseq/geneseqp/AA1997.DAT:*
/SIDS1/gcgdata/geneseq/geneseqp/AA1998.DAT:*
/SIDS1/gcgdata/geneseq/geneseqp/AA1999.DAT:*
.DAT:* 



Pred. No. is the number of results predicted by chance to have a
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 



                Query
 No.    Score   Match  Length  DB  ID      Description
  1      6860   100.0    1297  20  Y13565  C. elegans Robo po
  2      6860   100.0    1297  20  Y08403  C. elegans ROBO pr
  3      1588    23.1    1395  20  Y13563  Drosophila Robo 1
  4      1588    23.1    1395  20  Y08401  Drosophila sp. ROB
  5    1500.5    21.9    1651  20  Y13566  Human Robo 1 polyp
  6    1489.5    21.7    1649  20  Y08404  Human ROBO1 protei
  7      1350    19.7    1380  20  Y08402  Drosophila sp. ROB
  8    1344.5    19.6    1381  20  Y13564  Drosophila Robo 2
  9      1266    18.5     753  20  W83927  Human T85 protein.
 10     600.5     8.8    1018  18  K06485  Rat contactin liga
 11       600     8.7    1571  19  K42087  Human Down syndrom
 12       598     8.7    1910  19  N42086  Human Down syndrom
 13       593     8.5    1018  15  R63759  Human contactin
 14       593     8.5    1018  17  R87028  Human contacti
 15     583.5     8.5    1257  20  W74152  Human L1 cell
 16       583     8.5    1028  19  W29667  Homo sapiens DL1°5
 17     580.5     8.5    1225  19  W52289  Homo sapiens cdo t
 18     561.5     8.2     434  20  Y13567  Human Robo 2 polyp
 19     561.5     8.2     434  20  Y08405  Human partial ROBO
 20       554     8.1    1447  16  R68553  Deleted in colore
 21       554     8.1    1447  20  Y33498  Human DCC protein.
 22       554     8.1    1728  12  R13144  Deleted in Colorec
 23     546.5     8.0    1242  19  W52287  Rattus norvegicus
 24               7.9    1192  19  W57900  Protein of clone C
 25               7.9    1299  21  Y40439  Human Nr-CAM prote
 26       526     7.7    1897  21  Y81785  Human protein tyro
 27       526     7.7    1897  21  Y56100  LAR tyrosine phosp
 28     524.5     7.5    1304  19  W59994  Human neural cell
 29     518.5     7.6    1251  19  W37778  Rattus norvegicus
 30     510.5     7.4    1496  20  W81030  Melanoma associate
 31     510.5     7.4    1496  21  Y70469  Human p53 target m
 32     508.5     7.4    4412  21  Y53666  Sequence gi/101742
 33     501.5     7.3    3117  21  Y53667  Sequence gi/332818
 34       495     7.2    1911  16  R71726  Human PTP-OB, Hom
 35       495     7.2    1911  18  W27225  Human protein tyro
 36       495     7.2    1911  20  W94027  Human protein tyro
 37                              19  W37779  Rattus norvegicus
 38       485     7.1    1125  19  W52288  Rattus norvegicus
 39       465     6.3    1501  16  R72858  Rat receptor type-
 40     441.5     6.4    1070  18  W08747  Human colon carcin
 41       439     6.4     761  17  R92255  Neural cell adhesi
 42       434     6.3    1853  21  Y53668  Protein 608 sequen
 43       434     6.3    2387  21  Y53665  Mechanical stress
 44       434     6.3    2597  21  Y53664  Mechanical stress
 45       432     6.3     848  21  Y88565  Human NCAM 140kD i

RESULT 1 
Y13565 

ID Y13565 standard; Protein; 1297 AA. 
XX 

AC Y13565; 
XX 

DT 30-JUL-1999 (first entry) 
XX 

DE C. elegans Robo polypeptde. 
XX 

KW Comm polypeptide; Robo polypeptide; commissureless; roundabout;
KW modulation; nerve cell function.
XX
OS Caenorhabditis elegans. 

XX 

PN WO9925833-A1.
XX 

PD 27-MAY-1999. 
XX 

PF 13-NOV-1998; 98WO-US24327.
XX 

PR 14-NOV-1997; 97US-0065543.



C, Tear G; 



DR WPI; 1999-338008/28.
DR N-PSDB; X55769.



CC The invention relates to a method for modulating the amount of Com 
CC (commissureless) polypeptide in contact with a cell expressing active

CC Robo (roundabout) on its surface. The method comprises modulating the 

CC effective amount of Comm polypeptide in contact with the cell, where the 

CC amount of expressed active Robo is specifically modulated inversely with

CC the modulation of the effective amount of Comm in contact with the cell.

CC The method is used to modulate the amount of active Robo expressed on a 

CC cell. The method can be used to screen for agents that modulate Robo:Comm

CC interactions. This is particularly useful for modulating nerve cell 

CC function. 

XX 

SQ Sequence 1297 AA; 



Query Match 100.0%; Score 6860; DB 20; Length 1297; 

Best Local Similarity 100.0%; Pred. No. 0; 

Matches 1297; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy 1 MYYLGFYHTHTHTHTYINFDKIPNASNLAPVIIEHPIDVVVSRGSPATLNCGAKPSTAKI 60
Db 1 myylgfyhthththtyinfdkipnasnlapviiehpidvvvsrgspatlncgakpstaki 60

Qy 61 TWYKDGQPVITNKEQVNSHRIVLDTGSLFLLKVNSGKNGKDSDAGAYYCVASNEHGEVKS 120
Db 61 twykdgqpvitnkeqvnshrivldtgslfllkvnsgkngkdsdagayycvasnehgevks 120

Qy 121 NEGSLKLAMLREDFRVRPRTVQALGGEMAVLECSPPRGFPEPVVSWRKDDKELRIQDMPR 180
Db 121 negslklamlredfrvrprtvqalggemavlecspprgfpepvvswrkddkelriqdmpr 180

Qy 181 YTLHSDGNLIIDPVDRSDSGTYQCVANNMVGERVSNPARLSVFEKPKFEQEPKDMTVDVG 240
Db 181 ytlhsdgnliidpvdrsdsgtyqcvannmvgervsnparlsvfekpkfeqepkdmtvdvg 240

Qy 241 AAVLFDCRVTGDPQPQITWKRKNEPMPVTRAYIAKDNRGLRIERVQPSDEGEYVCYARNP 300
Db 241 aavlfdcrvtgdpqpqitwkrknepmpvtrayiakdnrglriervqpsdegeyvcyarnp 300

Qy 301 AGTLEASAHLRVQAPPSFQTKPADQSVPAGGTATFECTLVGQPSPAYFWSKEGQQDLLFP 360
Db 301 agtleasahlrvqappsfqtkpadqsvpaggtatfectlvgqpspayfwskegqqdllfp 360

Qy 361 SYVSADGRTKVSPTGTLTIEEVRQVDEGAYVCAGMNSAGSSLSKAALKATFETKGRVQKK 420
Db 361 syvsadgrtkvsptgtltieevrqvdegayvcagmnsagsslskaalkatfetkgrvqkk 420

Qy 421 KSKMGKQKQKNVQSIIKYLISAVTGNTPAKPPPTIEHGHQNQTLMVGSSAILPCQASGKP 480
Db 421 kskmgkqkqknvqsiikylisavtgntpakppptiehghqnqtlmvgssailpcqasgkp 480

Qy 481 TPGISWLRDGLPIDITDSRISQHSTGSLHIADLKKPDTGVYTCIAKNEDGESTWSASLTV 540
Db 481 tpgiswlrdglpiditdsrisqhstgslhiadlkkpdtgvytciaknedgestwsasltv 540

Qy 541 EDHTSNAQFVRMPDPSNFPSSPTQPIIVNVTDTEVELHWNAPSTSGAGPITGYIIQYYSP 600
Db 541 edhtsnaqfvrmpdpsnfpssptqpiivnvtdtevelhwnapstsgagpitgyiiqyysp 600

Qy 601 DLGQTWFNIPDYVASTEYRIKGLKPSHSYMFVIRAENEKGIGTPSVSSALVTTSKPAAQV 660
Db 601 dlgqtwfnipdyvasteyrikglkpshsymfviraenekgigtpsvssalvttskpaaqv 660

Qy 661 ALSDKNKMDMAIAEKRLTSEQLIKLEEVKTINSTAVRLFWKKRKLEELIDGYYIKWRGPP 720
Db 661 alsdknkmdmaiaekrltseqlikleevktinstavrlfwkkrkleelidgyyikwrgpp 720

Qy 721 RTNDNQYVNVTSPSTENYVVSNLMPFTNYEFFVIPYHSGVHSIHGAPSNSMDVLTAEAPP 780
Db 721 rtndnqyvnvtspstenyvvsnlmpftnyeffvipyhsgvhsihgapsnsmdvltaeapp 780

Qy 781 SLPPEDVRIRMLNLTTLRISWKAPKADGINGILKGFQIVIVGQAPNNNRNITTNERAASV 840
Db 781 slppedvrirmlnlttlriswkapkadgingilkgfqivivgqapnnnrnittneraasv 840

Qy 841 TLFHLVTGMTYKIRVAARSNGGVGVSHGTSEVIMNQDTLEKHLAAQQENESFLYGLINKS 900
Db 841 tlfhlvtgmtykirvaarsnggvgvshgtsevimnqdtlekhlaaqqenesflyglinks 900

Qy 901 HVPVIVIVAILIIFVVIIIAYCYWRNSRNSDGKDRSFIKINDGSVHMASNNLWDVAQNPN 960
Db 901 hvpvivivailiifvviiiaycywrnsrnsdgkdrsfikindgsvhmasnnlwdvaqnpn 960

Qy 961 QNPMYNTAGRMTMNNRNGQALYSLTPNAQDFFNNCDDYSGTMHRPGSEHHYHYAQLTGGP 1020
Db 961 qnpmyntagrmtmnnrngqalysltpnaqdffnncddysgtmhrpgsehhyhyaqltggp 1020

Qy 1021 GNAMSTFYGNQYHDDPSPYATTTLVLSNQQPAWLNDKMLRAPAMPTNPVPPEPPARYADH 1080
Db 1021 gnamstfygnqyhddpspyatttlvlsnqqpawlndkmlrapamptnpvppepparyadh 1080

Qy 1081 TAGRRSRSSRASDGRGTLNGGLHHRTSGSQRSDSPPHTDVSYVQLHSSDGTGSSKERTGE 1140
Db 1081 tagrrsrssrasdgrgtlngglhhrtsgsqrsdspphtdvsyvqlhssdgtgsskertge 1140

Qy 1141 RRTPPNKTLMDFIPPPPSNPPPPGGHVYDTATRRQLNRGSTPREDTYDSVSDGAFARVDV 1200
Db 1141 rrtppnktlmdfippppsnppppgghvydtatrrqlnrgstpredtydsvsdgafarvdv 1200

Qy 1201 NARPTSRNRNLGGRPLKGKRDDDSQRSSLMMDDDGGSSEADGENSEGDVPRGGVRKAVPR 1260
Db 1201 narptsrnrnlggrplkgkrdddsqrsslmmdddggsseadgensegdvprggvrkavpr 1260

Qy 1261 MGISASTLAHSCYGTNGTAQRFRSIPRNNGIVTQEQT 1297
Db 1261 mgisastlahscygtngtaqrfrsiprnngivtqeqt 1297



RESULT 2 
Y08403 

ID Y08403 standard; Protein; 1297 AA. 
XX 

AC Y08403; 
XX 

DT 24-JUL-1999 (first entry) 
XX 

DE C. elegans E 
XX 
KW ROBO1; ROBO2; roundabout; nerve guidance; human; murine; cell function;
KW cell morphology; screening assay.
XX
OS Caenorhabditis elegans.
XX
PN WO9920764-A1.
XX
PD 29-APR-1999.
XX
PF 20-OCT-1998; 98WO-US22164.
XX
PR 14-NOV-1997; 97US-0971172.
PR 20-OCT-1997; 97US-0062921.
XX
PA (REGC ) UNIV CALIFORNIA.
XX
PI Goodman CS, Kidd T, Mitchell KJ, Tear G;
XX
DR WPI; 1999-312615/26.
DR N-PSDB; X57252.
XX
PT Robo polypeptides, a new immunoglobulin superfamily member
XX
PS Claim 1; Page 59-63; 80pp; English.
XX
CC This invention describes novel Robo (roundabout) polypeptides, involved
CC in nerve guidance which have been isolated from Drosophila sp.,
CC C. elegans, human and murine samples. The products of the invention can
CC be used to raise anti-Robo antibodies, which can be used to modulate cell
CC function or morphology. The Robo polynucleotides and fragments are useful
CC as probes and primers and for production of the Robo polypeptides. The
CC probes and primers are also useful in screening assays.
XX
SQ Sequence 1297 AA;



Query Match 100.0%; Score 6860; DB 20; Length 1297; 

Best Local Similarity 100.0%; Pred. No. 0; 

Matches 1297; Conservative 0; Mismatches 0; Indels 0; Gaps 0;



Qy 1 MYYLGFYHTHTHTHTYINFDKIPNASNLAPVIIEHPIDVVVSRGSPATLNCGAKPSTAKI 60
Db 1 myylgfyhthththtyinfdkipnasnlapviiehpidvvvsrgspatlncgakpstaki 60

Qy 61 TWYKDGQPVITNKEQVNSHRIVLDTGSLFLLKVNSGKNGKDSDAGAYYCVASNEHGEVKS 120
Db 61 twykdgqpvitnkeqvnshrivldtgslfllkvnsgkngkdsdagayycvasnehgevks 120

Qy 121 NEGSLKLAMLREDFRVRPRTVQALGGEMAVLECSPPRGFPEPVVSWRKDDKELRIQDMPR 180
Db 121 negslklamlredfrvrprtvqalggemavlecspprgfpepvvswrkddkelriqdmpr 180

Qy 181 YTLHSDGNLIIDPVDRSDSGTYQCVANNMVGERVSNPARLSVFEKPKFEQEPKDMTVDVG 240
Db 181 ytlhsdgnliidpvdrsdsgtyqcvannmvgervsnparlsvfekpkfeqepkdmtvdvg 240

Qy 241 AAVLFDCRVTGDPQPQITWKRKNEPMPVTRAYIAKDNRGLRIERVQPSDEGEYVCYARNP 300
Db 241 aavlfdcrvtgdpqpqitwkrknepmpvtrayiakdnrglriervqpsdegeyvcyarnp 300

Qy 301 AGTLEASAHLRVQAPPSFQTKPADQSVPAGGTATFECTLVGQPSPAYFWSKEGQQDLLFP 360
Db 301 agtleasahlrvqappsfqtkpadqsvpaggtatfectlvgqpspayfwskegqqdllfp 360

Qy 361 SYVSADGRTKVSPTGTLTIEEVRQVDEGAYVCAGMNSAGSSLSKAALKATFETKGRVQKK 420
Db 361 syvsadgrtkvsptgtltieevrqvdegayvcagmnsagsslskaalkatfetkgrvqkk 420

Qy 421 KSKMGKQKQKNVQSIIKYLISAVTGNTPAKPPPTIEHGHQNQTLMVGSSAILPCQASGKP 480
Db 421 kskmgkqkqknvqsiikylisavtgntpakppptiehghqnqtlmvgssailpcqasgkp 480

Qy 481 TPGISWLRDGLPIDITDSRISQHSTGSLHIADLKKPDTGVYTCIAKNEDGESTWSASLTV 540
Db 481 tpgiswlrdglpiditdsrisqhstgslhiadlkkpdtgvytciaknedgestwsasltv 540

Qy 541 EDHTSNAQFVRMPDPSNFPSSPTQPIIVNVTDTEVELHWNAPSTSGAGPITGYIIQYYSP 600
Db 541 edhtsnaqfvrmpdpsnfpssptqpiivnvtdtevelhwnapstsgagpitgyiiqyysp 600

Qy 601 DLGQTWFNIPDYVASTEYRIKGLKPSHSYMFVIRAENEKGIGTPSVSSALVTTSKPAAQV 660
Db 601 dlgqtwfnipdyvasteyrikglkpshsymfviraenekgigtpsvssalvttskpaaqv 660

Qy 661 ALSDKNKMDMAIAEKRLTSEQLIKLEEVKTINSTAVRLFWKKRKLEELIDGYYIKWRGPP 720
Db 661 alsdknkmdmaiaekrltseqlikleevktinstavrlfwkkrkleelidgyyikwrgpp 720

Qy 721 RTNDNQYVNVTSPSTENYVVSNLMPFTNYEFFVIPYHSGVHSIHGAPSNSMDVLTAEAPP 780
Db 721 rtndnqyvnvtspstenyvvsnlmpftnyeffvipyhsgvhsihgapsnsmdvltaeapp 780

Qy 781 SLPPEDVRIRMLNLTTLRISWKAPKADGINGILKGFQIVIVGQAPNNNRNITTNERAASV 840
Db 781 slppedvrirmlnlttlriswkapkadgingilkgfqivivgqapnnnrnittneraasv 840

Qy 841 TLFHLVTGMTYKIRVAARSNGGVGVSHGTSEVIMNQDTLEKHLAAQQENESFLYGLINKS 900
Db 841 tlfhlvtgmtykirvaarsnggvgvshgtsevimnqdtlekhlaaqqenesflyglinks 900

Qy 901 HVPVIVIVAILIIFVVIIIAYCYWRNSRNSDGKDRSFIKINDGSVHMASNNLWDVAQNPN 960
Db 901 hvpvivivailiifvviiiaycywrnsrnsdgkdrsfikindgsvhmasnnlwdvaqnpn 960

Qy 961 QNPMYNTAGRMTMNNRNGQALYSLTPNAQDFFNNCDDYSGTMHRPGSEHHYHYAQLTGGP 1020
Db 961 qnpmyntagrmtmnnrngqalysltpnaqdffnncddysgtmhrpgsehhyhyaqltggp 1020

Qy 1021 GNAMSTFYGNQYHDDPSPYATTTLVLSNQQPAWLNDKMLRAPAMPTNPVPPEPPARYADH 1080
Db 1021 gnamstfygnqyhddpspyatttlvlsnqqpawlndkmlrapamptnpvppepparyadh 1080

Qy 1081 TAGRRSRSSRASDGRGTLNGGLHHRTSGSQRSDSPPHTDVSYVQLHSSDGTGSSKERTGE 1140
Db 1081 tagrrsrssrasdgrgtlngglhhrtsgsqrsdspphtdvsyvqlhssdgtgsskertge 1140

Qy 1141 RRTPPNKTLMDFIPPPPSNPPPPGGHVYDTATRRQLNRGSTPREDTYDSVSDGAFARVDV 1200
Db 1141 rrtppnktlmdfippppsnppppgghvydtatrrqlnrgstpredtydsvsdgafarvdv 1200

Qy 1201 NARPTSRNRNLGGRPLKGKRDDDSQRSSLMMDDDGGSSEADGENSEGDVPRGGVRKAVPR 1260
Db 1201 narptsrnrnlggrplkgkrdddsqrsslmmdddggsseadgensegdvprggvrkavpr 1260

Qy 1261 MGISASTLAHSCYGTNGTAQRFRSIPRNNGIVTQEQT 1297
Db 1261 mgisastlahscygtngtaqrfrsiprnngivtqeqt 1297

RESULT 3
Y13563 

ID Y13563 standard; Protein; 1395 AA. 
XX 

AC Y13563; 
XX 

DT 30-JUL-1999 (first entry) 
XX 

DE Drosophila Robo 1 polypeptde. 

XX 

KW Comm polypeptide; Robo polypeptide; commissureless; roundabout;

KW modulation; nerve cell function. 

XX 

OS Drosophila sp.

XX 

PN WO9925833-A1.

PD 27-MAY-1999.
XX 

PF 13-NOV-1998; 98WO-US24327.
XX
PR 14-NOV-1997; 97US-0065543.
XX
PA (REGC ) UNIV CALIFORNIA.
XX
PI Goodman C, Kidd T, Mitchell KJ, Russell C, Tear G;
XX
DR WPI; 1999-338008/28.
DR N-PSDB; X55767.
XX

PT Modulation of Robo-Comm polypeptide interactions 

XX 

PS Disclosure; Page 30-33; 56pp; English. 
XX 

CC The invention relates to a method for modulating the amount of Comm 

CC (commissureless) polypeptide in contact with a cell expressing active

CC Robo (roundabout) on its surface. The method comprises modulating the 

CC effective amount of Comm polypeptide in contact with the cell, where the 

CC amount of expressed active Robo is specifically modulated inversely with 

CC the modulation of the effective amount of Comm in contact with the cell. 

CC The method is used to modulate the amount of active Robo expressed on a 

CC cell. The method can be used to screen for agents that modulate Robo:Comm 

CC interactions. This is particularly useful for modulating nerve cell

CC function. 
XX 

SQ Sequence 1395 AA; 



Best Available Copy 
Query Match 23.1%; Score 1588; DB 20; Length 1395;

Best Local Similarity 31.0%; Pred. No. 2.1e-89;

Matches 421; Conservative 195; Mismatches 527; Indels 216; Gaps 41; 

Qy 29 APVIIEHPIDVVVSRGSPATLNC--GAKPSTAKITWYKDGQPVITNKEQVNSHRIVLDTG 86

M IMI IMI : MM || I MUM ||::: |||: I 
Db 55 spriiehptdlv^knepatlnckvegkpept-iewfkdgepvstnelck"Shrvqfkd5 111 

Qy 87 SLFLLKVNSGKNGKDSDAGAYYCVASNEHGEVKSNEGSLKLAMLREDFRVRPRTVQALGG 146

:H : II I: I I hill I I: I ll::|:||:|||| |: : I 
Db 112 alffyrtmqgk-keqdggeywcvaknrvgqavsrhaslqiavlrddfrvepkdtrvakg 169 

Qy 147 EMAVLECSPPRGFPEPVVSWRKDD------KELRIQDMPRYTLHSDGNLIIDPVDRSDSG 200

I hill 1:1 III : I II I : I : 1 1 1 : 1 |: I I 
Db 170 etallecgppkgipeptliwikdgvplddlkamsfgassrvrivdggnllisnvepideg 229 

Qy 201 TYQCVANNMVGERVSNPARLSVFEKPKFEQEPKDMTVDVGAAVLFDCRVTGDPQPQITWK 260

1:1: 1:1 I |: |:| I I I :|||| : | IMI |:: II 
Db 230 nykciaqnlvgtressyaklivqvkpyfmkepkdqvmlygqtatfhcsvggdpppkvlwk 289 

Qy 261 RKNEPMPVTRAYIAKDNRGLRIERVQPSDEGEYVCYARNPAGTLEASAHLRVQAPPSFQT 320
W :IM I I : I I : Ml! Ml MINI MM 

"b 290 keegnipvsrarilhdeksleisnitptdegtyvceahnnvgqisaraslivhappnftk 349 

Qy 321 KPADQSVPAGGTATFECTLVGQPSPAYFWSKEGQQDLLFPSYVSADGRTKVSPTGTLTIE 380

:M: II I I I I: I M 1 II |:||: |: || |: IN I 
Db 350 rpsnkkvglngwqlpcmasgnpppsvfwtkegvstlmfpn-sshgrqyvaadgtlqit 407 

Qy 381 EVRQVDEGAYVCAGMNSAGSSLSKAALKATFETKGRVQKKKSKMGKQKQKNVQSIIKYLI 440 

MM III III: : II : I: : 
Db 408 dvrqedegyyvcsafswdsstvrvflqvs 437 

Qy 441 SAVTGNTPAKPPPTIEHGHQNQTLMVGSSAILPCQASGKPTPGISWLRDGLPIDITDSRI 500 

: Ml I: I MM II I 111:1:1 M MM: :| 
Db 438 svderpppiiqigpanqtlpkgsvatlpcratgnpsprikwfhdghavq-agnry 491 

Qy 501 SQHSTGSLHIADLKKPDTGVYTCIAKNEDGESTWSASLTVEDHTSNAQFVRMPDPSNFPS 560

I II : II: 1:1 III I I IMMMI! Ml Ml M 
Db 492 siiqgsslrvddlqlsdsgtytctasgergetswaatltvekpgsts-lhraadpstypa 550 

Qy 561 SPTQPIIVNVTDTEVELHW--NAPSTSGAGPITGYIIQYYSPDLGQTWFNIPDYVASTEY 618 

I I :M: MM: III II MM! I I |: 

Db 551 ppgtpkvlnvsrtsislrwaksqekpgavgpiigytveyfspdlqtgwivaahrvgdtqv 610 

Qy 619 RIKGLKPSHSYMFVIRAENEKGIGTPSVSSALVTTSKPAAQVALSDKNKMDMAIAEKRLT 678 

I M I ll:|::MM Ml I MM: | :: :: | I 
Db 611 tisgltpgtsyvflvraentqgtsvpgglsnviktieadfdaasan — dlsaartllt 666 

Qy 679 SEQLIKLEEVKTINSTAVRLFWKKRKL--EELIDGYYIKWRGPPRTNDNQY--VNVTSPS 734
I : :M : IMMMI I |: "I I :: II : I I 

Db 667 gks-velidasainasavrlewmlhvsadekyveglrihyk-dasvpsaqyhsitvmdas 724

Qy 735 TENYVVSNLMPFTNYEFFVIPYHSGVHSIHGAPSNSMDVLTAEAPPSLPPEDVRIRMLNL 794

:M II : MM: |: : I MM II I II IMM I I 
Db 725 aesfvvgnlkkytkyeffltpf---fetiegqpsnsktaltyedvpsappdniqigmynq 781 

Qy 795 TTLRISWKAPKADGINGILKGFQI-VIVGQAPNNNRNITTNERAASVTLFHLVTGMTYKI 853 

I MM III M II Ml I II Ml II I : 
Db 782 tagwvrwtpppsqhhngnlygykievsagntmkvlanmtlnatttsvllnnlttgavysv 841 

Qy 854 RVAARSNGGVGVSHGTSEVIMNQDTLERHLA AQQENESFLYGLINKSHVP ■ 903 

I: : : I I : |: I I M : I I ::| 

Db 842 rlnsftkagdgpyskpislfmd-pthhvhpprahpsgthdgrhegqdltyh-nngnipp 898 

Qy 904 VIVIVAILIIFVVIIIAYCYWRNSRNSDGKDRSFIKIND 942

|:| : :|:: : |: |:: : ::| 
Db 899 gdiDptthkkttdylsgpwlravlvcivllvlvisaaismvyfkrkhqmtkelghlsvvsd 958 

Qy 943 GSVHMASNN — LWDVAQNPNQNPMYNTAGRMTMNNRNGQALYSLTPNAQDFFNNCDDY 998 

: : I II : : : II : : I I ::| Ml I 
Db 959 neitalninskeslw idhhrgwrtadtdkdsglseskllshvnssqsnynnsd-- 1011 



Qy 999 SGTMHRPGSEHHYHYAQLTGGPGNAMSTFYG-NQYHDDPSPYATTTLV 1045 

II II:: ::||| : IMMIMI :: 

Db 1012 ggt '--dyaev---dtrnlttfyncrkspdnptpyattmiigtsssetctktt 1058 

Qy 1046 1 LSNQQPA- -WLNDKMLRAPAMPTN PVPPE--PPARYAD 1079 

MM: I: II I I III M 
Db 1059 sisadkdsgthspysdafagqvpavpvvksnylqypvepinwseflppppehpppsstyg 1118 

Qy 1080 HTAG - - RRSRSSRASDGRG TLNGGLHHRTSGSQRS DSPPHTDVSY 1122 

: I M II I MM Ml : II I 

Db 1119 yaqgspessrkssksagsgistnqsilnasihssssggfsawgvspqyavacppenvysn 1178 

Qy 1123 VQLHSSDGTGSSKERTGERRTPPNKTLMDFIPPPPSNPPPPGGHVYDTATRRQ 1175 

Ml': M Ml I I II I: MM 

Db 1179 plsavaggtqnryqitptnqhppqlpay-fattgpggavpp-nhl-pfatqrhaaseyqa 1235 

Qy 1176 -LNRGSIPREDTYDS VSDGAFARVDVNA— RPTSRNRNL— 1211 

II : M I M : I I: III I : 

Db 1236 glnaarcaqsracascdalatpspmqppppvpvpegwyqpvhpnshpmhptssnhqiyqc 1295 

Qy 1212 GGRPLKGKRDDDSQRSSLMMDDDGGSSEADGENSEGDVP 1250

• .1 M : I ::: I I:: I : I 
Db 1296 ssecsdhsrssqshkrqlqleehgssakqrgghhrrrap 1334 



RESULT 4
Y08401

ID Y08401 standard; Protein; 1395 AA. 
XX 

AC Y08401; 

DT 24-JUL-1999 (first entry) 

XX 

DE Drosophila sp. ROBO1 protein.
XX
KW ROBO1; ROBO2; roundabout; nerve guidance; human; murine; cell function;
KW cell morphology; screening assay.
XX
OS Drosophila sp.
XX
PN WO9920764-A1.
XX
PD 29-APR-1999.
XX
PF 20-OCT-1998; 98WO-US22164.
XX
PR 14-NOV-1997; 97US-0971172.
PR 20-OCT-1997; 97US-0062921.
XX
PA (REGC ) UNIV CALIFORNIA.
XX
PI Goodman CS, Kidd T, Mitchell KJ, Tear G;
XX
DR WPI; 1999-312615/26.
DR N-PSDB; X57250.
XX
PT Robo polypeptides, a new immunoglobulin superfamily member
XX
PS Claim 1; Page 45-49; 80pp; English.
XX
CC This invention describes novel Robo (roundabout) polypeptides, involved
CC in nerve guidance which have been isolated from Drosophila sp.,
CC C. elegans, human and murine samples. The products of the invention can
CC be used to raise anti-Robo antibodies, which can be used to modulate cell
CC function or morphology. The Robo polynucleotides and fragments are useful
CC as probes and primers and for production of the Robo polypeptides. The
CC probes and primers are also useful in screening assays.
XX
SQ Sequence 1395 AA;



Query Match 23.1%; Score 1588; DB 20; Length 1395;
Best Local Similarity 31.0%; Pred. No. 2.1e-89;


Mon Jan 22 13:04:31 2001 
Matches 421; Conservative 195; Mismatches 527; Indels 216; Gaps 41;

Qy 29 APVIIEHPIDVVVSRGSPATLNC--GAKPSTAKITWYKDGQPVITNKEQVNSHRIVLDTG 86

:| lllll |:|| : MINI II I |:|||:|| ||::: III: I 
Db 55 spriiehptdlvvkknepaUnckvegkpept-iewfkdgepvstnekk--shrvqfkdg 111 

Qy 87 SLFLLKVNSGKNGKDSDAGAYYCVASNEHGEVKSNEGSLKLAMLREDFRVRPRTVQALGG 146

:M : 111:11 hill I |: I 1 1 : : 1 : 1 1 : 1 1 1 1 |: : I 
Db 112 alffyrtraqgk--keqdggeywcvaknrvgqavsrhaslqiavlrddfrvepkdtrvakg 169 

Qy 147 EMAVLECSPPRGFPEPVVSWRKDD------KELRIQDMPRYTLHSDGNLIIDPVDRSDSG 200

I hill I: Mhlll | : | : Hi: |: | I 
Db 170 etallecgppkgipeptliwikdgvplddlkamsfgassrvrivdggnllisnvepideg 229 

Qy 201 TYQCVANNMVGERVSNPARLSVFEKPKFEQEPKDMTVDVGAAVLFDCRVTGDPQPQITWK 260

hhl 1:11 I h 1:1 I II I MM : I I I I III h: II 
Db 230 nykciaqnlvgtressyaklivqvkpyfmkepkdqvmlygqtatfhcsvggdpppkvlwk 289 

Qy 261 RKNEPMPVTRAYIAKDNRGLRIERVQPSDEGEYVCYARNPAGTLEASAHLRVQAPPSFQT 320
:lhll I I : I I : hill III I! I : I I I I llhl 
Db 290 keegnipvsrarilhdeksleisnitptdegtyvceahnnvgqisaraslivhappnftk 349 

Qy 321 KPADQSVPAGGTATFECTLVGQPSPAYFWSKEGQQDLLFPSYVSADGRTKVSPTGTLTIE 380

:h:: II I I I h Ihlll hlh h II h III I 
Db 350. rpsnkkvglngwqlpcmasgnpppsvf wtkegvstlmf pn • • sshgrqyvaadgtlqit 407 

Qy 381 EVRQVDEGAYVCAGMNSAGSSLSKAALKATFETKGRVQKKKSKMGKQKQKNVQSIIKYLI 440

:|ll III IN: : || : |: : 
Db 408 dvrqedegyyvcsafswdsstvrvflqvs 437 

Qy 441 SAVTGNTPAKPPPTIEHGHQNQTLMVGSSAILPCQASGKPTPGISWLRDGLPIDITDSRI 500

: :lll h I Nil II I 1 1 1 : 1 : 1 hi III: :| 
Db 438 s vderppp 1 iqigpanqtlpkgs va tlpcratgnpspr ikwf hdghavq - agnr y 491 

Qy 501 SQHSTGSLHIADLKKPDTGVYTCIAKNEDGESTWSASLTVEDHTSNAQFVRMPDPSNFPS 560 

I II : II: hi III I I l|::hhllll I : I III :h 
Db 492 siiqgsslrvddlqlsdsgtytctasgergetswaatltvekpgsts-lhraadpstypa 550 

Qy 561 SPTQPIIVNVTDTEVELHW--NAPSTSGAGPITGYIIQYYSPDLGQTWFNIPDYVASTEY 618

I I ::lh I : I I : lllll ::|:|lll I I h 
Db 551 ppgtpkvlnvsrtsislrwaksqekpgavgpiigytveyfspdlqtgwivaahrvgdtqv 610 

Qy 619 RIKGLKPSHSYMFVIRAENEKGIGTPSVSSALVTTSKPAAQVALSDKNKMDMAIAEKRLT 678 

I II I lhh:|||l :|| II I :: I : I :: : |:: I II 
Db 611 tisgltpgtsyvflvraentqgisvpsglsnviktieadfdaasan— -dlsaartllt 666

Qy 679 SEQLIKLEEVKTINSTAVRLFWKKRKL--EELIDGYYIKWRGPPRTNDNQY--VNVTSPS 734

: 4I::|III I h "I I :: II : I I 

Db 667 gks-velidasainasavrlewmlhvsadekyveglrihyk-dasvpsaqyhsitvmdas 724

Qy 735 TENYVVSNLMPFTNYEFFVIPYHSGVHSIHGAPSNSMDVLTAEAPPSLPPEDVRIRMLNL 794

I : : I i II : I ' 1 1 1 1 : h :| I llll II I II lh:::| I I 
Db 725 aesfvvgnlkkytkyeffltpf---fetiegqpsnsktaltyedvpsappdDiqigmynq 781 

Qy 795 TTLRISWKAPKADGINGILKGFQI-VIVGQAPNNNRNITTNERAASVTLFHLVTGMTYKI 853 

I : I I : II I h:| II hi I II I :| II I : 
Db 782 tagwvrwtpppsqhhngnlygykievsagntmkvlannitlnatttsvllnnlttgavysv 841 

Qy 854 RVAARSNGGVGVSHGTSEVIMNQDTLEKHLA AQQENESFLYGL INKSHVP - 903 

h : : I I : h I I : I : I I ::| 

Db 842 r Ins f tkagdgpys kpi s 1 f md -pthhvhpprahpsgthdgrhegqdltyh- - nngaipp 898 

Qy 904 VIVIVAILIIFVVIIIAYCYWRNSRNSDGKDRSFIKIND 942

|:| : :|:: : |: h: : ::| 
Db 899 gdinptthkkttdylsgpwlmvlvcivllvlvisaaisnivyfkrkhqnitkelghlswsd 958 

Qy 943 GSVHMASHN LWDVAQNPNQNPMYNTAGRMTMNNRNGQALYSLTPNAQDFFNNCDDY 998 

: : I II : : : II : : I I ::| :|| I 
Db 959 neitalninskeslw idhhrgwrtadtdkdsglseskllshvnssqsnynnsd- 1011 

Qy 999 SGTMHRPGSEHHYHYAQLTGGPGNAMSTFYG-NQYHDDPSPYATTTLV 1045 

II II:: ::||| : hhlllll :: 

Db 1012 ggt dyaev---dtrnlttfyncrkspdnptpyattmiigtsssetctktt 1058 



Qy 1046 ' LSNQQPA- -WLNDKMLRAPAMPTN PVPPE--PPARYAD 1079 

: I 1 1 : hill I 1 1 1 1 1 : 
Db 1059 sisadkdsgthspysdafagqvpavpwXsnylqypvepinwseflppppehpppsstyg 1118 

Qy 1080 HTAG--RRSRSSRASDGRG TLNGGLHHRTSGSQRS DSPPHTDVSY 1122 

: I II I I I I II :| :|l : II I 

Db 1119 yaqgspessrkssksagsgistnqsilnasihssssggfsawgvspqyavacppenvysn 1178 

Qy 1123 VQLHSSDGTGSSKERTGERRTPPNKTLMDFIPPPPSNPPPPGGHVYDTATRRQ 1175

: II :". : I : II I I III: I hi 
Db 1179 plsavaggtqnryqitptnqhppqlpay-fattgpggavpp-nhl-pfatqrhaaseyqa 1235 

Qy 1176 - LNRGSTPREOT YDS VSDGAFARVDVNA— RPTSRNRNL— 111! 

II : :l I :| : I h III I : 

Db 1236 glnaarcaqsracnscdalatpspmqppppvpvpegwyqpvhpnshpmhptssnhqiyqc Y&1 

Qy 1212 GGRPLKGKRDDDSQRSSLMMDDDGGSSEADGENSEGDVP 1250 

I I : I ::: I h: I : I 
Db 1296 ssecsdhsrssqshkrqlqleehgssakqrgghhrrrap 1334 



RESULT 5
Y13566 

ID Y13566 standard; Protein; 1651 AA.
XX
AC Y13566;
XX
DT 30-JUL-1999 (first entry)
XX
DE Human Robo 1 polypeptde.
XX
KW Comm polypeptide; Robo polypeptide; commissureless; roundabout;
KW modulation; nerve cell function.
XX
OS Homo sapiens.
XX
PN WO9925833-A1.
XX
PF 13-NOV-1998; 98WO-US24327.
XX
PR 14-NOV-1997; 97US-0065543.
XX
PA (REGC ) UNIV CALIFORNIA.
XX
PI Goodman C, Kidd T, Mitchell KJ,
XX
DR WPI; 1999-338008/28.
DR N-PSDB; X55770.
XX
PT Modulation of Robo-Comm polypeptide interactions
XX
PS Disclosure; Page 44-48; 56pp; English.
XX
CC The invention relates to a method for modulating the amount of Comm
CC (commissureless) polypeptide in contact with a cell expressing active
CC Robo (roundabout) on its surface. The method comprises modulating the
CC effective amount of Comm polypeptide in contact with the cell, where the
CC amount of expressed active Robo is specifically modulated inversely with
CC the modulation of the effective amount of Comm in contact with the cell.
CC The method is used to modulate the amount of active Robo expressed on a
CC cell. The method can be used to screen for agents that modulate Robo:Comm
CC interactions. This is particularly useful for modulating nerve cell
CC function.
XX
SQ Sequence 1651 AA;



Query Match 21.9%; Score 1500.5; DB 20; Length 1651;

Best Local Similarity 32.1%; Pred. No. 6.5e-84;

Matches 408; Conservative 166; Mismatches 480; Indels 219; Gaps 36; 



Qy 30 PVIIEHPIDVVVSRGSPATLNCGAK-PSTAKITWYKDGQPVITNKEQVNSHRIVLDTGSL 88

I hill l::!h! IIMII I: I I III h I hi: lll::| HII 
Db 68 privehpsdiivskgepatlnckaegrptptfewykggenetdkddprshMlpsgsl 127 

Qy 89 FLLKVNSGKNGKDSDAGAYYCVASNEHGEVKSNEGSLKLAMLREDFRVRPRTVQALGGEM 148

I I" I: : Ml III I II l|: 1 1 ::| :| I :| 1 1 I I II 
Db 128 fflrivhgrksr-pdegvyvcvarnylgeavahn2slevailrddfrqnpsdvmvavgep 186 

Qy 149 AVLECSPPRGFPEPVVSWRKDDKELRIQDMPRYTLHSDGNLIIDPVDRSDSGTYQCVANN 208

11:11 Mil III :lhll I :| I |: I |:| :||:| I II I 
Db 187 avinecqpprghpeptiswkkdgsplddkd-eritirg-gklmltytrksdagkyvcvgtn 244 

Qy 209 MVGERVSNPARLSVFEKPKFEQEPKDMTVDVGAAVLFDCRVTGDPQPQITWKRKNEPMPV 268

lllll I I hi hi I : I :: I I : I I III I : |:: : :| 
Db 245 mvgeresevaeltvlerpsfvkrpsnlavtvddsaefkceargdpvptvrwrkddgelpk 304 

Qy 269 TRAYIAKDNRGLRIERVQPSDEGEYVCYARNPAGTLEASAHLRVQAPPSFQTKPADQSVP 328 

:| I :|: hi : I I I I I I I Nil I II II I II II I 
Db 305 sr-yeirddhtlkirkvtagdmgsytcvaenmvgkaeasatltvqepphfvvkprdqvva 363

Qy 329 AGGTATFECTLVGQPSPAYFWSKEGQQDLLFPSY--VSADGRTKVSPTGTLTIEEVRQVD 386
I I 11:1 I I II II :H hill II : I II II III I:: I
Db 364 lgrtvtfqceatgnpqpaifwrregsqnllf-syqppqsssrfsvsqtgdltitnvqrsd 422

Qy 387 EGAYVCAGMNSAGSSLSKAALKATFETKGRVQKKKSKMGKQKQKNVQSIIKYLISAVTGN 446

I hi :| III ::H I: II 

Db 423 vgyyicqtlnvagsiitkayle vtdv 448 

Qy 447 TPARPPPTIEHGHQNQTLMVGSSAILPCQASGKPTPGISWLRDGLPIDITDSRISQHSTG 506 

I I III: I : :| I |:| II I I :||: : Mil I I 
Db 449 iadrpppvirqgpvnqtvavdgtfvlscvatgspvptilwxkdgvlvstqdsrikqleng 508 

Qy 507 SLHIADLKKPDTGVYTCIAKNEDGESTWSASLTVEDHTSNAQFVRMPDPSNFPSSPTQPI 566

II Mil lllll Ihllll : |:: II II: ll:|::| 

Db 509 vlqiryaklgdtgrytciastpsgeatwsayievqefgvpvqpprptdpnlipsapskpe 568 

Qy 567 IVNVTDTEVELHWNAPSTSGAGPITGYIIQYYSPDLGQTWFNIPDYVASTEYRIKGLKPS 626 

: :h I I I lllll III: :l I :| : : I : UNI: 
Db 569 vtdvsrntvtlswqpnlnsgatp-tsyiieafshasgsswqtvaenvktetsaikglkpn 627 

Qy 627 HSYMFVIRAENEKGIGTPSVSSALVTTSKPAAQVALSDKKKMDMAIAERRLTSEQLIRLE 686 

:|::M I II II I II II :| :: I :: I 
Db 628 aiylflvraanaygisdpsqisdpvkt qdvlptsqgvdhkqvqrel-gnavlhlh 681 

Qy 687 EVKTINSTAVRLFWKKRKLEELIDGYYIKWRGPPRTNDNQ----YVNVTSPSTENYVVSN 742

::|::: : I : : I II I :l I I : I :|: : h : 
Db 682 nptvlssssievhwtvdqqsqylqgykilyr-psganhgesdwlvfevrtpaknsvvipd 740 

Qy 743 LMPFTNYEFFVIPYHSGVHSIHGAPSNSMDVLTAEAPPSLPPEDVRIRML--NLTTLRIS 800
,1 III I: : II I I I II II: I : I I : :|
Db 741 lrkgvnyeikarpf---fnefqgadseikfaktleeapsappqgvtvskndgngtailvs 797

Qy 801 WKAPKADGINGILKGFQIVIVGQAPNNNRNITTNERAASVTLFHLVTGMTYKIRVAARSN 860

I: I I II::: ::: :| : I I : II : II I: I : III : 
Db 798 wqpppedtqngmvqeykvwclgnetryhinktvdgstfswipflvpgirysvevaastg 857 

Qy 861 GGVGV SHGTSEVIMNQDTLEKHLAAQQENESFLYGLINRSHVPVIVIVAI 910 

I II :M :l :| : :: : :|: |: : I 
Db 858 agsgvksepqfiqldahgnpvspedqvslaqqisdvvkqpafiagigaacwi i 910 

Qy 911 LIIFWIIIAYCYWRNSRNSD — GKDRSF I K I NDGSVHMASN — NLWDVAQN 958 

|::| : : : II I I II : I ::| I :::: 
Db 911 lmvfsiwlyrhrkkrngltstyagirkvpsftftptvtyqrggeavssggrpgllnise- 969 

Qy 959 PNQNPMYNTAGRMTMNNRNGQALYSLTPNAQDFFNNCDDYSGTMHRPGSEHHYHYAQLTG 1018 

II I II I I : :| II II : II 

Db 970 paaqpwladtwpntgnnhndcsiscctagngnsdsnlttys — rpadclanynnqldn 1025 

Qy 1019 GPGHAM — STFYG NQYHD -DPSPYATTTLVLSN — 1048 

II II II I: :: MUM |: I 

Db 1026 kqtnlmlpestvygdvdlsnkinemktfnspnlkdgrfvnpsgqptpyattqliqsnlsn 1085 



Qy 1049 QQ PAWLN— DKMLRAPAMPTNPVPPEPP- -ARYADH 1080 

II I I I : III | I : 

Db 1086 nmnngsgdsgekhwkplgqqkqevapvqyniveqnklnkdyrandtvpptipynqsydqn 1145 

Qy 1081 TAGRRSRSSRASDGRGTLNGGLHHRTSGSQRSDSPPHTDVSYVQLHSSDGTGSSRERTGE 1140 

II : I II lllll I I 

Db 1146 tggsynssdrgss tsgsq ghkk — g 1168 

Qy 1141 RRTP--PNKTLM— DFIPPPPSNPPP PGGHVY— D 1169 

III I : I I :IMI::IM I :| I 

Db 1169 artpkvpkqggmnwadllppppahppphsnseeynisvdesydqempcpvpparmylqqd 1228 

Qy 1170 TATRRQLNRGSTP 1182 
: II I 

Db 1229 eleeeedergptp 1241 



RESULT 6 
Y08404 

ID Y08404 standard; Protein; 1649 AA.
XX
AC Y08404;
XX
DT 24-JUL-1999 (first entry)
XX
DE Human ROBO1 protein.
XX
KW ROBO1; ROBO2; roundabout; nerve guidance; human; murine; cell function;
KW cell morphology; screening assay.
XX
OS Homo sapiens.
XX
PN WO9920764-A1.
XX
PD 29-APR-1999.
XX
PF 20-OCT-1998; 98WO-US22164.
XX
PR 14-NOV-1997; 97US-0971172.
PR 20-OCT-1997; 97US-0062921.
XX
PA (REGC ) UNIV CALIFORNIA.
XX
PI Goodman CS, Kidd T, Mitchell KJ, Tear G;
XX
DR WPI; 1999-312615/26.
DR N-PSDB; X08404.
XX
PT Robo polypeptides, a new immunoglobulin superfamily member
XX
PS Claim 1; Page 65-71; 80pp; English.
XX
CC This invention describes novel Robo (roundabout) polypeptides, involved
CC in nerve guidance which have been isolated from Drosophila sp.,
CC C. elegans, human and murine samples. The products of the invention can
CC be used to raise anti-Robo antibodies, which can be used to modulate cell
CC function or morphology. The Robo polynucleotides and fragments are useful
CC as probes and primers and for production of the Robo polypeptides. The
CC probes and primers are also useful in screening assays.
XX
SQ Sequence 1649 AA;



Query Match 21.7%; Score 1489.5; DB 20; Length 1649; 

Best Local Similarity 32.3%; Pred. No. 3.1e-83; 

Matches 405; Conservative 179; Mismatches 487; Indels 183; Gaps 37; 
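The "Best Local Similarity" figure appears to be the number of identical columns divided by all aligned columns (Matches + Conservative + Mismatches + Indels). That definition is an inference from the figures printed in this listing, not something the listing states, and the helper names below (`parse_counts`, `best_local_similarity`) are hypothetical. A minimal Python sketch under that assumption:

```python
import re


def parse_counts(line: str) -> dict[str, int]:
    """Parse 'Name value;' pairs from a summary line such as the 'Matches ...' line."""
    return {name: int(value) for name, value in re.findall(r"([A-Za-z]+)\s+(\d+);", line)}


def best_local_similarity(counts: dict[str, int]) -> float:
    """Percent of identical columns over all aligned columns (assumed definition)."""
    columns = (
        counts["Matches"]
        + counts["Conservative"]
        + counts["Mismatches"]
        + counts["Indels"]
    )
    return 100.0 * counts["Matches"] / columns


line = "Matches 405; Conservative 179; Mismatches 487; Indels 183; Gaps 37;"
counts = parse_counts(line)
print(f"{best_local_similarity(counts):.1f}%")  # 32.3%
```

Applied to the summary line above, this reproduces the printed 32.3%. The same arithmetic is a quick check on summary lines where the scan has damaged the percentage.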

Qy 30 PVIIEHPIDVWSRGSPATLNCGAK-PSTAKITWYKDGQPVITNKEQVNSHRIVLDTGSL 88 

11:111 ::ll:l llllll I: I I IM I: I hi: lll::| HII 
Db 68 privehpsdlivskgepatlnckaegrptptiewykggervetdkddprshrmllpsgsl 127 

Qy 89 FLLRVNSGKNGRDSDAGAYYCVASNEHGEVRSNEGSLRLAMLREDFRVRPRTVQALGGEM 148 

I I:: |:<: I I I III I II |: IMMUII I I II 
Db 128 fflrivhgrksr-pdegvyvcvarnylgeavshnaslevailrddfrqnpsdvmvavgep 186



Qy 149 AVLECSPPRGFPEPVVSWRKDDKELRIQDMPRYTLHSDGNLIIDPVDRSDSGTYQCVANN 208

11:1! Mil III :lhll I |:| :||:| I I I
Db 187 avmecqpprghpeptisvkkdgsplddkd-eritirg-gklmitytrksdagkyvcvgtn 244

Qy 209 MVGERVSNPARLSVFEKPKFEQEPKDMTVDVGAAVLFDCRVTGDPQPQITWKRKNEPMPV 268

Mill I I 1:1 1:1 I : I :: I I : I I III I : I : : : :|
Db 245 mvgeresevaeltvlerpsfvkrpsnlavtvddsaefkceargdpvptvrwrkddgelpk 304

Qy 269 TRAYIAKDNRGLRIERVQPSDEGEYVCYARNPAGTLEASAHLRVQAPPSFQTKPADQSVP 328

. :l I :| I I I I I I llll I II || | || || |
Db 305 sr-yeirddhtlkirkvtagdmgsytcvaenmvgkaeasatltvqepphfvvkprdqvva 363

Qy 329 AGGTATFECTLVGQPSPAYFWSKEGQQDLLFPSY--VSADGRTKVSPTGTLTIEEVRQVD 386

I I 11:1 I I II II :ll hill II : I II II III |:: I
Db 364 lgrtvtfqceatgnpqpaifwrregsqnllf-syqppqsssrfsvsqtgdltltnvqrsd 422

Qy 387 EGAYVCAGMNSAGSSLSKAALKATFETKGRVQKKKSKMGKQKQKNVQSIIKYLISAVTGN 446

I hi :l 11,1 ::H h II
Db 423 vgyyicqtlnvagsiitkayle vtdv 448

Qy 447 TPAKPPPTIEHGHQNQTLMVGSSAILPCQASGKPTPGISWLRDGLPIDITDSRISQHSTG 506

:HI I l|MI|: I : :| I hi I I I I :||: : llll I I
Db 449 iadrpppvirqgpvnqtvavdgtfvlscvatgspvptilwrkdgvlvstqdsrikqleng 508

Qy 507 SLHIADLKKPDTGVYTCIAKNEDGESTWSASLTVEDHTSNAQFVRMPDPSNFPSSPTQPI 566

I I I llflnill 11:111 : I:: I I ||: lhh:l
Db 509 vlqiryaklgdtgrytciastpsgeatwsayievqefgvpvqpprptdpnlipsapskpe 568

Qy 567 IVNVTDTEVELHWNAPSTSGAGPITGYIIQYYSPDLGQTWFNIPDYVASTEYRIKGLKPS 626

: :h II Si III I I III: :l I :| : : I : lllllh
Db 569 vtdvsrntvtlswqpnlnsgatp-tsyiieafshasgssvqtvaenvktetsaikglkpn 627

Qy 627 HSYMFVIRAENEKGIGTPSVSSALVTTSKPAAQVALSDKNKMDMAIAEKRLTSEQLIKLE 686

I:|::|| I ill II I I I II :| :: I :: |
Db 628 aiylflvraanayglsdpsqisdpvkt qdvlptsqgvdhkqvqrel-gnavlhlh 681

Qy 687 EVKTINSTAVRLFWKKRKLEELIDGYYIKWRGPPRTNDNQ----YVNVTSPSTENYVVSN 742

"I::: : I : : I II I :| I I : I :|: : h :
Db 682 nptvlssssievhwtvdqqsqyiqgykilyr-psganhgesdwlvfevrtpaknsvvipd 740

Qy 743 LMPFTNYEFFVIPYHSGVHSIHGAPSNSMDVLTAEAPPSLPPEDVRIRML--NLTTLRIS 800

I III li : II I I I II II: I : M : :|
Db 741 lrkgvnyeikarpf---fnefqgadseikfaktleeapsappqgvtvskndgngtailvs 797

Qy 801 WKAPKADGINGILKGFQIVIVGQAPNNNRNITTNERAASVTLFHLVTGMTYKIRVAARSN 860

I: I I II: I : I I : Ihllhhlll:
Db 798 wqpppedtqngmvqeykvwclgnetryhinktvdgstfswipflvpgirysvevaastg 857



Qy 861 GGVGV SHGTSEVIMNQDTLEKHLAAQQENESFLYGLINKSHVPVIVIVAI 910 

I II :M :| :| : :: : :|: h : I 
Db 858 agsgvksepqfiqldahgnpvspedqvslaqqisdwkqpafiaglgaacwi i 910 

Qy 911 LIIFWIIIAYCYWRNSRNSD — GKDRSFIKINDGSVHMASN- - -NLWDVAQN 958

h:| : : : II I III : I ::| I :::: 
Db 911 lmvfsiwlyrhrkkrngltstyagirkvpsftftptvtyqrggeavssggrpgllnise- 969 

Qy 959 PNQNPMYNTAGRMTMNNRNGQALYSLTPNAQDFFNNCDDYSGTMHRPGSEHHYHYAQLIG 1018 

II I II I :: I : :| II II : II 

Db 970 paaqpwladtwpntgnnhndcsiscctagngnsdsnlttya — rpadclanynnqldn 1025 

Qy 1019 GPGNAM— STFYGNQYHDDPSP.YATTTLVLSNQQPAWLND-KMLRAPAMPTNPV-PPE 1072 

I I II II: : III: :|: I :| : I 

Db 1026 kqtnlmlpestvygd vdlsnk — ineraktfnspnlkdgrfvnpsg 1068 

Qy 1073 PPARYADHTAGRRSRSSRASDGRGTLNGGLHHRTSGSQR 1111

I II I : = h "I I :| I : I h 
Db 1069 qptpya--ttiqsnlsnnmnngsgd-sgekhwkplgqqkqevapvqyniveqnklnkdyr 1125 

Qy 1112 -SDSPPHT — DVSYVQ LHSSDGTGSSKERTGER— RTP-PNKTLM— DFIP 1154

:h I I : II I :lll h I : III I : I I :| 
Db 1126 andtvpptipynqsydqntggsynssdrgsstsgsqghkkgartpkvpkqggmnwadllp 1185 



Qy 1155 PPPSNPPP PGGHVY- - -DTATRRQLNRGSTP 1182 

1 1 1 1 1 1 I :| I : II II 

Db 1186 pppahppphsnseeynisvdesydqempcpvpparmylqqdeleeeedergptp 1239 



RESULT 7
Y08402

ID Y08402 standard; Protein; 1380 AA.
XX
AC Y08402;
XX
DT 24-JUL-1999 (first entry)
XX
DE Drosophila sp. ROBO2 extracellular domain protein.
XX
KW ROBO1; ROBO2; roundabout; nerve guidance; human; murine; cell function;
KW cell morphology; screening assay.
XX
OS Drosophila sp.
XX
PN WO9920764-A1.
XX
PD 29-APR-1999.
XX
PF 20-OCT-1998; 98WO-US22164.
XX
PR 14-NOV-1997; 97US-0971172.
PR 20-OCT-1997; 97US-0062921.
XX
PA (REGC ) UNIV CALIFORNIA.
XX
PI Goodman CS, Kidd T, Mitchell KJ, Tear G;
XX
DR WPI; 1999-312615/26.
DR N-PSDB; X57251.
XX
PT Robo polypeptides, a new immunoglobulin superfamily member
XX
PS Claim 1; Page 52-56; 80pp; English.
XX
CC This invention describes novel Robo (roundabout) polypeptides, involved
CC in nerve guidance which have been isolated from Drosophila sp.,
CC C. elegans, human and murine samples. The products of the invention can
CC be used to raise anti-Robo antibodies, which can be used to modulate cell
CC function or morphology. The Robo polynucleotides and fragments are useful
CC as probes and primers and for production of the Robo polypeptides. The
CC probes and primers are also useful in screening assays.
XX
SQ Sequence 1380 AA;



Query Match 19.7%; Score 1350; DB 20; Length 1380; 

Best Local Similarity 27.0%; Pred. No. 9.1e-75;

Matches 386; Conservative 217; Mismatches 494; Indels 334; Gaps 45; 

Qy 30 PVIIEHPIDVWSRGSPATLNCGAKPS-TAKITWYKDGQPVITNKEQVNSHRIVLDTGSL 88 

I MM!: | : | | || I: : | | |:|||: : I lllhl I I 
Db 4 priiehpmdttvpkndpf tf ncqaegnptptiqvfkdgrel- - -ktdtgshrimlpaggl 60 

Qy 89 FLLKVNSGKNGKDSDAGAYYCVASNEHGEVKSNEGSLKLAMLREDFRVRPRTVQALGGEM 148 

I III : ::llll hi llll M : I : : i : ! I : : 1 1 : | : ||: 
Db 61 fflkvihsr-resdagtywceaknefgvarsrnatlqvavlrdefrlepantrvaqgev 118 

Qy 149 AVLECSPPRGFPEPWSWRKDDKELRIQDMPRYTLHSDGNLIIDPVDRSDSGTYQCVANN 208 

MM iir III :||||: : I : I : III I Ml I llll I 
Db 119 almecgaprgspepqiswrkngqtlnlvgnkririvdggnlaiqearqsddgryqcvvkn 178 

Qy 209 MVGERVSNPASLSVFEKPKFEQEPKDMTVDVGAAVLFDCRVTGDPQPQITWKR • • KNEPM 266 

ill I M I M : h: I 1 1 : : I : I M III I : hi I 
Db 179 wgtresataflkvhvrpflirgpqnqtavvgssvvfqcriggdplpdvlwrrtasggnm 238 



Qy 267 PVT----------RAYIAKDNRGLRIERVQPSDEGEYVCYARNPAGTLEASAHLRVQAPP 316



Mon Jan 22 13:04:31 2001 



us-09-540-245a-17 .rag 



Page 8 



I: I :| I M: I I 1 1 1 I I II : I : I I 1 1 1
Db 239 plrkfswlhsasgrvhvled-rslklddvtledmgeytceadnavggitatgiltvhapp 297

Qy 317 SFQTKPADQSVPAGGTATFECTLVGQPSPAYFWSKEGQQDLLFPSYVSADGRTKVSPT-- 374

I :l :l I I III III :ll II II I I III :|: I
Db 298 kfvirpknqlveigdevlfecqanghprptlywsvegnsslllpgy--rdgrmevtltpe 355

Qy 375 --GTLTIEEVRQVDEGAYV-CAGMNSAGSSLSK--AALKATFETKGRVQKKKSKMGKQKQ 429

hi Mill :|: II I: :: II
Db 356 grsvlsiarfaredsgkwtcnalnavgsvssrtwsvdtqfel 399

Qy 430 KNVQSIIKYLISAVTGNTPAKPPPTIEHGHQNQTLMVGSSAILPCQASGKPTPGISWLRD 489

III II I llll I I :IN: I I I :H I
Db 400 pppiieqgpvnqtlpvksiwlpcrtlgtpvpqvswyld 438

Qy 490 GLPIDITD-SRISQHSTGSLHIADLKK-PDTGVYTCIAKNEDGESTWSASLTVEDHTS-N 546

hill: : I : hi hlh: I hllhl I : 1 : 1 : 1 1 I :: h I
Db 439 gipidvqehermlsdagaltisdlqrhedeglytcvasnrngkssvsgylrldtptnpn 498

Qy 547 AQFVRMPDPSNFPSSPTQPIIVNVTDTEVELHWNAPSTSGAGPITGYIIQYYSPDLGQTW 606
I I h I :| I :| :| : I I I : | : ||:|: : : |

Db 499 ikffrapelstypgppgkpqmvekgensvtlgwtrsnkvggsslvgyviemfgknetdgw 558

Qy 607 FNIPDYVASTEYRIKGLKPSHSYMFVIRAENEKGIGTPSVSSALVTTSKPAAQVALSDKN 666

: I :l : II I =1 hlllll h HIM I
Db 559 vavgtrvqnttftqtglipgvnyffliraenshglslpspmsepitvgtryfnsgl---- 614

Qy 667 KMDMAIAEKRLTSEQLIKLEEVKTINSTAVRLFWKKRKLEELIDGYYIKWR 717

h: I II :::| ::||:::| |: : ::|:|: I
Db 615 --dlsearasllsgdvvelsnaswdstsnikltwqiin-gkyvegfyvyarqlpnpivnn 671

Qy 718 -GPPRTNDNQYVNVTSPS 734

I :| I : II I

Db 672 papvtsntnpllgststsasasasasalistkpniaaagkrdgetnqsgggaptplntky 731

Qy 735 TENYWSNLMPFTNYEFFVIPYHSGVHSIHGAPSNSMDVLTAEAPPSLPP 784

: :: h :| lll|::|:: h Mill ' II II I
Db 732 rmltilngggassctitglvqytlyeffivpfyk---svegkpsnsriartledvpseap 788

Qy 785 EDVRIRMLNLTTLRISWKAPKADGINGILKGFQIVIVG--QAPNNNR---NITTNERAAS 839

: :IM : : lllh :|:| : ::: | I I :| |:| : : :
Db 789 ygniealllnssavflkwkapelkdrhgvllnyhvivrgidtahnfsrlltnvtidaaspt 848

Qy 840 VTLFHLVTGMTYKIRVAARSNGGVGVSHGTSEVIMNQDTLEKHLAAQQENESFLYGLINK 899 

M :| IM Mil :| III : I M I M: : 
Db 849 lvlanltegvmytvgvaagnnagvgpy-cvpatlrldpitkrl dpfln— qr 897 

Qy 900 SHVP VIVIVAILIIFWIIIAYCYWRNSRNSDGKDRSFIKINDGSVH-MASN 950 

II :h: III : :: I : : I : : ::M I 

Db 898 dhvndvltqpwfiillgailavlmlsfgamvfvk rkhiranmkqsalntmrgn 948

Qy 951 NLWDVAQNPNQNPMYNTAGRMTMNNRNGQAIYSLTPNAQDFFNNCDDYSGTMHRPG 1006

: II M: :: I II II I I Ml 

Db 949 htsdvlkmps lsarngngywldsst ggmvwrpspggd 985 

Qy 1007 SEHHYHYAQLTGGPGN 1022 

:| II M II: 

Db 986 slemqkdhiadyapvcgapgspagggtssggsggagsgasggddihgghgsernqqryvg 1045 

Qy 1023 AMSTF YGNQYHDDPSPYATTTLVLSNQQPAWLNDKMLRAPAMPT 1066 

:|:l II : Mill:::: :|| : :| 
Db 1046 eysniptdyaevssfgkapseygrhgnaspapyatssilsphqq qqqqqpryqq 1099 

Qy 1067 NPVP PEPPARYADHTAGRRSRSSRASDGRGTLNGGLHHRTSGSQRSDSPPHTDV 1120 

III II I :: : :| I : h I ::: 

Db 1100 rpvpgyglqrpmhp hyqqqqhqqqqaqq thqqhqalqqhqqlppsni 1146 

Qy 1121 SYVQLHSSD GTGSSKERTGER-RTPPNKTLMDFIPPPPSNPPPPGGHVYDTATRR 1174 

I h :: II h h Ml I II MM 
Db 1147 -yqqmsttseiyptntgpsrsvyseqyyypkdkqrhihienklsn chtyeaapga 1200 

Qy 1175 QLNRGSTPREDTYDSVSDGAFARVDVNARPTSRNRNLGGRPLKGK-RDDDSQRSSLMMD 1232 
M: | ::| :| I : M 



Db 1201 ---kqsspissqfas--vrrqqlppncsigresarfkvlntdqgknqqnlld 1247



Qy 1233 DDGGSSEADGENSEGDVPRGGVRKAVPRMGISASTLA HSCYGT 1275 

III :h I I: II 

Db 1248 ldgssmcyngladsg cggspspmamlmshedehalyht 1285 



RESULT 8
Y13564

ID Y13564 standard; Protein; 1381 AA.
XX
AC Y13564;
XX
DT 30-JUL-1999 (first entry)
XX
DE Drosophila Robo 2 polypeptide.
XX
KW Comm polypeptide; Robo polypeptide; commissureless; roundabout;
KW modulation; nerve cell function.
XX
OS Drosophila sp.
XX
PN WO9925833-A1.
XX
PD 27-MAY-1999.
XX
PF 13-NOV-1998; 98WO-US24327.
XX
PR 14-NOV-1997; 97US-0065543.
XX
PA (REGC ) UNIV CALIFORNIA.
XX
PI Goodman C, Kidd T, Mitchell KJ, Russell C, Tear G;
XX
DR WPI; 1999-338008/28.
DR N-PSDB; X55768.
XX
PT Modulation of Robo-Comm polypeptide interactions
XX
PS Disclosure; Page 34-38; 56pp; English.
XX
CC The invention relates to a method for modulating the amount of Comm
CC (commissureless) polypeptide in contact with a cell expressing active
CC Robo (roundabout) on its surface. The method comprises modulating the
CC effective amount of Comm polypeptide in contact with the cell, where the
CC amount of expressed active Robo is specifically modulated inversely with
CC the modulation of the effective amount of Comm in contact with the cell.
CC The method is used to modulate the amount of active Robo expressed on a
CC cell. The method can be used to screen for agents that modulate Robo:Comm
CC interactions. This is particularly useful for modulating nerve cell
CC function.
XX
SQ Sequence 1381 AA;



Query Match 19.6%; Score 1344.5; DB 20; Length 1381;

Best Local Similarity 26.8%; Pred. No. 2e-74;

Matches 386; Conservative 216; Mismatches 486; Indels 353; Gaps 

Qy 30 PVIIEHPIDVVVSRGSPATLNCGAKPS-TAKITWYKDGQPVITNKEQVNSHRIVLDTGSL 88

I HUM M || II h M I MM: : | | |

Db 4 priiehpmdttvpkndpftfncqaegnptptiqwfkdgrel---ktdtgshrimlpaggl 60

Qy 89 FLLKVNSGKNGKDSDAGAYYCVASNEHGEVKSNEGSLKLAMLREDFRVRPRTVQALGGEM 148 

I III : MM M I || | :| :h:|:||::||: I : lh 
Db 61 fflkvihsr-resdagtywceaknefgvarsrnatlqvavlrdefrlepantrvaqgev 118 

Qy 149 AVLECSPPRGFPEPWSWRKDDKELRIQDMPRYTLHSDGNLIIDPVDRSDSGTYQCVANN 208 

Ml III III MM M : I : III I Ml llll I 
Db 119 almecgaprgepepqiswrkngqtlnlvgnkririvdggnlaiqearqsddgryqcvvkn 178 

Qy 209 MVGERVSNPARLSVFEKPKFEQEPKDMTVDVGAAVLFDCRVTGDPQPQITWKR--KNEPM 266
:ll II M I :| : M I IMM lh III MM I
Db 179 vvgtresataflkvhvrpflirgpqnqtavvgssvvfqcriggdplpdvlvrrtasggnm 238

Qy 267 PVT RAYIAKDNRGLRIERVQPSDEGEYVCYARNPAGTLEASAHLRVQAPP 316 

I: I :: :l I I::: I I 1 1 1 I I I I : I : I I 1 1 1 

Db 239 plrkfswlhsasgrvhvled-rslklddvtledmgeytceadnavggttatgiltvhapp 297 

Qy 317 SFQTKPADQSVPAGGTATFECTLVGQPSPAYFWSKEGQQDLLFPSYVSADGRTKVSPT-- 374

I :l :| I I III I I I :H II II I I III :|: I 

Db 298 kfvirpknqlveigdevlfecqanghprptlywsvegnsslllpgy--rdgrmevtltpe 355

Qy 375 --GTLTIEEVRQVDEGAYV-CAGMNSAGSSLSK--AALKATFETKGRVQKKKSKMGKQKQ 429

1:1 : I I I I ;h II I: :: II 

Db 356 grsvlsiarfaredsgkwtcnalnavgsvssrtvvsvdtqfel 399 

Qy 430 KNVQSIIRYLISAVTGNTPAKPPPTIEHGHQNQTLMVGSSAILPCQASGKPTPGISWLRD 489 

III II I Mil I I :lll: MM I 

Db 400 pppiieqgpvnqtlpvksiwlpcrtlgtpvpqvswyld 438 

Qy 490 GLPIDITD-SRISQHSTGSLHIADLKK-PDTGVYTCIAKNEDGESTWSASLTVEDHTS-N 546
hill: : I : hi hll- I hllhl I : 1 : 1 : 1 1 I :: hi 

Db 439 gipidvqeherrnlsdagaltisdlqrhedeglytcvasnmgksswsgylrldtptnpn 498 

Qy 547 AQFVRMPDPSNFPSSPTQPIIVNVTDTEVELHWNAPSTSGAGPITGYIIQYYSPDLGQTW 606

:| I h I :| I :| :| : I I I : I : ||:|: : : I 

Db 499 ikffrapelstypgppgkpqmvekgensvtlswtrsnkvggsslvgyviemfgknetdgw 558 

Qy 607 FNIPDYVASTEYRIKGLKPSHSYMFVIRAENEKGIGTPSVSSALVTTSKPAAQVALSDKN 666 

: I :| : III: MINI h II I :| I 

Db 559 vavgtrvqnttftqtgllpgvnyffliraenshglslpspmsepitvgtryfnsgl — 614 

Qy 667 KMDMAIAEKRLTSEQLIKLEEVKTINSTAVRLFWKKRKLEELIDGYYIKWR 717

h: I II :::| ::||:::| h : ::|:|: I 

Db 615 "dlsearasllsgdwelsDaswdstsnikltwqiin-gkyvegfyvyarqlpDpivnD 671 

Qy 718 -GPPRTNDNQYVNVTSPS 734 

I :l I : II I 

Db 672 papvtsntnpllgststsasasasasalistkpniaaagkrdgetnqsgggaptplntky 731 

Qy 735 TENYWSNLMPFTNYEFFVIPYHSGVHSIHGAPSNSMDVLTAEAPPSLPP 784

: :: h :| lllh:|:: h I Mil I I II I 

Db 732 rmltilngggassctitglvqytlyeffivpfyk—svegkpsnsriartledvpseap 788 

Qy 785 EDVRIRMLNLTTLRISWKAPKADGINGILKGFQIVIVG--QAPNNNR---NITTNERAAS 839

: :ll : : lllh :|:| : ::: I I I :| hi : : : 

Db 789 ygraealllnssavflkwkapelkdrhgvllnyhvivrgidtahnfsriUnvtidaaspt 848 

Qy 840 VTLFHLVTGMTYKIRVAARSNGGVGVSHGTSEVIMNQDTLEKHLAAQQENESFLYGLINK 899
: I :| h I : III :l III : I : I I : h : 

Db 849 lvlanltegvmytvgvaagnnagvgpy-cvpatlrldpitkrl dpfin—qr 897 

Qy ' 900 SHVP VIYIVAILI IFWI II AYC YWRNSRNSDGKDRSF IKINDGSVH -MASN 950 

II :|:: III : :: I : : I : : ::: I I 

Db 898 dhvndvltqpwfiillgailavlmlsfgamvfvk rkhmmmkqsalntmrgn 948 

Qy 951 NLWDVAQNPNQNPMYNTAGRMTMNNRNGQALYSLTPNAQDFFNNCDDYSGTMHRPG-— 1006 

: II : I: : : 111 III I : II 

Db 949 htsdvlkmps lsarngngywldsst ggmvwrpspggd 985

Qy 1007 SEHHYHYAQLTGGPGN 1022

:| 11:1 II: 

Db 986 slemqkdhiadyapvcgapgspagggtssggsggagsgasggddihgghgsernqqryvg 1045 

Qy 1023 AMSTF YGNQYHDDPSPYATTTLVLSNQQPAWLNDKMLRAPAMPT 1066 

:hl II : hllll:::: :|| : : I 

Db 1046 eysniptdyaevssfgkapseygrhgnaspapyatssilsphqq qqqqqpryqq 1099 

Qy 1067 NPVP PEPPARYADHTAGRRSRSSRASDGRGTLNGGLHHRTSGSQRSDSPPHTDV 1120 

III I I I :: : :| I : |: | ;:; 

Db 1100 rpvpgyglqrpmhp hyqqqqhqqqqaqq thqqhqalqqhqqlppsni 1146 

Qy 1121 SYVQLHSSD GTGSSKERTGERRTPP NKTLMDFIPPPPSNPPPPG 1164 

I h :: II h h I II II 

Db 1147 -yqqmsttseiyptntgpsrsvyseqyyypkdkqrhihitenkl sn 1191 



Qy 1165 GHVYDTATRRQLNRGSTPREDTYDSVSDGAFARVDVNARPISRNRNLGGRPLKGK-RDD 1222 

I h I : hi : I I : I ::| : I I 

Db 1192 chtyeaapga— kqsspissqfas vrrqqlppncsigresarfkvlntd 1238 

Qy 1223 DSQRSSLMMDDDGGSSEADGENSEGDVPRGGVRKAVPRMGISASTLA HSCYG 1274 

: ::| J I I -\ I III: hi 

Db 1239 qg knqqn 1 Id 1 dg s smcyng Lads g cggspspmamlmshedehalyh 1285 

Qy 1275 T 1275
I 

Db 1286 t 1286 



RESULT 9 
W83927 

ID W83927 standard; Protein; 753 AA.
XX
AC W83927;
XX
DT 01-MAR-1999 (first entry)
XX
DE Human T85 protein.
XX
KW T85; FHMB-6D4; FMHV-SD4; human; neurological disorder; therapy;
KW diagnosis.
XX
OS Homo sapiens.
XX
FH Key Location/Qualifiers
FT Peptide 1..20
FT /label= Sig_peptide
FT Protein 21..753
FT /label= Mat_protein
FT Region 525..610
FT /note= "has homology to a fibronectin type III
FT domain"
FT Region 638..727
FT /note= "has homology to a fibronectin type III
FT domain"
FT Region 43..101
FT /note= "has homology to a Ig superfamily domain"
FT Region 145..203
FT /note= "has homology to a Ig superfamily domain"
FT Region 237..298
FT /note= "has homology to a Ig superfamily domain"
FT Region 329..394
FT /note= "has homology to a Ig superfamily domain"
FT Region 433..491
FT /note= "has homology to a Ig superfamily domain"
FT Peptide 247..249
FT /note= "RGD motif"
FT Domain 516..600
FT /note= "cytokine receptor homology N-terminal
FT domain"
XX
PN WO9848051-A2.
XX
PD 29-OCT-1998.
XX
PF 17-APR-1998;
XX
PR 10-OCT-1997; 97US-0062017.
PR 18-APR-1997; 97US-0044746.
XX
PA (MILL-) MILLENNIUM BIOTHERAPEUTICS INC.
XX
PI Holtzman D, McCarthy SA;
XX
DR WPI; 1999-024021/02.
DR N-PSDB; V69278.
XX
PT New isolated human FTHMA-070 and T85 proteins - used to develop
PT products for the diagnosis and therapy of disorders involving
PT cellular processes, e.g. neuronal development.
XX
PS Claim 31; Fig 3; 127pp; English.
XX
CC This is the amino acid sequence of a novel human protein designated
CC T85, and also referred to as FMHB-6D4 and FMHB-SD4. T85 cDNA (see
CC V69278) was identified in a human foetal brain cDNA library using a
CC screen designed to identify genes encoding proteins having a
CC functional signal sequence. T85 nucleic acids and polypeptides of
CC the invention are useful as modulating agents in regulating a
CC variety of cellular processes. They can be used for identifying
CC compounds which bind to or modulate the activity of the polypeptides
CC (claimed). They can also be used in screening assays, detection
CC assays (e.g. chromosomal mapping, tissue typing, forensic biology),
CC predictive medicine (e.g. diagnostic assays, prognostic assays,
CC monitoring clinical trials, and pharmacogenomics), and methods of
CC treatment (e.g. therapeutic and prophylactic) e.g. for neurological
CC disorders.
XX
SQ Sequence 753 AA;



Query Match 18.5%; Score 1265; DB 20; Length 753; 

Best Local Similarity 38.3%; Pred. No. 5.9e-70; 



hill I::||:| MINI h I I III h I hh lll::l :|ll 



I I:: I: : I I I III I II I: ll::|:||:lll I I II 
f f lr ivhgrk sr -pdegvyvcvarnylgeavs hnaslevailrddf rqnps dvmvavgep 147 



11:1 Mil III :||:|| I :| I |: I |:| :||:| I II I 



Mill I I 1:1 1:1 I : I :: I I : I I III I : h: 



:! I :|: hi :| I I I I I I I llll III II I II II 



Matches 


Qy 


30 


Db 


29 


Qy 


89 


Db 


89 


Qy 


149 


Db 


148 


Qy 


209 


Db 


206 


Qy 


269 


Db 


266 


Qy 


326 




325 


i 


384 


Db 


384 


Qy 


444 


Db 


410 


Qy 


504 


Db 


,470 


Qy. 


564 


Db 


530 


Qy 


624 


Db 


589 


Qy 


684 



I I I II: !M I || || :|| |:||l || 



I I hi :| III ::ll I'- 



ll II III 



:HI I I III: I : : I I I : I 1 I I I =lh 



III I III Mill Ihllll : : 



I I II: 1 1 : 1 : 



III I I llh :| I :l 



I : llll 



II: l:|::l I II III I I 



II :| :: I :: 
--qdvlptsqgvdhkqvqrel-gnavl 642 



MM I 



I :|: : I 



Db 643 hlhnptvlssssievhwtvdqqsqyiqgykilyr-psganhgesdwlvfevrtpaknsvv 701 

Qy 740 VSNLMPFTNYEFFVIPYHSGVHSIHGAPSNSMDVLTAE 777 

: :l ,111 h : II I 'II 
Db 702 ipdlrkgvnyeikarpf---fnefqgadseikfaktle 736 



RESULT 10 
W06485 

ID W06485 standard; Peptide; 1018 AA.
XX
AC W06485;
XX
DT 08-AUG-1997 (first entry)
XX
DE Rat contactin ligand for RPTPbeta.
XX
KW CAH; receptor type tyrosine phosphatase beta; general ataxia;
KW amyotrophic lateral sclerosis; Parkinson's disease;
KW Alzheimer's disease; Huntington's disease; general neuropathy;
KW cerebral palsy; neurological trauma; mental retardation.
XX
OS Rattus rattus.
XX
PN WO9637776-A1.
XX
PD 28-NOV-1996.
XX
PF 23-MAY-1996; 96WO-US07509.
XX
PR 26-MAY-1995; 95US-0452052.
XX
PA (SUGE-) SUGEN INC.
XX
PI Peles E;
XX
DR WPI; 1997-021346/02.
XX
PT Screening for cpds. which alter neural cell characteristics - by
PT identifying cpds. which modulate the contactin mediated
PT receptor-type tyrosine-beta carbonic anhydrase domain effects on
PT neuronal cells
XX
PS Disclosure; Page 49-52; 73pp; English.
XX
CC Receptor type tyrosine phosphatase beta (RPTPbeta) is expressed in the
CC developing nervous system and it contains a carbonic anhydrase (CAH)
CC domain. The CAH domain of RPTPbeta is a functional ligand for contactin,
CC a GPI-membrane-anchored neuronal cell recognition molecule that
CC functions as a receptor on neurons. The present sequence is the rat
CC homologue of contactin which shares 95% identity at the amino
CC acid level with human contactin and 99% with mouse contactin. The
CC CAH domain of RPTPbeta induces cell adhesion and neurite growth of
CC primary tectal neurons, and differentiation of neuroblastoma cells.
CC A novel method for screening for compounds with the ability to alter the
CC effects of the RPTPbeta CAH domain on neuronal cells involves: growing
CC neural cells expressing contactin in the presence of a test compound and
CC the CAH domain of RPTPbeta; detecting a characteristic of the neural
CC cell selected from adhesion, outgrowth, extension, differentiation, and
CC survival. The compounds identified by the method can be used to treat
CC neurological diseases including those characterised by insufficient,
CC aberrant or excessive neurite growth, differentiation or survival.
CC They can be used to treat e.g. amyotrophic lateral sclerosis,
CC general ataxia, Parkinson's disease, Alzheimer's disease, Huntington's
CC disease, general neuropathy, cerebral palsy, neurological trauma or
CC mental retardation.
XX
SQ Sequence 1018 AA;



Query Match 8.8%; Score 600.5; DB 18; Length 1018; 

Best Local Similarity 23.7%; Pred. No. 7.5e-29;

Matches 255; Conservative 124; Mismatches 374; Indels 323; Gaps 39; 






Qy 4 LGFYHTH-THTHTYINFDKIPNASNLAPVIIEHPIDVVVSRGS---PATLNCGAKPSTAK 59

II : I : || |:| ||: : | :||| |: |

Db 19 1 [eftwhrryghgvseedk gfgpifeeqpintiypeeslegkvslncraraspfp 73

Qy 60 I 'WYK DGQPVITNKEQVNSHRIVLDTGSLFLLKVNSGKNGKDSDAGAYYCVASNEH 115

: II :! :|| I : |:| : I III 1 1 1 : 1 1 1 :

Db 74 v--ykwrmnngdvdltn drysmvggnlvi nnpdkqkdagiyyclasnny 120

Qy 116 GEVKSNEGSLKLAML REDFRVRPRTVQALGGEMAVLECSPPRGFPEPV-VSWRKDD 170

I hi I :| I II II I: I: II I II lh : I :
Db 121 gmvrsteatlsfgyldpfpped- --rpe-vkvkegkpvllcdppyhfpddlsyrwlnef 176

Qy 171 KELRIQDMPRYTLHSDGNLIIDPVDRSDSGTYQCVANNMVGERVSNPA-RLSVFEK — 225

I I: -III I Mill ||:|: III I

Db 177 pvfitmdkrrfvsqtngnlyianvessdrgnyscf vsspsitksvfskfipl 228

Qy 226 — PKFEQEP KDMTVDVGAAVLFDCRVTGDPQPQITWKRKNEPMPVTRAYI 273

I: H II: :| I :| hi I I I:: III I I I

Db 229 ipiperttkpypadivvqfkdiytinmgqnvtlecfalgnpvpdimkvlepmptt-aei 287

Qy 274 AKDNRGLRIERVQPSDEGEYVCYARNPAGTLEASAHLRVQAPPSFQTKPADQSVPAGGTA 333

: hi ■! Ill I I I I I : I : III I : II I
Db 288 stsgavlkifnfqledeglyeceaenlrgkdkhqariyvqafpevfvehindtevdlgsdl 347

Qy 334 TFECTLVGQPSPAYFWSKEGQQDLLFPSYVSADGRTKVSPTGTLTIEEVRQVDEGAYVCA 393

: I 1:1 I I I I I I : :| : I I I

Db 348 ywpcvatgkpiptirwlkngy ayhkgelrlydvtfenagmyqci 391



Qy 394 GMNSAGSSLSKAALK ATFETKGRVQKKKSKMGKQKQKNVQSIIKYLISAVTGNTP 448

I: I: : I II III I |:| :::! 
Db 392 aenaygtiyanaelkilalaptfe mnpmkkk ilaa 426 

Qy 449 AKPPPTIEHGHQNQTLMVGSSAILPCQASGKPTPGISWLRDGLPIDITDSRISQHSTGSL 508 

I I: I: I I II : I : III III 
Db 427 kggrviieckpkaapkpkfswsk-gtewlvnssriliwedgsl 468 

Qy 509 HIADLKKPDTGVYTCIAKNEDGESTWSASLTVEDHT SNAQFVRM 552 

I :: : I hill hi |:: : :|.: : | : I 

Db 469 einnitrndggiytcfaennrgkanstgtlvitnptriilapinaditvgenatjnqcaas 528 

Qy 553 PDPS NF 558

III II 
Db 529 fdpsldltfvwsfngyvidfnkeithihyqmfmldangellirnaqlkhagrytctaqt 588 

Qy 559 PSSPTQPIIVNVTDTEVELHWNAPSTSGAGPITGYIIQYYSPDL 602

II I :: II I h I : : lh I II I
Db 589 ivdnssasadlvvrgppgppgglriediratsvaltwsrgsdnhs-piskytiq-tktil 646

Qy 603 GQTWFNI- ■ -PDYVASTEYRIKG- -LKPSHSYMFVIRAENEKGIGTPSVSSALVTISKPA 657 

I : I : I I I I I : I I I I lh I : I I 
Db 647 sddwkdaktdppiiegnmesakavdlipwmeyefrvvatntlgtgepsipsnriktdgaa 706 

Qy 658 AQVALSD KNKMDMAIAEKRLTSEQLIKLE 686 

II II I :| I h I: 

Db 707 pnvapsdvgggggtnreltitwaplsreyhygnnfgyivafkpfdgeewkkvtvtnpdtg 766 

Qy 687 EVKTINSTAVR 697 

II ::|: : 

Db 767 ryvhkeMpstafqvkvkafnnkgdgpysliavinsaqdapseaptevgvkvlssseis 826 

Qy 698 LFWKKRKLEELIDGYYIKWRGPPRTNDNQYVNVTSPSTENYV—VSNLMPFTNYEFFVIP 755 

: I I II:::: I h: I : I III : I : Ihl I I I 
Db 827 vhw-khvlekivesyqiryaghdkeaaahrvqvts - - -qeysarlenllpdtqyf ievga 882 

Qy 756 YHSGVHSIHGAPSNSMDVLTAEAPPSLPPEDVRIRMLNLTTLR ISWKAPKADGI 809 

:| : I h :: I :llll II II ::::| hi I 
Db 883 ens— -agcgpssdvietftrkappsqpp-- -ri — issvrsgsryiitwdhwalsn 932 

Qy 810 NGILKGFQIVIVGQAPNNNRNITTNERAASVTLFHLVTGMTYKIRVAARSNGGVGV 865

: h:h :: : :|:: : I : I : I I |:|| II 
Db 933 estvtgykilyrpdgqhdgklfsthkhsievp---iprdgeyvvevrahsdggdgv 985 



# 



RESULT 11 
W42087 

ID W42087 standard; Protein; 1571 AA.
XX
AC W42087;
XX
DT 28-SEP-1998 (first entry)
XX
DE Human Down syndrome-cell adhesion molecule DS-CAM2.
XX
KW DS-CAM2; Down syndrome-cell adhesion molecule; neural cell;
KW signal transduction; trisomy 21; mental retardation;
KW holoprosencephaly; corpus callosum agenesis;
KW schizencephaly; diagnosis; assay; human.
XX
OS Homo sapiens.
XX
PN WO9817795-A1.
XX
PD 30-APR-1998.
XX
PF 23-OCT-1997;
XX
PR 25-OCT-1996; 96US-0029322.
XX
PA (CEDA-) CEDARS SINAI MEDICAL CENT.
XX
PI Korenberg JR;
XX
DR WPI; 1998-271791/24.
DR N-PSDB; V31988.
XX
PT New isolated Down's Syndrome-cell adhesion molecule - used to
PT develop products for detection, diagnosis and therapy of
PT developmental and neurological abnormalities
XX
PS Claim 2; Page 90-95; 109pp; English.
XX
CC This polypeptide comprises Down syndrome-cell adhesion molecule
CC DS-CAM2, an extracellular soluble protein belonging to a novel
CC subclass of the Ig superfamily with highest homology to neural cell
CC adhesion molecules. Its amino acid sequence was deduced from cDNA
CC clones (see V31982) isolated from a trisomy 21 foetal brain library.
CC It is a splice variant of membrane-bound DS-CAM1 (see W42086), and
CC lacks the entire transmembrane domain of DS-CAM1. The invention
CC provides human and murine DS-CAM nucleic acid sequences (see also
CC V31981, V31985-87), expression vectors and host cells, transgenic
CC animals, antibodies, antisense oligonucleotides, and primers
CC derived from DS-CAM nucleic acids. DS-CAM polypeptides are associated
CC with developmental and neurological processes. They can be used in
CC e.g. neural prosthetic devices used in entubulation methods of
CC repairing (regenerating) damaged or severed peripheral nerves, and
CC also in bioassays to identify agonists and antagonists. The products
CC can also be used in detection, diagnosis and therapy of developmental
CC and neurological abnormalities such as Down syndrome, mental
CC retardation, holoprosencephaly, agenesis of the corpus callosum,
CC or schizencephaly.
XX
SQ Sequence 1571 AA;



Query Match 8.7%; Score 600; DB 19; Length 1571;

Best Local Similarity 25.1%; Pred. No. 1.4e-28;

Matches 228; Conservative 129; Mismatches 373; Indels 180; Gaps 

Qy 16 YINFDKIPNASNLAPVIIEHPI DVWSRGSPATLNCGAKPS-TAKITWYKDG 66 

:: lh :|,': l"l : III I :| I I : III I 

Db 386 fvrkdkl-saqdyvqwledgtpkiisafsekwspaepvslmcnvkgtplptitwtldd 444 

Qy 67 QPVITNKEQVNSHRIVLDTGSLFLLKVNSGKNGKDSDAGAYYCVASNEHGEVKSNEGSLK 126 

h: 'I I : I I ::| : I I I I hi I I 
Db 445 dpilkggshrisqmitsegnvlsylniss---sqvrdggvyrctannsagvv 493 
Qy 127 LAMLREDFR VRP-RTVQALGGEMAVLECSPPRGFPEPVVSWRKDDKELRIQDMPRY 181
 1 1:1 :!l : : 1: : | |:| : | |: 1 :
Db 494 lyqarinvrgpasirpmknitaiagrdtyihcr-vigypyysikwyknsnllpfnhr-qv 551

Qy 182 TLHSDGNLIIDPVDRS-DSGTYQCVANNMVGERVSN PARLSVFEKPKFEQ 230
 ::| 1 : 1 : 1 II 1! :| ::| 1 : II |:|
Db 552 afenngtlklsdvqkevdegeytc-nvlvqpqlstsqsvhvtvkvppf iqpfefprf - ■ 607

Qy 231 EPKDMTVDVGAAVLFDC-RVTGDPQPQITWKRKNEPMPVTRAYIAKDN RGLRIERV 285
 :| 1 1 1:11 III:: 1:1 : : II III :
Db 608 sigqrvfipcwvsgdlpititwqkdgrpipgslg-vtidnidftsslrisnl 659

Qy 286 QPSDEGEYVClfARNPAGTLEASAHLRVQAPPSFQTKPADQSVPAGGTATFECTLVGQPSP 345
 1 1 1 III 1 :l : 1 |: III :| II 1 hill
Db 660 slmhngnytciarneaaavehqsqlivrvppkfvvqprdqdgiygkavilncsaegypvp 719

Qy 346 AYFW- • SKEGQQDLLFPSYVSADGRTKVSPTGTLTIEEVROVDEGAYVCAGMNSAGSSLS 403
 1 II 1 :: :M :| hi I: 1 : 1 1 hi 1 h :|
Db 720 tiwkfskgagvpqfqp- -ialngriqvlsngsllikhweedsgyylckvsndvgadvs 777

Qy 404 KAALKATFETKGRVQKKKSKMGKQKQKNVQSIIKYLISAVTGNTPAKPPPTIEHGHQNQT 463
 h II : 1 : : 1 1
Db 778 ks myltvki pamitsypntt 797

Qy 464 LMV-GSSAILPCQASGKPTPGISWLRDGLPID ITDSRISQHSTGSLHIADLKK 515
 II : 1 1 h : 1 :: h : : =11 :
Db 798 latqgqkkemsctahgekpiivrwekedriinpemarylvstkevgeevistlqilptvr 857

Qy 516 PDTGVYTCIAKNEDGESTWSASLTVEDHTSNAQFVRMPDPSNFPSSPTQPUVNVTDTEV 575
 1:1 ::M l^ll III:: | | : | :| :

Qy 576 ELHWNAPSTSGAGPITGYIIQYYSPDLGQTWFNI PDYVASTEYRIKGLKPSHS 628
 1 1 1 lllll h : :| : I ::| | : II :
Db 902 tlrwtm-gfdgnspitgydie-cknksdswdsaqrtkdvspqlnsat— iidihpsst 955

Qy 629 YMFVIRAENEKGIGTPSVSSALVTTSKPAAQVALSDKNKMDMAIAEKRLTSEQLIKLEEV 688
 1 : hi ll 1: 1 1: II I : II
Db 956 ysirmyaknr--i&ksepsneltitadeaapdg ppqevhle-- 994

Qy 689 KTINSTAVRLFWKpKRKLEE-LIDGYYIKWRGPPRTNDNQYVNV TSPSTENYWS 741
 hi ::h Mi h h :l II 1 :| II h II :| 1 :
Db 995 -pissqsirvtwkapkkhlqngiirgyqigyr-eystggnfqfnilsvdtsgdsevytld 1052

Qy 742 NLMPFTNYEFFVIEYHSGVHSIHGAPSNSMDVLTA-EAPPSLPPEDVRIRMLNLITLRI 799
 II III 1 f : | :| :::| I II |||:|: : :: 1
Db 1053 nlnkftqyglwqacnra gtgpssqeiitttledvpsyppenvqaiatspesisi 1107

Qy 800 SWKAPKADG ING ILKGFQ IVIVGQAPNNN — RNITTNERAASVTLFHLVTGMT YKIRV 855
 II : : 1 1 1 T : 1 1 : : : : :|||l : h 1 1 1 hi
Db 1108 swstlskealngilqgfrviywanlmdgelgeiknitttq-psleldglekytnysiqv 1165

Qy 856 AARSNGGVGV 865
 1:111
Db 1166 laftragdgv 1175


RESULT 12
W42086
ID W42086 standard; Protein; 1910 AA.
XX
AC W42086;
XX
DT 28-SEP-1998 (first entry)
XX
DE Human Down syndrome-cell adhesion molecule DS-CAM1.
XX
KW DS-CAM1; Down syndrome-cell adhesion molecule; neural cell;
KW signal transduction; trisomy 21; mental retardation;
KW holoprosencephaly; corpus callosum agenesis;
KW schizencephaly; diagnosis; assay; human.
XX
OS Homo sapiens.
XX
FH Key Location/Qualifiers
FT Peptide 1..23
FT /label= Sig_peptide
FT Protein 24..1910
FT /label= Mat_protein
FT Domain 24..887
FT /label= IG
FT /note= "immunoglobulin type-C2 domain"
FT Domain 888..1594
FT /label= FbN
FT /note= "fibronectin type III domain"
FT Domain 1595..1616
FT /label= Transmembrane
FT Domain 1617..1910
FT /label= Cytoplasmic
FT Region 24..126
FT /label= Ig1
FT Region 127..225
FT /label= Ig2
FT Region 226..316
FT /label= Ig3
FT Region 317..409
FT /label= Ig4
FT Region 410..506
FT /label= Ig5
FT Region 507..603
FT /label= Ig6
FT Region 604..697
FT /label= Ig7
FT Region 698..792
FT /label= Ig8
FT Region 793..887
FT /label= Ig9
FT Disulfide-bond 46..102
FT Disulfide-bond 145..197
FT Disulfide-bond 246..293
FT Disulfide-bond 335..385
FT Disulfide-bond 428..484
FT Disulfide-bond 525..575
FT Disulfide-bond 617..669
FT Disulfide-bond 711..766
FT Disulfide-bond 809..865
FT Disulfide-bond 1307..1359
FT Modified-site 78..80
FT /note= "Asn is N-glycosylated"
FT Modified-site 106..108
FT /note= "Asn is N-glycosylated"
FT Modified-site 470..472
FT /note= "Asn is N-glycosylated"
FT Modified-site 487..489
FT /note= "Asn is N-glycosylated"
FT Modified-site 658..660
FT /note= "Asn is N-glycosylated"
FT Modified-site 666..668
FT /note= "Asn is N-glycosylated"
FT Modified-site 710..712
FT /note= "Asn is N-glycosylated"
FT Modified-site 748..750
FT /note= "Asn is N-glycosylated"
FT Modified-site 795..797
FT /note= "Asn is N-glycosylated"
FT Modified-site 924..926
FT /note= "Asn is N-glycosylated"
FT Modified-site 1142..1144
FT /note= "Asn is N-glycosylated"
FT Modified-site 1160..1162
FT /note= "Asn is N-glycosylated"
FT Modified-site 1250..1252
FT /note= "Asn is N-glycosylated"
FT Modified-site 1271..1273
FT /note= "Asn is N-glycosylated"


Mon Jan 22 13:04:31 2001 us-09-540-245a-17.rag Page 13 



FT Modified-site 1324..1326
FT /note= "Asn is N-glycosylated"
FT Modified-site 1341..1343
FT /note= "Asn is N-glycosylated"
FT Modified-site 1488..1490
FT /note= "Asn is N-glycosylated"
XX
PN WO9817795-A1.
XX
PF 23-OCT-1997; 97WO-US19547.
PR 25-OCT-1996; 96US-0029322.
XX
PA (CEDA-) CEDARS SINAI MEDICAL CENT.
XX
PI Korenberg JR;
XX
DR WPI; 1998-271791/24.
DR N-PSDB; V31981.
XX
PT New isolated Down's Syndrome-cell adhesion molecule - used to
PT develop products for detection, diagnosis and therapy of
PT developmental and neurological abnormalities
XX
PS Claim 2; Page 73-78; 109pp; English.
XX
CC This polypeptide comprises Down syndrome-cell adhesion molecule
CC DS-CAM1, a cell surface glycoprotein belonging to a novel subclass
CC of the Ig superfamily with highest homology to neural cell adhesion
CC molecules. Its amino acid sequence was deduced from cDNA clones
CC (see V31981) isolated from a trisomy 21 foetal brain library. A
CC splice variant, DS-CAM2 (see W42087), which is non-membrane bound
CC was also identified. The invention also provides human and murine
CC DS-CAM nucleic acid sequences (see also V31985-88), expression
CC vectors and host cells, transgenic animals, antibodies, antisense
CC oligonucleotides, and primers derived from DS-CAM nucleic acid.
CC DS-CAM polypeptides are associated with developmental and
CC neurological processes. They can be used in e.g. neural prosthetic
CC devices used in entubulation methods of repairing (regenerating)
CC damaged or severed peripheral nerves, and also in bioassays to
CC identify agonists and antagonists. The products can also be
CC used in detection, diagnosis and therapy of developmental and
CC neurological abnormalities such as Down syndrome, mental
CC retardation, holoprosencephaly, agenesis of the corpus callosum,
CC or schizencephaly.



SQ Sequence 1910 AA;

Query Match 8.7%; Score 598; DB 19; Length 1910;
Best Local Similarity 25.2%; Pred. No. 2.5e-28;
Matches 230; Conservative 131; Mismatches 365; Indels 188; Gaps 41;

Qy 16 YINFDKIPNASNLAPVIIEHPI DYWSRGSPATLNCGAKPS -TAKITWYKDG 66
 :: II: :| : |::| : III MM: I !
Db 386 fvrkdkl-saqdyvqwledgtpkiisafsekvvspaepvslmcnvkgtplptitwtldd 444

Qy 67 QPVITNKEQVNSHRI- ■ -VLDTGSLF-LLKVNSGKNGKDSDAGAYYCVASNEHGEVKSNE 122
 1:: Mil : 1:: Ml : 1 1 1 1 hi II
Db 445 dpilkg--"gshrisqmitsegnvvsylniss---sqvrdggvyrctannsagw---- 493

Qy 123 GSLKLAMLREDFR— -VRP-RTVQALGGEMAVLECSPPRGFPEPWSWRRDDKELRIQD 177
 11:1 :|| : : 1: 1 : 1 |:| : ||: I
Db 494 -■--lyqarinvrgpasirpmknitaiagrdtyihcr-vigypyysikwyknsnllpfnh 548

Qy 178 MPRYTLHSDGNLJ IDPVDRS - DSGTYQCVANNMVGERVSN PARLSVFEKP 226
 : ::| f: I : I I I I I :| ::| MM
Db 549 r-qvafenngtlklsdvqkevdegeytc--nvlvqpqlstsqsvhvtvkvppfiqpfefp 605

Qy 227 KFEQEPKDMTVDVGAAVLFDC-RVTGDPQPQITWKRKNEPMPVTRAYIAKDN- - - -RGLR 281
 : I I 1 : 1 1
Db 606 ■-sigqrvfipcvvvsgdlpititwqkdgrpipgslg-vtidnidftsslr 655

Qy 282
Db 656

Qy 342
Db 716

Qy 400
Db 774 - -myltvki - - --pamitsy 793

Qy 460 ■-ITDSRISQHSTGSLHIA 511
Db 794

Qy 512
Db 854 ■-ppdppeieikdvk 897

Qy 572 ■-PDYVASTEYRIKGLK 624
Db 898

Qy 625
Db 952 ---ppqevh 992

Qy 685 --TSPSTEN 737
Db 993

Qy 738
Db 1049 ■-gtgpssqeiitttledvpsyppenvqaiatspe 1103

Qy 796
Db 1104

Qy 852
Db 1162


RESULT 13
R63759
ID R63759 standard; Protein; 1018 AA.
XX
AC R63759;
XX
DT 10-MAY-1995 (first entry)
XX
DE Human contactin (EMBL Accession IZ21488).
XX
KW Human contactin; human brain glycoprotein; neural cell adhesion;
KW mouse F3, chicken contactin/F11 adhesion molecules.
XX
FH Key Location/Qualifiers
FT Disulfide-bond ..94
FT Disulfide-bond 8..191
FT Disulfide-bond 3..290
FT Disulfide-bond 2..371
FT Disulfide-bond 6..464
FT Disulfide-bond 106..563
FT Domain 104..657
FT /label= FLR
FT /note= "Conserved core of fibronectin type III
FT like repeat"





FT Domain 707..760
FT /label= FLR
FT /note= "Conserved core of fibronectin type III
FT like repeat"
FT Domain 809..857
FT /label= FLR
FT /note= "Conserved core of fibronectin type III
FT like repeat"
FT Domain 905..952
FT /label= FLR
FT /note= "Conserved core of fibronectin type III
FT like repeat"
FT Modified-site 188
FT /label= ASN_glycos
FT /note= "Potential site for ASN-linked
FT glycosylation"
FT Modified-site 238
FT /label= ASN_glycos
FT /note= "Potential site for ASN-linked
FT glycosylation"
FT Modified-site 318
FT /label= ASN_glycos
FT /note= "Potential site for ASN-linked
FT glycosylation"
FT Modified-site 437
FT /label= ASN_glycos
FT /note= "Potential site for ASN-linked
FT glycosylation"
FT Modified-site 453
FT /label= ASN_glycos
FT /note= "Potential site for ASN-linked
FT glycosylation"
FT Modified-site 474
FT /label= ASN_glycos
FT /note= "Potential site for ASN-linked
FT glycosylation"
FT Modified-site 501
FT /label= ASN_glycos
FT /note= "Potential site for ASN-linked
FT glycosylation"
FT Modified-site 571
FT /label= ASN_glycos
FT /note= "Potential site for ASN-linked
FT glycosylation"
FT Modified-site 913
FT /label= ASN_glycos
FT /note= "Potential site for ASN-linked
FT glycosylation"
FT Peptide 1..20
FT /label= Sig_peptide
XX
PN EP618293-A.
XX
PD 05-OCT-1994.
XX
PF 23-MAR-1994; 94EP-0104580.
PR 26-MAR-1993; 93US-0040741.
XX
PA (BECT ) BECTON DICKINSON CO.
XX
PI Hemperly JJ, Reid RA;
XX
DR WPI; 1994-304462/38.
DR N-PSDB; Q74440.
XX
PT Human contactin glycoprotein - homologous to mouse F3 and
PT chicken contactin/F11 adhesion molecules
XX
PS Claim 2; Page 15; 34pp; English.
XX
CC Q74440 is the human brain glycoprotein contactin (HBGC) cDNA, which
CC encodes the amino acid sequence described in R63759. This protein
CC is homologous to mouse F3 and chicken contactin/F11 adhesion
CC molecules; all three are involved in neural cell adhesion. An
CC antibody to HBGC can be used in an immunoassay to detect HBGC in
CC samples, and the cDNA can be used as a probe to detect a sequence
CC encoding HBGC in samples.
XX
SQ Sequence 1018 AA;



Query Match 8.6%; Score 593; DB 15; Length 1018; 

Best Local Similarity 23.5%; Pred. No. 2.2e-28;

Matches 246; Conservative 126; Mismatches 361; Indels 316; Gaps 36; 
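
The two percentages reported for each hit appear to follow directly from the other numbers on these lines. The sketch below is an illustration, not GenCore's own code: it assumes "Query Match" is the hit score divided by the query's perfect score, and "Best Local Similarity" is the number of matches divided by the total number of alignment columns (matches + conservative + mismatches + indels). The perfect score of 6860 is taken from the run header for query US-09-540-245A-17 that appears later in this listing; that the same value applies to this hit is an assumption. The function names are hypothetical.

```python
def query_match_percent(score: float, perfect_score: float) -> float:
    """Hit score as a percentage of the query's self-alignment score."""
    return round(100.0 * score / perfect_score, 1)


def best_local_similarity(matches: int, conservative: int,
                          mismatches: int, indels: int) -> float:
    """Identical columns as a percentage of all alignment columns."""
    columns = matches + conservative + mismatches + indels
    return round(100.0 * matches / columns, 1)


if __name__ == "__main__":
    # Figures for RESULT 13 (R63759) as printed above.
    print(query_match_percent(593, 6860))
    print(best_local_similarity(246, 126, 361, 316))
```

Under these assumptions the same two functions also reproduce the percentages printed for the neighbouring hits (for example Score 600 and Score 583.5).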

Qy 30 PVIIEHPIDWVSRGS— PATLNCGAKPSTAKI-TWYKDGQPVITNKEQVNSHRIVLDT 85 

I: I II:!- I :IM hi :" I : I : I I : 
Db 41 pifeeqpintiypeeslegkvslncraraspfpvykwrmnngdv dltsdrysmvg 95 



Qy 86 GSLFLLKVNSGKNGKDSDAGAYYCVASNEHGEVKSNEGSLKLAML — 

I : I : •: I III 111:111 :| MM I I : ||: 
Db 96 gnlvi nnpdkqkdagiyyclasnnygmvrsteatlsfgyldpfppeerpevrvke 150 

Qy 139 RTVQALGGEMAVLECSPPRGFPEPV-VSWRKDDKELRI-QDMPRYTLHSDGNLIIDPVDR 196 

I: II I II II: : I :: : I I |: ::||| I |: 
Db 151 gkgmvllcdppyhfpddlsyrwllnefpvfitmdkrrfvsqtngnlyianvea 203 

Qy 197 SDSGTYQCVAHNMVGERVSNPA-RLSVFEK PKFEQEP KDMTVDV 239 

II I I I ; 1 1 : 1 : III I I: :| lh : 

Db 204 sdkgnyscf-- vsspsitksvfskfiplipiperttkpypadiwqfkdvyalm 255 

Qy 240 GAAVLFDCRVTGDPQPQITWKRKNEPMPVTRAYIAKDNRGLRIERVQPSDEGEYVCYARN 299 

I I :| hi I I h: MM I I h hi M III I I I I 
Db 256 gqnvtlecfalgnpvpdirwrkvlepmpst-aeistsgavlkifniqledegiyeceaen 314 

Qy 300 PAGTLEASAHLRVQAPPSFQTKPADQSVPAGGTATFECTLVGQPSPAYFWSKEGQQDLLF 359 

I : I : III I : III : I hi I I I I 
Db 315 irgkdkhqariyvqafpewvehindtevdigsdlyvpcvatgkpiptirwlkngy 369 

Qy 360 PSYVSADGRTKVSPTGTLTIEEVRQMGAYVCAGMNSAGSSLSKMLK ATFETK 414 

; I I : :| : I I I |: |: : | M Ml 
Db 370 ayhkgelrlydvtfenagmyqciaentygaiyanaelkilalaptfe-- 416 

Qy 415 GRVQKKKSKMGKQKQKNVQSIIKYLISAVTGNTPAKPPPTIEHGHQNQTLMVGSSAILPC 474 

I. hi :::l I h I 

Db 417 mnpmkkk ilaa kggrviiec 436 

Qy 475 QASGKPTPGISWLRDGLPIDITDSRISQHSTGSLHIADLKKPDTGVYTCIAKNEDGESTW 534 

: I I ' II : I : Ml Ml I :: : I hill hi M 
Db 437 kpkaapkpkfswsk-gtewlvnssriliwedgsleinnitrndggiytcfaennrgkans 495 

Qy 535 SASLTVEDHT SNAQFVRMPDPS 556 

: : I : t I - : I lh 

Db 496 tgtlvitdptriilapinaditvgenatmqcaasfdpaldltfvwsfngyvidfnkenih 555 

Qy 557 --NF PSSPTQPIIVNV 570 

II : III:: 
Db 556 yqrnfmldsngellirnaqlkhagrytctaqtivdnssasadlvvrgppgppgglriedi 615 

Qy 571 TDTEVELHWNAPSTSGAGPITG Y I IQ - - - YYS PDLGQTWFNIPDYVASTE • YRIKGLKPS 626 

I I I h I : : lh I II II : I : I I I I 
Db 616 ratsvaltwsrgsdnhs-piskytiqtktilsddwkdaktdppiiegnmeaaravdlipw 674 

Qy 627 HS YMFVI RAENEKG IGT PSVSSALVT T SK PAAQVALSD 664 

I I : I I I Mh I : I Mill 
Db 675 meyefrwatntlgrgepsipsnriktdgaapnvapsdvgggggrnreltitwaplsrey 734 

Qy 665 "KNKMDMAIAEKRLTSEQLIKLE 686 

I Ml: |: 
Db 735 hygnnfgyivafkpfdgeewkkvtvtnpdtgryvhkdetmspstafqvkvkafnnkgdgp 794 



Qy 687 EVKTINSTAVRLFWKKRRLEELIDGYYIK-WRGPPRTND 724
 II ::h : : I : l|:::: MM :
Db 795 ivgvkvlssseisvhwehvlekivesyqirywaahdkeea 853






Qy 725 NQYVNVTSPSTENYV--VSNLMPFTNYEFFVIPYHSGVHSIHGAPSNSMDVLTAEAPPSL 782
 1 III : 1 li:| II 1 :l : 1 II: :: 1 :||||
Db 854 anrvqvts---qeysarlenllpdtqyfievgacns---agcgppsdmieaftkkappsq 907

Qy 783 PPEDVRIRMLNLTTLR------ISWKAPKADGINGILKGFQIVIVGQAPNNNRNITTNER 836
 II II ::::! 1:1 | : |:::: :: : :|::
Db 908 pp---ri----issvrsgsryiitwdhvvalsnestvtgykvlyrpdgqhdgklysthkh 960

Qy 837 AASVTLFHLVTGMTYKIRVAARSNGGVGV 865
 : 1 : 1 : 1 1 M ||
Db 961 sievp---iprdgeywevrahsdggdgv 986


RESULT 14
R87028
ID R87028 standard; Protein; 1018 AA.
XX
AC R87028;
XX
DT 23-APR-1996 (first entry)
XX
DE Human contactin.
XX
KW Contactin; neuron cell adhesion molecule;
KW neurite outgrowth promoter; neuron regeneration promoter; treatment;
KW gene therapy.
XX
OS Homo sapiens.
XX
FH Key Location/Qualifiers
FT Peptide 1..20
FT /note= "contactin signal sequence"
FT Protein 32..1018
FT /note= "contactin-2"
FT Protein 21..1001
FT /note= "contactin-1;
FT contactin-2 lacks residues 21 to 31"
FT Protein 21..992
FT /note= "soluble contactin-1"
FT Protein 32..1001
FT /note= "contactin-2"
FT Protein 32..992
FT /note= "soluble contactin-2"
FT Domain 1..20
FT /note= "hydrophobic domain"
FT Domain 1002..1018
FT /note= "hydrophobic domain"
FT Modified-site 208
FT /note= "putative N-linked glycosylation site"
FT Modified-site 258
FT /note= "putative N-linked glycosylation site"
FT Modified-site 338
FT /note= "putative N-linked glycosylation site"
FT Modified-site 457
FT /note= "putative N-linked glycosylation site"
FT Modified-site 473
FT /note= "putative N-linked glycosylation site"
FT Modified-site 494
FT /note= "putative N-linked glycosylation site"
FT Modified-site 521
FT /note= "putative N-linked glycosylation site"
FT Modified-site 591
FT /note= "putative N-linked glycosylation site"
FT Modified-site 630
FT /note= "putative N-linked glycosylation site"
FT Modified-site 933
FT /note= "putative N-linked glycosylation site"
FT Misc-difference 65
FT /note= "conserved Cys in Ig domain"
FT Misc-difference 114
FT /note= "conserved Cys in Ig domain"
FT Misc-difference 158
FT /note= "conserved Cys in Ig domain"
FT Misc-difference 211
FT /note= "conserved Cys in Ig domain"
FT Misc-difference [position illegible]
FT /note= "conserved Cys in Ig domain"
FT Misc-difference [position illegible]
FT /note= "conserved Cys in Ig domain"
FT Misc-difference 352
FT /note= "conserved Cys in Ig domain"
FT Misc-difference 392
FT /note= "conserved Cys in Ig domain"
FT Misc-difference 436
FT /note= "conserved Cys in Ig domain"
FT Misc-difference 484
FT /note= "conserved Cys in Ig domain"
FT Misc-difference 526
FT /note= "conserved Cys in Ig domain"
FT Misc-difference 583
FT /note= "conserved Cys in Ig domain"
FT Misc-difference 624
FT /note= "conserved Trp in FNIII domain"
FT Misc-difference 727
FT /note= "conserved Trp in FNIII domain"
FT Misc-difference 829
FT /note= "conserved Trp in FNIII domain"
FT Misc-difference 925
FT /note= "conserved Trp in FNIII domain"
FT Misc-difference 677
FT /note= "conserved Tyr in FNIII domain"
FT Misc-difference 877
FT Misc-difference 972
FT /note= "conserved Tyr in Ig domain"
FT Misc-difference 780
FT /note= "conserved Phe in Ig domain"
FT Misc-difference 993
FT /note= "anchor attachment site"
FT Region 602..609
FT /note= "possible hinge"
FT Region 993..1018
FT /note= "GPI-anchor"
XX
PN WO9535373-A2.
XX
PD 28-DEC-1995.
XX
PF 09-JUN-1995; 95WO-US07408.
XX
PR 10-JUN-1994; 94US-0258022.
XX
PA (LJOL-) LA JOLLA CANCER RES FOUND.
XX
PI Berglund EO, Ranscht B;
XX
DR WPI; 1996-058410/06.
DR N-PSDB; T07313.
XX
PT Recombinant human contactin, opt. in sol. form - promotes neurite
PT outgrowth in culture and promotes neuron regeneration in e.g.
PT damaged spinal cord, retina, cerebellum and cerebral cortex.
XX
PS Claim 3; Fig 1A-C; 48pp; English.
XX
CC Human contactin can be used in vitro to promote neurite extension
CC in culture, and in vivo to promote neuron regeneration of the
CC spinal cord, retina, cerebellum and cerebral cortex. Transformed
CC neuronal precursor cells and glial cells or fibroblasts can also
CC promote neuron regeneration when administered to a site of damage.
XX
SQ Sequence 1018 AA;

Query Match 8.6%; Score 593; DB 17; Length 1018;
Best Local Similarity 23.5%; Pred. No. 2.2e-28;

Matches 246; Conservative 126; Mismatches 361; Indels 316; Gaps 36; 

Qy 30 PVIIEHPIDWVSRGS— PATLNCGAKPSTAKI-TWYRDGQPVITNKEQVNSHRIVLDT 85

I: III:: I :lll I: I : I : I : I I : 
Db 41 pifeeqpintiypeeslegkvslncraraspfpvykwrmnngdv dltsdrysmvg 95 

Qy 86 GSLFLLWNSGKNGKDSDAGAYYCVASNEHGEVRSNEGSLRLAML REDFRVRP 138 

1:1 : I III llhlll :| hi I :| I I : II: 

Db 96 gnlvi nnpdkqkdagiyyclasnnygmvrsteatlsfgyldpfppeerpevrvke 150 

Qy 139 RTVQALGGEMAVLECSPPRGFPEPV-VSWRKDDKELRI-QDMPRYTLHSDGNLIIDPVDR 196 

I: II I II II: : I :: : I I |: Mil I |: 
Db 151 gkgmvllcdppyhfpddlsyrwllnefpvfitmdkrrfvsqtngnlyianvea 203 

Qy 197 SDSGT YQCVANNMVGERVSNPA - RLSVFEK PRFEQEP KDMTVDV 239 

II I II 11:1: HI | |: :| i|: : 

Db 204 sdkgnyscf vsspsitksvfskfiplipiperttkpypadiwqfkdvyalm 255 

Qy 240 GAAVLFDCRVTGDPQPQITWRRKNEPMPVTRAYIAKDNRGLRIERVQPSDEGEYVCYARN 299

(II: 1:1 I I I:: Mil I I |: |:| :| III I I I I 
Db 256 gqnvtlecfalgnpvpdirwrkvlepmpst-aeistsgavlkifniqledegiyeceaen 314

Qy 300 PAGTLEASAHLRVQAPPSFQTRPADQSVPAGGTATFECTLVGQPSPAYFWSKEGQQDLLF 359

I : I : III I : III : I |:|| III 
Db 315 irgkdkhqarivvqafpewvehindtevdigsdlywpcvatgkpiptirwlkngy 369 

Qy 360 PSYVSADGRTRVSPTGTLTIEEVRQVDEGAYVCAGMNSAGSSLSKAALK— --ATFETK 414 

II:: fill I: I: : I II III 
Db 370 ayhkgelrlydvtfenagmyqciaentygaiyanaelkilalaptfe-- 416 

Qy 415 GRVQKKKSRMGKQRQRNVQSIIKYLIBAVTGNTPAKPPPTIEHGHQNQTLMVGSSAILPC 474 

I 1:1 :: I I I: I 

Db 417 mnpmkkk ilaa kggrviiec 436 

Qy 475 QASGRPTPGISWLRDGLPIDITDSRISQHSTGSLHIADLRRPDTGVYTCIARNEDGESTW 534 

: 1111:1 : III III I :: : I Mil |:| I:: 
Db 437 kpkaapkpkfswsk-gtewlvnssriliwedgsleinnitrndggiytcfaennrgkans 495

Qy 535 SASLTVEDHT-" SNAQFVRMPDPS 556 

: :| : I I I II: 

Db 496 tgtlvitdptriilapinaditvgenatmqcaasfdpaldltfwsfngyvidfnkenih 555 

Qy 557 — NF ■ -\ PSSPTQPIIVNV 570 

II ' j III:: 

Db 556 yqrnfmldsngellirnaqlkhagrytctaqtivdnssasadlwrgppgppgglriedi 615 

Qy 571 TDTEVELHWNAPSTSGAGPITGYIIQ— YYSPDLGQTWFNIPDYVASTE-YRIKGLRPS 626 

I I I I: I :: II: I II II : I Mil 

Db 616 ratsvaltwsrgsdnhs-piskytiqtktilsddwkdaktdppiiegnmeaaravdlipw 674

Qy 627 HSYMFVIRAENEKGIGTPSVSSALVTTSKPAAQVALSD 664

II : I I I Ml: I : I I II II 

Db 675 meyefrwatntlgrgepsipsnriktdgaapnvapsdvgggggrnreltitvaplsrey 734 

Qy 665 ■ • KNKMDMAIAEKRLTSEQLIRLE .* 686 

I :| I I: I: 
Db 735 hygnnfgyivafkpfdgeewkkvtvtnpdtgryvhkdetmspstafqvkvkafnnkgdgp 794 

Qy 687 EVKT INSTAVRLFWKRRRLEELIDGYYIR - WRGPPRTND 724 

II ::h : : I : II":: I I: I : 
Db 795 yslvavinsaqdapseaptevgvkvlssseisvhw-ehvlekivesyqirywaahdkeea 853 

Qy 725 NQYVNVTSPSTENYV--VSNLMPFTNYEFFVIPYHSGVHSIHGAPSNSMDVLTAEAPPSL 782 

I III : I : Ml II I :| : I II: :: I MM 
Db 854 anrvqvts - - -qeysarlenllpdtqyf ievgacns - - -agcgppsdmieaf tkkappsq 907 

Qy 783 PPEDVRIRMLNLTTLR ISWKAPKADGINGILRGFQIVIVGQAPNNNRNITTNER 836 

II II ::::! |:| I : |:::: :: : :|:: 
Db 908 pp---ri — issvrsgsryiitwdhvvalsnestvtgykvlyrpdgqhdgklysthkh 960 

Qy 837 AASVTLFHLVTGMTYRIRVAARSNGGVGV 865 

: I : MM Ml II 



Db 961 sievp---iprdgeywevrahsdggdgv 986



RESULT 15
W74152
ID W74152 standard; Protein; 1257 AA.
XX
AC W74152;
XX
DT 05-MAY-1999 (first entry)
XX
DE Human L1 cell adhesion molecule.
XX
KW Human L1 cell adhesion molecule; L1CAM; neurite growth;
KW nervous system development; nerve regeneration;
KW neuronal cell cohesive interaction.
XX
PF 18-NOV-1994; 94US-0341843.
XX
PR 26-JUN-1992; 92US-0904991.
PR 18-NOV-1994; 94US-0341843.
XX
PA (UYCA-) UNIV CASE WESTERN RESERVE.
XX
DR WPI; 1999-166719/14.
DR N-PSDB; X01598.
XX
PT Human L1 cell adhesion molecule - supports neurite outgrowth and is
PT involved in nervous system development and repair
XX
PS Claim 1; Fig 3; 45pp; English.
XX
CC This sequence is the human L1 cell adhesion molecule (L1CAM) of the
CC invention. L1CAM supports growth of neurites in vitro and is involved in
CC development of the human nervous system and in nerve regeneration. It is
CC useful in in vivo and in vitro experiments on nerve growth and
CC regeneration. L1CAM mediates cohesive interactions of neuronal cells to
CC each other and to extracellular matrix.
XX
SQ Sequence 1257 AA;

Query Match 8.5%; Score 583.5; DB 20; Length 1257;
Best Local Similarity 22.9%; Pred. No. 1.1e-27;
Matches 243; Conservative 141; Mismatches 357; Indels 321; Gaps 42;

Qy 30 PVI I EH- P IDVWSRGSP ATLNCGA- -RPSTARITWYKDG- • -QPVITNREQV N 77 

III MM : I I I II : I M :| M : 
Db 35 pviteqsprrlwfptddislkceasgkpe-vqfrwtrdgvhfkp — keelgvtvyqs 89 

Qy 78 SHRIVLDTGSLFLLKVNSGKNGKDSDAGAYYCVASNEHGEVKSNEGSLKLAMLREDFRVR 137 

I : 1 1 : II I III III: I |:| : :: I 
Db 90 ph sgsf titgnns - -nf aqrf qgiyrcf asnklgtamshe — irlmaegapkw 138 



Qy 138 P RTVQALGGEMAVLECSPPRGFPEPWSWRRDDKELRIQDMPRYTLHSDGNLIIDP 19: 

I : I: II II M ||: : : I I |: I |: M 
Db 139 pketvkpveveegesvvlpcnppps-aeplriywmnskilhikqdervtmgqngnlyfan 19' 

Qy 194 VDRSDS-GTYQCVANNMVGERVSNPARLSVFEKPKFEQEPRDMTVDV 23< 

I II: Ml: I :: :l II I: I 
Db 198 vltsdnhsdyichah fpgtrtiiqk epidlrvkatnsmidrkprllf 24 

Qy 240 --GAAVLFDCRVTGDPQPQITWRRRNEPMPVTRAYIAKDNRGLRIERVQP 281 

■ I :: :l M M II : III I IM:: :| 
Db 245 ptnssshlvalqgqplvleciaegfptptikwlrpsgpmpadrvtyqnhnktlqllkvge 30' 
Qy 288 SDEGEYVCYARNPAGTLEASAHLRVQAPPSFQTKPADQSVPAGGTATFECTLVGQPSPAY 347

hill MM: : :: 1:1 I : II II :| : |:| 

Db 305 eddgeyrclaenslgsarhayyvtveaapywlhkpqshlygpgetarldcqvqgrpqpev 364 

Qy 348 FWSKEGQQDLLFPSYVSADGRTKVSPTGTLTIEEVRQVDEGAYVCAGMNSAGSSLSKAAL 407 

II : :: I : :: I I : |: I I I I |: I : 
Db 365 tvring — ipveelakdqkyriq-rgalilsnvqpsdtmvtqcearnrhglllanayi 419 

Qy 408 RATFETKGRVQKRKSKMGKQKQKNVQSIIKYLISAVTGNTPAKPPPTIEHGHQNQTLMV- 466 

I:: III III I 

Db 420 yw qlpakilta dnqtymav 439 

Qy 467 -GSSAILPCQASGRPIPGISWL-RDGLPIDITDSRISQHSTGSLHIADLKKPDTGVYICI 524 

1:1 I 1:1 I I I : II II : : I I |:| I II: III I |: 
Db 440 qgstayllckafgapvpsvqwldedgttv-lqderffpyangtlgirdlqandtgryfcl 498 

Qy 525 AKNEDGESTWSASLTVEDHTSNAQFVRMP DPSKFPS 560

■ I I: I !:| Ml II III II 

Db 499 aandqnnvtimanlkvkdatqitqgprstielckgsrvtftcqasfdpslqpsitwrgdgr 558

Qy 561 SPTQPIIVN 569 

I I ::| 

Db 559 dlqelgdsdkyfiedgrlvihsldysdqgnyscvasteldwesraqllwgspgpvprl 618 

Qy 570 VTDTEVELHWNAPSTSGAGPITGYIIQYYSPDLG-QTWFN— IPDYVASTEY 618 

:| y.\ : I :|: II I |:: :: : |:: :| II 
Db 619 vlsdlhlltqsqvrvsv-spaedhnapiekydiefedkemapekwyslgkvpgnqtsttl 677 

Qy 619 RIKGLKPSHSYMFVIRAENEKGIGTPSVSSALVTTSKPAAQVALSDKNKMD 669 

: M I I : I I: I I II I M : I M :| 
Db 678 k- ■ -lspyvhytf rvtainkygpgepspvsetwtpe aapeknpvdvkgegnett 729 

Qy 670 -MAIAEKRLT SEQLI 683 

I I II II:: 
Db 730 nmvitwkplrradwnapqvqyrvqwrpqgtrgpwqeqivsdpflvvsntstfvpyeikvq 789 

Qy 684 KLEEVKT INST AVRLFWRKRKLEEL ■ • • IDGYY 713 

:|| :: :||:|| : |: I :: : II 
Db 790 avnsqgkgpepqvtigysgedypqaipelegieilnssavlvkwrpvdlaqvkghlrgyn 849 

Qy 714 IK-WR-GPPRTNDNQYVN — VTSPSTENYVVSNLMPFTNYEFFVIPYHSGVHSIHGAP 767 

■ : II I I : :::: I :| : ::| I I::: II I : :| 
Db 850 vtywregsqrkhskrhihkdhwvpanttsvilsglrpyss yhlevqafngrg 902 

Qy 768 SNSMDVLTAEAPPSLP--PEDVRIRMLNLITLRISWKAPKADGINGILKGFQIVIVGQAP 825

■ I I I :| II : : : |:| : |: I : l|:| |: : 

Db 903 sgpaseftfstpegvpghpealhlecqsntslllrwqpplsh-ngvltgyvlsyhplde 960

Qy 826 NNNRNITTNERAASV— TLFHLVTGMTYKIRVAARSNGGVG 864 

:: I I : II : |: :: I : I | 
Db 961 ggkgqlsfnlrdpelrthnltdlsphlryrfqlqattkegpg 1002 
GenCore version 4.5
Copyright (c) 1993 - 2000 Compugen Ltd. 



OM protein - protein search, using sw model 



Run on:         January 22, 2001, 12:25:18 ; Search time 325.28 Seconds
                (without alignments)
                270.743 Million cell updates/sec

Title:          US-09-540-245A-17
Perfect score:  6860
Sequence:       1 MYYLGFYHTHTHTHTYINFD......TAQRFRSIPRNNGIVTQEQT 1297

Scoring table:  BLOSUM62
                Gapop 10.0 , Gapext 0.5

Searched:       195891 seqs, 67900655 residues

Total number of hits satisfying chosen parameters: 195891
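
The "cell updates/sec" figure in the run header can be checked from the other header values. The sketch below assumes that one cell update corresponds to one (query residue, database residue) pair in the dynamic-programming matrix, so the total is query length times database residues; this reading is an assumption, and the function name is hypothetical.

```python
def million_cell_updates_per_sec(query_len: int, db_residues: int,
                                 seconds: float) -> float:
    """Throughput in millions of DP cells per second (assumed definition)."""
    return query_len * db_residues / seconds / 1e6


if __name__ == "__main__":
    # Query length 1297, 67900655 database residues, 325.28 s search time.
    rate = million_cell_updates_per_sec(1297, 67900655, 325.28)
    print(round(rate, 3))
```

With the header values above, this reproduces the reported 270.743 million cell updates per second.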

Minimum DB seq length: 0 
Maximum DB seq length: 2000000000 

Post-processing: Minimum Match 0%
Maximum Match 100% 
Listing first 45 summaries 
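
The header names a Smith-Waterman ("sw") model with a gap-open penalty of 10.0 and a gap-extension penalty of 0.5. As a rough illustration of what such a score is, the sketch below computes a local alignment score with affine gaps (the Gotoh recurrences). It is not the GenCore implementation. Two things are assumptions: the first residue of a gap is charged `gap_open` and each further residue `gap_extend` (the report does not state the convention), and a toy +5/-4 scoring function stands in for BLOSUM62, which is too large to reproduce here.

```python
from typing import Callable


def smith_waterman_affine(a: str, b: str,
                          score: Callable[[str, str], float],
                          gap_open: float = 10.0,
                          gap_extend: float = 0.5) -> float:
    """Best local alignment score of a and b with affine gap penalties."""
    neg = float("-inf")
    n, m = len(a), len(b)
    # H: best score ending at (i, j); E/F: best score ending in a gap.
    H = [[0.0] * (m + 1) for _ in range(n + 1)]
    E = [[neg] * (m + 1) for _ in range(n + 1)]
    F = [[neg] * (m + 1) for _ in range(n + 1)]
    best = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            E[i][j] = max(H[i][j - 1] - gap_open, E[i][j - 1] - gap_extend)
            F[i][j] = max(H[i - 1][j] - gap_open, F[i - 1][j] - gap_extend)
            diag = H[i - 1][j - 1] + score(a[i - 1], b[j - 1])
            H[i][j] = max(0.0, diag, E[i][j], F[i][j])
            best = max(best, H[i][j])
    return best


def toy_score(x: str, y: str) -> float:
    """Placeholder substitution score (NOT BLOSUM62): +5 match, -4 mismatch."""
    return 5.0 if x == y else -4.0


if __name__ == "__main__":
    print(smith_waterman_affine("ACDEF", "ACDEF", toy_score))
    print(smith_waterman_affine("AAAAACCCCC", "AAAAAGGCCCCC", toy_score))
```

In the second call the best alignment opens one gap of length two, so the score is ten matches minus 10.0 minus 0.5. Substituting a real BLOSUM62 lookup for `toy_score` would be needed before comparing against the scores in this report.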



Database :      PIR_66:*
                pir1:*
                pir2:*
                pir3:*
                pir4:*



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution.



% 

Query 



NO. 


Score 


Match Length 


DB 


ID 


Description 


Query Match 95.1%; Score 6523.5; DB 2; Length 1273; 
Best Local Similarity 97.3%; Pred. No, 0; 
















1 


6523.5 


95.1 


1273 


2 


T42405 


sax- 3 protein • Ca 


Matches 1243; Conservative 1; Mismatches 1; indels 33; Gaps 


2 


4628 


67.5 


874 


2 


T29548 


hypothetical prote 








2232 


32.5 


423 


2 


T29549 


hypothetical prote 


Qy 


24 NASNLAPVIIEHPIDVWSRGSPATLNCGAKPSTAKITWYKDGQPVITNKEQVNSHRIVL 83 




1505.5 


21.9 


1612 


2 


T30805 


duttl protein • mo 


:!!!l!l!!llimillllllll!llllllll!l!!!lllll!l!!ll!l!l!!!!!ll 


5 


1483.5 


21,6 


1651 ,2 
1344 ,2 


T14160 


transmembrane rece 


Db 


25 DASNIjAPVIIEHPIDWVSRGSPATLNCGAKPSTAKITWYKDGQPVITNKEQVNSHRIVL 84 


6 


1323 


19.3 


T14316 


rig-1 protein - mo 






7 


635 


9.3 


1028 \2 


158164 


BIG-1 protein - ra 


Qy 


84 DTGSLFLLRVNSGKNGRDSDAGAYYCVASNEHGEVRSNEGSLRIAMLREDFRVRPRTVQA 143 


8 


632,5 


9.2 


1443 


2 


150600 


neogenin * chicken 




IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIMIIIIIMIIIIIIIII 


9 


616 


9.0 


1028 


2 


A53449 


plasmacytoma-assoc 


Db 


85 DTGSLFLLRVNSGRNGRDSDAGAYYCVASNEHGEVKSNEGSLKLAMLREDFRVRPRTVQA 144 


10 


607 


8.8 


1375 


2 


T13822 


frazzled gene prot 






11 


598 


8.7 


1896 


2 


T08851 


Down syndrome cell 


Qy 


144 LGGEMAVIECSPPRGFPEPWSWRKDDKELRIQDMPRYTLHSDGNLIIDPVDRSDSGTYQ 203 


Db 145 LGGEMAVLECSPPRGFPEPWSWRKDDKELRIQDMPRYTLHSDGNLIIDPVDRSDSGTYQ 204

Qy 204 CVANNMVGERVSNPARLSVFEKPRFEQEPRDMTVDVGAAVLFDCRVTGDPQPQITWKRRN 263
       llll!!l!IH'lllllllllllllllll!llll!ill!ll!l!l!lllll!lllllllll
Db 205 CVANNMVGERVSNPARLSVFEKPKFEQEPKDMTVDVGAAVLFDCRVTGDPQPQITWKRRN 264

Qy 264 EPMPVTRAYIAKDNRGLRIERVQPSDEGEYVCYARHPAGTLEASAHLRVQAPPSFQTKPA 323
       IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIMIIIIIIIIIIIIIIIIIII
Db 265 EPMPVTRAYIARDNRGLRIERVQPSDEGEYVCYARNPAGTLEASAHLRVQAPPSFQTRPA 324

Qy 324 DQSVPAGGTATFECTLVGQPSPAYFWSKEGQQDLLFPSYVSADGRTKVSPTGTLTIEEVR 383
       IIIIIIIIIIIMIIIIIIMIIIIIimillMMIIIIIIIIIIIIIIIIIIIIIII
Db 325 DQSVPAGGTATFECTLVGQPSPAYFWSKEGQQDLLFPSYVSADGRTRVSPTGTLTIEEVR 384

Qy 384 QVDEGAYVCAGMNSAGSSLSKAALKATFETKGRVQRRKSRMGRQRQRNVQSIIKYLISAV 443
       iimiiiiiiiiiiimiiim i ii ii
Db 385 QVDEGAYVCAGMNSAGSSLSKAALKVT--TK AV 415

 12   597     8.7   1010   2  JO0094  F11 protein precur
 13   595     8.7   1020   2  S05944  neuronal cell surf
 14   595     8.7   1021   2  A57U2   contactin precurso
 15   593.5   8.7   1091   2  S01998  contactin precurso
 16   593     8.6   1018   2  A54744  contactin 1 precur
 17   592     8.6   1018   2  JC4211  neural adhesion pr
 18   589     8.6   1272   2  S26180  neurofascin - chic
 19   588     8.6   2222   2  T13924  sdx protein - frui
 20   585     8.5   1040   2  A34695  axonal glycoprotei
 21   583.5   8.5   1257   1  A41060  neural cell adhesi
 22   582.5   8.5   1040   2  A49356  transient axonal g
 23   580.5   8.5   1240   2  T03097  CDO protein - huma
 24   580.5   8.5   1427   2  151669  tumor suppressor -
 25   574     8.4   1259   2  A43425  Bravo/Nr-CAM cell
 26   567.5   8.3   1260   1  S05479  neural cell adhesi
 27   562.5   8.2   1036   2  S22383  axonin 1 precursor
 28   561     8.2   1268   1  A39640  neural cell adhesi
 29   557     8.1   1259   2  S36126  neural cell adhesi
 30   554     8.1   1447              tumor suppressor p
 31   553     8.1   1277      T3U532  neural cell adhesi
 32   552.5   8.1   1526   2  T13823  frazzled gene prot
 33   550     8.0   1239      A3zb/y  neuroglian - fruit
 34   547     8.0   1256   2  T03096  CDO protein - rat
 35   531.5   7.7   1197      T30581  neural cell adhesi
 36   527     7.7   1907   2  S50893  protein-tyrosine-p
 37   526     7.7   1897      TDHULR  leukocyte antigen-
 38   518.5   7.6   1898   2  S46216  leukocyte antigen-
 39   509.5   7.4   1912   2  A56178  protein-tyrosine-p
 40   508.5   7.4   7962   2  138346  elastic titin - hu
 41   501.5   7.3   5175   2  T20992  hypothetical prote
 42   501.5   7.3   5198   2  T43290  hemicentin precurs
 43   501     7.3   1894   2  C54689  protein-tyrosine-p
 44   498.5   7.3   1232   2  T43027  neural cell adhesi
 45   492     7.2   1863   2  S46217  protein-tyrosine-p



RESULT 1 
T42405 

sax-3 protein - Caenorhabditis elegans
C;Species: Caenorhabditis elegans

C;Date: 03-Dec-1999 #sequence_revision 03-Dec-1999 #text_change 21-M-2000
C; Accession: T42405 

R;Zallen, J. A.; Yi, B.A.; Bargmann, C.I. 
Cell 92, 217-227, 1998 

A;Title: The conserved immunoglobulin superfamily member SAX-3/Robo directs multiple 
A; Reference number: Z22160; MUID: 98117250 
A; Accession: T42405 

A; Status: preliminary; translated from GB/EMBL/DDBJ 
A;Molecule type: mRNA
A;Residues: 1-1273 <ZAL>

A;Cross-references: EMBL:AF041053; NID:g2804779; PIDN:AAC38848.1; PID:g2804780
C; Genetics: 
A; Note: sax-3 
C; Function: 

A; Description: sax-3 function is required at the time of axon guidance 
444 TGNTPARPPPTIEHGHQNQTIWGSSAILPCQASGRPTPGISWLRDGLPIDITDSRISQH 503
416 TGNTPAKPPPTIEHGHQNQTLMVGSSAILPCQASGKPTPGISWLRDGLPIDITDSRISQB 475 
504 STGSLHIADLKKPDTGVYTCIAKNEDGESTWSASLTVEDHTSNAQFVBMPDPSNFPSSPT 563 
476 STGSLHIADLKKPDTGVYTCIARNEDGESTWSASLTVEDHTSNAQFVRMPDPSNFPSSPT 535 
564 QPIIVNVTDTEVELHWNAPSTSGAGPITGYIIQYYSPDLGQTWFNIPDYVASTEYRIKGL 623 
536 QPIIVNVTDTEVELHWMAPSTSGAGPITGYIIQYYSPDLGQTWFNIPDYVASTEYRIKGL 595 
624 KPSHSYMFVIRAENEKGIGTPSVSSALVTTSRPAAQVALSDKNKMDMAIAEKRLTSEQLI 683 
596 KPSHSYMFVIRAENEKGIGTPSVSSALVTTSKPAAQVALSDKNKMDMAIAEKRLTSEQLI 655 
684 KLEEVKTINSTAVRLFWKKRKLEELIDGYYIKWRGPPRTNDNQYVNVTSPSTENYWSNL 743 
656 KLEEVKTINSTAVRLFWKRRKLEELIDGYYIKWRGPPRTNDNQYVNVTSPSTENYVVSNL 715 
744 MPFTNYEFFVIPYHSGVHSIHGAPSNSMDVLTAEAPPSLPPEDVRIRMLNLTTLRISWKA 803 
716 MPFTNYEFFVIPYHSGVHSIHGAPSNSMDVLTAEAPPSLPPEDVRIRMLNLTTLRISWRA 775 
804 PKADGINGILKGFQIVIVGQAPNNHRNITTNERAASVTLFHLVTGMTYKIRVAARSNGGV 863 
776 PKADGINGILKGFQIVIVGQAPNNNRNITTNERAASVTLFHLVTGMTYRIRVAARSNGGV 835 
864 GVSHGTSEVIMNQDTLEKHLAAQQENESFLYGLINKSHVPVIVIVAILI IFWI 1 1 AYCY 923 
836 GVSHGTSEVIMNQDTLERHLAAQQENESFLYGLINKSHVPVIVIVAILIIFWIIIAYCY 895 
924 WRHSRNSKSRDRSFIKINDGSmSNNLWDVAQNPNQNPMYNTAGMMNNRNGQALYS 983 
896 WRNSRNSDGKDRSFIKINDGSVHMASNNLWDVAQNPNQNPMYNTAGRMTMKNRNGQALYS 955 
984 LTPNAQDFFNNCDDYSGTMHRPGSEHHYHYAQLTGGPGNAMSTFYGNQYHDDPSPYATTT 1043 
956 LTPNAQDFFNNCDDYSGTMHf PGSEHHYHYAQITGGPGNAMSTFYGNQYHDDPSPYATTT 1015 
1044 LVLSNQQPAWLNDKMLRAPAh PTNPVPPEPPARYADHTAGRRSRSSRASDGRGTLNGGLH 1103 
1016 LVLSNOQPAWLNDKMLRAPAhlPTNPVPPEPPARYADHIAGRRSRSSRASDGRGTLNGGLH 1075 
1104 HRTSGSQRSDSPPHTDVSYVQLHSSDGTGSSRERTGERRTPPNRTLMDFIPPPPSNPPPP 1163 
1076 HRTSGSQRSDSPPHTDVSYVQLHSSDGTGSSKERTGERRTPPNKTLMDFIPPPPSNPPPP 1135 

1164 GGHVYD TATRRQLNRGSTPREDTYDSVSDGAFARVDVNARPTSRNRNIGGRPLRGR 1219 

1136 GGHVYDDIFQTATRRQLNRGSTPREDTYDSVSDGAFARVDVNARPTSRNRNLGGRPLRGR 1195 
1220 RDDDSQRSSLMMDDDGGSSEADGENSEGDVPRGGVRKAVPRMGISASTIAHSCYGTNGTA 1279 
1196 RDDDSQRSSLMMDDDGGSSEADGENSEGDVPRGGYKKAVPRMGISASTLAHSCYGTNGTA 1255 
1280 QRFRSIPRNNGIVTQEQT 1297 
1256 QRFRSIPRNNGIVTQEQT 1273 



RESULT 2 
T29548 

hypothetical protein ZK377.2 - Caenorhabditis elegans
C;Species: Caenorhabditis elegans

C;Date: 15-Oct-1999 #sequence_revision 15-Oct-1999 #text_change 18-Feb-2000
C;Accession: T29548
R;Nhan, M.; Hawkins, J.

submitted to the EMBL Data Library, February 1997
A;Description: The sequence of C. elegans cosmid ZK377.
A;Reference number: Z20639
A;Accession: T29548






A; Status: preliminary; translated from GB/EMBL/DDBJ 
A;Molecule type: DNA
A;Residues: 1-874 <NHA>

A;Cross-references: EMBL:U88183; PIDN:AAB52657.1; GSPDB:GN00028; CESP:ZK377.2

A;Experimental source: strain Bristol N2; clone ZK377

C; Genetics: 

A; Gene: CESP:ZK377.2 

A; Map position: X 

A;Introns: 91/2; 356/1; 452/1; 701/3; 746/3; 850/1 



Query Match 67.5%; Score 4628; DB 2; Length 874; 

Best Local Similarity 100.0%; Pred. No. 4.5e-245; 

Matches 874; Conservative 0; Mismatches 0; Indels 0; Gaps 



Qy 424 MGRQKQRNVQSIIRYLISAVTGNTPAKPPPTIEHGHQNQTLMVGSSAILPCQASGRPTPG 483

Db 1 MGKQKQRNVQSIIRYLISAVTGNTPAKPPPTIEHGHQNQTLMVGSSAILPCQASGKPTPG 60

Qy 484 ISWLRDGLPIDITDSRISQHSTGSLHIADLRRPDTGVYTCIARNEDGESTWSASLTVEDH 543

Db 61 ISWLRDGLPIDITDSRISQHSTGSLHIADLRKPDTGVYTCIARNEDGESTWSASLTVEDH 120

Qy 544 TSNAQFVRMPDPSNFPSSPTQPIIVNVTDTEVELHWNAPSTSGAGPITGYIIQYYSPDLG 603

Db 121 TSNAQFVRMPDPSNFPSSPTQPIIVNVTDTEVELHWNAPSTSGAGPITGYIIQYYSPDLG 180

Qy 604 QTWFNIPDYVASTEYRIRGLRPSHSYMFVIRAENEKGIGTPSVSSALVTTSKPAAQVALS 663

Db 181 QTWFNIPDYVASTEYRIKGLKPSHSYMFVIRAENEKGIGTPSVSSALVTTSKPAAQVALS 240 

Qy 664 DKNKMDMAIAEKRLTSEQLIKLEEVKTINSTAVRLFWKKRKLEELIDGYYIKWRGPPRTN 723 

Db 241 DKNRMDMAIAERRLTSEQLIRLEEVRTINSTAVRLFWRRRKLEELIDGYYIRWRGPPRTN 300 

Qy 724 DNQYVNVTSPSTENYVVSNLMPFTNYEFFVIPYHSGVHSIHGAPSNSMDVLTAEAPPSLP 783 

Db 301 DNQYVNVTSPSTENYWSNLMPFTNYEFFVIPYHSGVHSIHGAPSHSMDVLTAEAPPSLP 360 

Qy 784 PEDVRIRMLNLTTLRISWRAPKADGINGILRGFQIVIVGQAPNNNRNITTNERAASVTLF 843 

Db 361 PEDVRIRMLNLTTLRISWRAPKADGINGILKGFQIVIVGQAPNNNRNITTNERAASVTLF 420 

Qy 844 HLVTGMTYKIRVAARSNGGVGVSHGTSEVIMNQDTLEKHLAAQQENESFLYGLINKSHVP 903 

Db 421 HLVTGMTYKIRVAARSNGGVGVSHGTSEVIMNQDTLERHLAAQQENESFLYGLINKSHVP 480 

Qy 904 VIVIVAILIIFWIIIAYCYWRNSRNSDGKDRSFIKINDGSVHMASNNLWDVAQNPNQNP 963

Db 481 VIVIVAILIIFWIIIAYCYWRNSRNSDGKDRSFIRINDGSVHMASNNLWDVAQNPNQNP 540 

Qy 964 MYNTAGRMTMNNRNGQALYSLTPNAQDFFNNCDDYSGTMHRPGSEHHYHYAQLTGGPGNA 1023

Db 541 MYNTAGRMTMNNRNGQALYSLTPNAQDFFNNCDDYSGTMHRPGSEHHYHYAQLTGGPGNA 600 

Qy 1024 MSTFYGNQYHDDPSPYATTTLVLSNQQPAWLNDRMLRAPAMPTNPVPPEPPARYADHTAG 1083 

Db 601 MSTFYGNQYHDDPSPYATTTLVLSNQQPAWLNDRMLRAPAMPTNPVPPEPPARYADHTAG 660 

Qy 1084 RRSRSSRASDGRGTLNGGLHHRTSGSQRSDSPPHTDVSYVQLHSSDGTGSSRERTGERRT 1143 

Db 661 RRSRSSRASDGRGTLNGGLHHRTSGSQRSDSPPHTDVSYVQLHSSDGTGSSKERTGERRT 720 

Qy 1144 PPNKTLMDFIPPPPSNPPPPGGHVYDTATRRQLKRGSTPREDTYDSVSDGAFARVDVNAR 1203 

Db 721 PPNRTLMDFIPPPPSNPPPPGGHVYDTATRRQLNRGSTPREDTYDSVSDGAFARVDVNAR 780 

Qy 1204 PTSRNRNLGGRPLRGRRDDDSQRSSLMMDDDGGSSEADGENSEGDVPRGGVRKAVPRMGI 1263 

Db 781 PTSRNRNLGGRPLRGRRDDDSQRSSLMMDDDGGSSEADGENSEGDVPRGGVRRAVPRMGI 840 
Qy 1264 SASTLAHSCYGTNGTAQRFRSIPRNNGIVTQEQT 1297 
Db 841 SASTLAHSCYGTNGTAQRFRSIPRNNGIVTQEQT 874



RESULT 3 
T29549 

hypothetical protein ZK377.3 - Caenorhabditis elegans 
C; Species: Caenorhabditis elegans 

C;Date: 15-Oct-1999 #sequence_revision 15-Oct-1999 #text_change 18-Feb-2000
C;Accession: T29549
R;Nhan, M.; Hawkins, J.

submitted to the EMBL Data Library, February 1997
A;Description: The sequence of C. elegans cosmid ZK377.
A;Reference number: Z20639
A;Accession: T29549

A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: DNA
A;Residues: 1-423 <NHA>

A;Cross-references: EMBL:U88183; PIDN:AAB52658.1; GSPDB:GN00028; CESP:ZK377.3
A;Experimental source: strain Bristol N2; clone ZK377
C;Genetics:
A;Gene: CESP:ZK377.3
A;Map position: X
A;Introns: 24/1; 142/3; 229/3; 284/2; 408/3



Query Match 32.5%; Score 2232; DB 2; Length 423;

Best Local Similarity 100.0%; Pred. No. 1.1e-114;

Matches 423; Conservative 0; Mismatches 0; Indels 0;



Gaps 



Qy   1 MYYLGFYHTHTHTHTnNFDKIPNASSLAPVIIEHPIDVWSRGSPAILNCGAKPSTAKI 60
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1 MYYLGFYHTHTHTHTnNFDKIPNASKLAPVIIEHPIDWVSRGSPATLNCGAKPSTAKI 60

Qy  61 TWYKDGQPVITNKEQVNSHRIVLDTGSLFLLKVNSGKNGKDSDAGAJYCVASNEHGEVRS 120
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  61 TWYKDGQPVITNREQVNSHRIVLDTGSLFLLKVNSGKNGKDSDAGAYYCVASNEHGEVKS 120

Qy 121 NEGSLKLAMLREDFRVRPRTVQALGGEMAVLECSPPRGFPEPWSWRRDDRELRIQDMPR 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 NEGSLKLAMLREDFRVRPRTVQALGGEMAVLECSPPRGFPEPWSWRKDDKELRIQDMPR 180

Qy 181 YTLHSDGNLIIDPVDRSDSGTYQCVANNMVGERVSNPARLSVFERPKFEQEPKDMTVDVG 240
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 YTLHSDGNLIIDPVDRSDSGTYQCVANNMVGERVSNPARLSVFERPKFEQEPKDMTVDVG 240

Qy 241 AAVLFDCRVTGDPQPQITWRRRNEPMPVTRAYIARDNRGLRIERVQPSDEGEYVCYARNP 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 AAVLFDCBVTGDPQPQITWKRKNEPMPVTRAYIAKDNRGLRIERVQPSDEGEYVCYARNP 300

Qy 301 AGTLEASAHLRVQAPPSFQTKPADQSVPAGGTATFECTLVGQPSPAYFWSREGQQDLLFP 360
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 AGTLEASMLRVQAPPSFQTKPADQSVPAGGTATFECTLVGQPSPAYFWSREGQQDLLFP 360

Qy 361 SYVSADGRTKVSPTGTLTIEEVRQVDEGAYVCAGMNSAGSSLSKAALRATFETKGRVQRK 420
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 361 SYVSADGRTRVSPTGTLTIEEVRQVDEGAYVCAGMNSAGSSLSKAALRATFETRGRVQRR 420

Qy 421 KSK 423
       |||
Db 421 RSR 423



RESULT 4

T30805

dutt1 protein - mouse

N;Alternate names: transmembrane receptor protein Robo1 homolog
C;Species: Mus musculus (house mouse)

C;Date: 22-Oct-1999 #sequence_revision 22-Oct-1999 #text_change 22-Oct-1999
C;Accession: T30805

R;Wu, M.C.; Lowe, N.; Fordham, R.; Rabbitts, P.
submitted to the EMBL Data Library, July 1998
A;Description: The mouse homologue of human DUTT1/H-robo1 gene: protein sequence and chi
A;Reference number: Z20879



A;Accession: T30805

A; Status: preliminary; translated from GB/EMBL/DDBJ 
A; Molecule type: mRNA 
A; Residues: 1-1612 <WUM> 

A;Cross-references: EMBL:Y17793; NID:e1329712; PID:e1329713; PIDN:CAA76850.1

A; Experimental source: brain 

C; Genetics: 

A;Gene: dutt1

A;Map position: 16



Query Match 21.9%; Score 1505.5; DB 2; Length 1612;

Best Local Similarity 32.1%; Pred. No. 3.3e-74; 

Matches 402; Conservative 176; Mismatches 498; Indels 175; Gaps 

Qy 30 PVIIEHPIDVWSRGSPATLNCGAR-PSTARITWYKDGQPVITNKEQVNSHRIVLDTGSL 88 

I hill Iv'lhl llllll I: I I III h I |:|: llh:| :|ll 
Db 29 PRIVEHPSDLIVSRGEPATLNCRAEGRPTPTIEWYKGGERVETDRDDPRSHRMLLPSGSL 88 

Qy 89 FLLKVNSGRNGKDSDAGAYYCVASNEHGEVRSNEGSLKLAMLREDFRVRPRTVQALGGEM 148 

II:: : : I I I IN I II h ll::|:||:lll I I II 
Db 89 FFLRIVHGRKSR-PDEGVYICVARNYLGEAVSHNASLEVAILRDDFRQNPSDVMVAVGEP 147 

Qy 149 AVLECSPPRGFPEPVVSWRRDDKELRIQDMPRYTLHSDGNLIIDPVDRSDSGTYQCVANN 208 

11:11 i 1 1 1 'III :||:|| I :| I |: I hi :lhl I II I 
Db 148 AVMECQPPRGHPEPTISWKRDGSPLDDKD-ERITIRG-GRLMITYTRRSDAGRYVCVGTN 205 

Qy 209 MVGERVSNPARLSVFERPRFEQEPRDMTVDVGAAVLFDCRVTGDPQPQITWRRRNEPMPV 268 

III I H:| |:|| : | :: M : I I III I : |:: : :| 
Db 206 MVGERESEVAELTVLERPSFVRRPSNLAVTVDDSAEFRCEARGDPVPTVRWRRDDGELPR 265 

Qy 269 TRAYIAKDNRGLRIERVQPSDEGEYVCYARNPAGTLEASAHLRVQAPPSFQTKPADQSVP 328 
:| I :|: . 1:1 :| I II I I I I Mil llllll II II I 
266 SR-YEIRDDHI1RIRRVTAGDMGSYTCVAENMVGRAEASATLTVQEPPHFWRPRDQWA 324 



329 AGGTATFECTLVGQPSPAYFWSREGQQDLLFPSY-VSADGRTRVSPTGTLTIEEVRQVD 386 

I I 11:1 llllll :ll hill II : I II II III h: I 

325 LGRTVTFQCEATGNPQPAIFWRREGSQNLLF-SYQPPQSSSRFSVSQTGDLTITNVQRSD 383 

387 EGAYVCAGMNSAGSSLSRAALKATFETRGRVQRRRSRMGRQRQRNVQSIIRYLISAVTGN 446 

I hi : I -III ::M h II 

384 VGYYICQTLNVAGS I IT RAYLE VTDV 409 

447 TPARPPPIIEHGHQNQTLMVGSSAILPCQASGRPTPGISWLRDGLPIDITDSRISQHSTG 506 

:MI I :l III: I : III hi I I I I :lh : llll I :| 

410 IADRPPPVIRQGPVNQTVAVDGTLILSCVATGSPAPTILWRKDGVLVSTQDSRIKQLESG 4 69 

507 SLHIADLRRPDTGVYTCIARNEDGESTWSASLTVEDHTSNAQFVRMPDPSNFPSSPTQPI 566 

I I I I'M III I hill : h: I | ||: ||:|::| 

470 VLQIRYARLGDTGRYTCTASTPSGEATWSAYIEVQEFGVPYQPPRPTDPNLIPSAPSKPE 529 

567 IVNVTDTEVELHWNAPSTSGAGPITGYIIQYYSPDLGQTWFNIPDYVASTEYRIRGLRPS 626 

: :h H i III I I llh :| I :| : I : : llllll: 

530 VTDVSKNTVTLSWQPNLNSGATP-TSYIIEAFSHASGSSWQTAAENVRTETFAIRGLKPN 588 

627 HSYMFVIRAENERGIGTPS-VSSALVTTSRPAAQVALSDRNRMDMAIAERRLTSEQLIRL 685 

|:|::|| Ml II :| : I I : I :| :: I 

589 AIYLFLVRAANAYGISDPSQISDPVKTQDVPPTSQGVDHKQ VQRELGNWLHL 641 

686 EEVKT INSTAVRLFWKRRRLEELIDGY Y IRWRG PPRTN ■ DNQYV - • NVTSPSTENY WSN 742 

::'|::[ : I : : I || I :| :: ::::: I :|: : h : 

642 HNPTILSSSSVEVHWTVDQQSQYIQGYKILYRPSGASHGESEWLVFEVRTPTRNSWIPD 701 

743 LMPFTNYEFFVIPYHSGVHSIHGAPSNSMDVLTAEAPPSLPPEDVRIRML-NLTTLRIS 800 

I III ' h : II I II II II I : I I : :: 

702 LRRGVNYEIRARPF— FNEFQGADSEIRFARTLEEAPSAPPRSVTVSRNDGNGTAILVT 758 

801 WKAPRADGINGILKGFQIVIVGQAPNNNRMTTNERAASVTLFHLVTGMTYRIRVAARSN 860 

I: I I II::: ::: :| : I I : II : II h I : III : 

759 WQPPPEDTQN3MVQEYRVWCLGNETRYHINRTVDGSTFSWIPSLVPGIRYSVEVAASTG 818 



861 GGVGV" 

I II 



■ -SHGTSEVIMNQDTLERHLAAQQENESFLYGLINKSHVPVIVIVAI 910 

III :| :| : :: :|: h : I 
Db 819 AGPGVKSEPQFIQLDSHGNPVSPEDQVSLAQQISDWRQPAFIAGIGAACWI I 871

Qy 911 LIIFWIIIAYCYWRNSRNSD GKDRSF IRINDGSVHMASN— NLWDVAQN 958 

: : ; II I I II : I ::| I :::: 
Db 872 LMVFSIWLYRHRKKRNGLTSTYAGIRKVPSFTFTPTVTYQRGGEAVSSGGRPGLLNISE- 930 

Qy 959 PNQNPMYNTAGRMTMNNRNGQALYSLTPNAQDFFNNCDDYSGTMHRPGSEHHYHYAQLTG 1018 

l ; I Mil:: : : II II : II 

Db 931 PATQPWtADTWPNTGNNHNDCSINCCTAGNGNSDSNLTTYS — RPADCIANYNNQLDN 986 

Qy 1019 GPGNAM — STFYG NQYHD DPSPTATTTLV— LSN 1048 

II II II I: MUM |: III 

Db 987 KOTNLMLPESTVYGDVDLSNKINEMKTFNSPNLKDGRFVNPSGQPTPYATTQLIQANLSN 1046 

Qy 1049 QQPAWLNDKMLRAPAMPTNPVPPEPPARYADHTAGRRSRSSRASDGRGTL NGGL 1102 

I : I I I :| : Ihl h I 
Db 1047 NMNNGAGDSSERHWKPPGQQKPEVAPIQYNIMEQNKLNKDYRAND- • -TIPPTIPYNQSY 1103 

Qy 1103 HHRTSGSQRSDSPPHTDVSYVQLHSSDGTGSSKERTGERRTP- -PNKTLM- --DFIPPPP 1157 

.1111 Mil M III IM I MM 

Db 1104 DQNTGGSYNSSD RGSSTSGSQGHKRG-ARTPKAPKQGGMNWADLLPPPP 1151 

Qy 1158 SNPPP PGGHVY- --DTATRRQLNRGSTP 1182

W ::IM I :| I : II II 

Db 1152 IAHPPPHSNSEEYNMSVDESYDQEMPCPVPPAPMYLQQDELQEEEDERGPTP 1202 



RESULT 5
T14160

transmembrane receptor protein Robo1 - rat
C;Species: Rattus norvegicus (Norway rat)

C;Date: 20-Sep-1999 #sequence_revision 20-Sep-1999 #text_change 20-Sep-1999
C;Accession: T14160

R;Kidd, T.; Brose, K.; Mitchell, K.J.; Fetter, R.D.; Tessier-Lavigne, M.; Goodman, C.S
Cell 92, 205-215, 1998 

A;Title: Roundabout controls axon crossing of the CNS midline and defines a novel subfan 
A;Reference number: Z17897; MUID: 98117249 
A; Accession: T14160 

A; Status: preliminary; translated from GB/EMBL/DDBJ 
A; Molecule type: mRNA 
A; Residues: 1-1651 <KID> 

A;Cross-references: EMBL:AF041082; NID:g2811215; PID:g2811216; PIDN:AAC39960.1 
C; Function: 

A;Description: appears to function as the gatekeeper controlling midline crossing 
C; Keywords: transmembrane protein 



Query Match 21.6%; Score 1483.5; DB 2; Length 1651; 

Best Local Similarity 31.4%; Pred. No. 5.4e-73;
Matches 395; Conservative 176; Mismatches 497; Indels 191; Gaps 33;

Qy 30 PVIIEHPIDVWSRGSPATLNCGAR-PSTAKITWYKDGQPVITNKEQVNSHRIVLDTGSL 88

I hill MIM Mill |: | | III |: | |:|: MM Ml
Db 68 PRIVEHPSDLIVSKGEPATLNCKAEGRPTPTIEWYKGGERVETDKDDPRSHRMLLPSGSL 127



Qy 89 FLLKVNSGKNGKDSDAGAYYCVASNEHGEVKSNEGSLRLAMLREDFRVRPRTVQALGGEM 148 

I I:: I: : I I I III I II h IMMMI I I II 
Db 128 FFLRIVHGRKSR-PDEGVYICVARNYLGEAVSHNASLEVAILRDDFRQNPSDVMVAVGEP 186 

Qy 149 AVLECSPPRGFPEPWSWRKDDKELRIQDMPRYTLHSDGNLIIDPVDRSDSGTYQCVANN 208 

MM 1 1 1 1 Ml MIM MIM I hi MIM I II I 
Db 187 AVMECQPPRGHPEPTISWKKDGSPLDDKD-ERITIRG-GKLMITYTRKSDAGKYVCVGTN 244 

Qy 209 MVGERVSNPARLSVFEKPKFEQEPKDMTVDVGAAVLFDCRVTGDPQPQITWKRKNEPMPV 268 

Hill I MM IM M MM I : II III I I:: : M 
Db 245 MVGERESKVADVTVLERPSFVKRPSNLAVTVDDSAEFKCEARGDPVPTFGWRKDDGELPK 304 

Qy 269 TRAYIAKDNRGLRIERVQPSDEGEYVCYARNPAGTLEASAHLRVQAPPSFQTKPADQSVP 328 

M I :|: IM :| I I I I II I INI I II II I II II I 
Db 305 SR-YEIRDDHTLKIRKVTAGDMGSYTCVAENMVGKAEASATLTVQEPPHFWKPRDQWA 363 

Qy 329 AGGTATFECTLVGQPSPAYFWSKEGQQDLLFPSY—VSADGRTKVSPTGTLTIEEVRQVD 386 

I I Ihl I I II II Ml hill II : I II II lh I:: I 



Db 364 LGRTVTFQCEATGNPQPAIFWRREGSQNLLF-SYQPPQSSSRFSVSQTGDLTVTNVQRSD 422 

Qy 387 EGAYVCAGMNSAGSSLSKAALKATFETKGRVQKKKSKMGKQKQKNVQSIIKYLISAVTGN 446 

I hi MM M h II 

Db 423 VGYY ICQTLNVAGS I ITKAYLE VTDV 448 

Qy 447 TPAKPPPTIEHGHQNQTLMVGSSAILPCQASGKPTPGISWLRDGLPIDITDSRISQHSTG 506 

Ml I M III: I : I I hi I I I I Ml: : Ml I M 
Db 449 IADRPPPVIRQGPVNQTVAVDGTLTLSCVATGSPVPIILWRKDGVLVSTQDSRIKQLESG 508 

Qy 507 SLHIADLKKPDTGVYTCIAKNEDGESTWSASLTVEDHTSNAQFVRMPDPSNFPSSPTQPI 566 

II Mil III I Ihllll : h: II lh IM::i 

Db 509 VLQIRYAKLGDTGRYTCTASTPSGEATWSAYIEVQEFGVPVQPPRPTDPNLIPSAPSKPE 568 

Qy 567 IVNVTDTEVELHWNAPSTSGAGPITGYIIQYYSPDLGQTWFNIPDYVASTEYRIRGLKPS 626 

: :h I II III I I III: M I :| : : I : : 1 1 11 1 1 : 
Db 569 VTDVSRNTVTLLWQPNLNSGATP-TSYIIEAFSHASGSSWQTVAENVKTETFAIKGLRPN 627 

Qy 627 HSYMFVIRAENEKGIGTPS-VSSALVTTSKPAAQVALSDKNKMDMAIAEKRLTSEQLIKL 685

I:|::|| I II II M M I : I M :: I 

Db 628 AIYLFLVRAANAYGISDPSQISDPVRTQDVPPTTQGVDHRQ VQRELGNWLHL 680 

Qy 686 EEVKTINSTAVRLMKRKLEELIDGYYIKKRGPPRTN-DNQYV--NVTSPSTENYWSN 742 

Db 681 HNPTILSSSSVEVHOTVDQQSQ 740 

Qy 743 LMPFTNYEFFVIPYHSGVHSIHGAPSNSMDVLTAEAPPSLPPEDVRIRML-NLTTLRIS 800 

I MM : II I I I II II I : II::: 
Db 741 LRKGVNYEIKARPF* - -FNEFQGADSEIKFAKTLEERPSAPPRSVTVSRNDGNGTAILVT 797 

Qy 801 WKAPKADGINGILKGFQIVIVGQAPNNNRNITTNERAASVTLFHLVTGMTYKIRVAARSN 860 

h I I II::: ::: :| MM II : II h I : III : 
Db 798 WQPPPEDTQNGMVQEYRVWCLGNETRYHINRTVDGSTFSWIPFLVPGIRYSVEVAASTG 857 

Qy 861 GGVGV SHGTSEVIMNQDTLEKHLAAQQENESFLYGLINKSHVPVIVIVAI 910 

I II ! III :| :| : :: : :|: I : I 

Db 858 AGPGVKSEPQFIQLDSHGNPVSPEDQVSLAQQISDWKQPAFIAG IGAAC 907 

Qy 911 LIIFWIIIAYCYWRNSRN SDGKDRSFIRI 940 

II :| I ' I II MM 

Db 908 WIILMVFSIWIiYRHRKRRNGLSSTYAGIRRVPSFTFTPTVTYQRGGEAVSSGGRPGLLNI 967 

Qy 941 NDGSVHMASNNLWDVAQNPNQNPMYNTAGRMTMNNRNGQALYSLTPNAQ DFFNN 994 

:: : MUM || : :|| :: :: I 

Db 968 SEPATQPWLADTW PNTGNSHNDCSINCCTASNGNSDSNLTTYSRPADCIANYNNQ 1022 

Qy 995 CDDYSGTMHRPGSEHHYHYAQLTGGPGNAMSTFYGNQYHD DPSPYATTTLVL 1046 

I: : I I I I: I I II I MM |: 

Db 1023 LDNRQINLMLPEST-VYGDVDLS-NRINEMKTFNSPNLKDGRFVNPSGQPTPYATTQLIQ 1080 

Qy 1047 SNQQPAWLN DKMLRAPAMPTNPVPPEPPARYADHTAGRRSRSSRASDGRGTL-- 1098 

:| I ' :| : I I Ml : :: Ihl h 
Db 1081 ANLINNMNNGGGDSSERHWKPPGQQKQEV--APIQYNIMEQNKLNKDYRAND--TILP 1134 

Qy 1099 —-NGGLHHRTSGSQRSDSPPHTDVSYVQLHSSDGTGSSRERTGERRTP--PNRTLM- 1150 

I 'I III I ill M III M I 

Db 1135 TIPYNHSYDQNTGGSYNSSD RGSSISGSQGHKKG-ARTPKAPKQGGMNW 1182 

Qy 1151 -DFIPPPPSNPPP PGGHVY— DTATRRQLNRGSTP 1182 

I MllhMII I :| I : II II 

Db 1183 ADLLPPPPAHPPPHSNSEEYSMSVDESYDQEMPCPVPPARMYLQQDELEEEEAERGPTP 1241 



RESULT 6
T14316 

rig-1 protein - mouse 

C;Species: Mus musculus (house mouse) 

C;Date: 20-Sep-1999 #sequence_revision 20-Sep-1999 #text_change 20-Sep-1999
C;Accession: T14316

R;Yuan, S.S.F.; Cox, L.A.; Dasika, G.R.; Lee, E.Y.H.P.
submitted to the EMBL Data Library, April 1998
A;Reference number: Z17975
A;Accession: T14316






A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: mRNA
A;Residues: 1-1344 <YOA>
A;Cross-references: EMBL:AF060570; NID:g4



Query Match 19.3%; Score [illegible]; DB 2; Length 1344;

Best Local Similarity 29.2%; Pred. No. 2.3e-64;

Matches 391; Conservative [illegible]; Mismatches 539; Indels 220; Gaps 42;



16 YINFDKIPNA — SNLAPVIIEHPIDVWS 1GSPATLNCGA — KPSTAKITWYKDGQ 67 
. I: I : : I hi I hill I Mil II :|: I III: 
24 YLELPSSPGSRVGPEDAMPRIVEQPPDLWSRGEPATLPCRAEGRPRPN— IEWYKNGA 80 
<* 

PVITNREQVNSHRIVLDTGSLFLLKVNSGRN IRDSDAGAYYCVASNEHGEVRSNEGSLKL 127 



Qy 

Db 

Qy 
Db 
Qy 
Db 
Qy 

Db 
Qy 
Db 



I I :l : 1 1 : : I :hll :: I 
81 RVATAREDPRAHRLLLPSGALFFPRIVHGRR !R- 



106385; PID:g4206386; PIDN:AAD11628.1 



II Mill I I lh: 
1 PDEGVYTCVARNYLGAAASRNASLEV 139 



.28 AMLREDFRVRPRTVQALGGEMAVLECSPPRG 'PEPWSWRRDDRELRIQDMPRYTLHSDG 187 
hlhlll I I II Ihll Ihl 1 1 1 : 1 : 1 : 1 :|: :: I h I 

140 AVLRDDFRQSPGNVWAVGEPAVMECVPPKGfPEPLVTWKKGKIKLK-EEEGRITIRG-G 197 

) NLIIDPVDRSDSGTYQCVANNMVGERVSNPARLSVFEKPKFEQEPRDMTVDVGAAVLFDC 247 

I:: :lhl I llhll 1 1 1 I I I I h I I : I : I I I I I 
! KLMMSHTFKSDAGMYMCVASNMAGERESGAAELVVLERPSFLRRPINQVVLADAPVNFLC 257 

148 RVTGDPQPQITWKRKNEPMPVTRAYIAKDNRGLRIERVQPSDEGEYVCYARNPAGTLEAS 307 

I Mill : h: : :| II : : I h:| III II I I I III 
:58 EVQGDPQPNLHWRKDDGELPAGR-YEIRSDHSLWIDQVSSEDEGTYTCVAENSVGRAEAS 316 

AHLRVQAPPSFQTRPADQSVPAGGTATFECTLVGQPSPAYFWSKEGQQDLLFPSY-VSAD 366 
I I II I III Ml I :hl I I II II III I lllll : 
11 GSLSVHVPPQFVTKPQNQTVAPGANVSFQCETKGNPPPAIFWQKEGSQVLLFPSQSLQPM 376 

67 GRTKVSPTGTLTIEEVROVDEGAYVCAGMNSAGSSLSKAALKATFETKGRVQKRKSKMGK 426 

II III I I I lh I I III :: III 1:11 I I II 
77 GRLLVSPRGQLNITEVKIGDGGYYVCQAVSVAGSILAKALL— -EIKG 421 

27 OKQKNVQSIIKYLISAVTGNTPAKPPPTIEHGHQNQTLMVGSSAILPCQASGRPTPGISW 486 

I III I lll|::||| III: I I I II 
22 ASIDG LPPHLQGPMQTLVLGSSVWLPCRVIGNPQPNIQW 462 

87 LRDGLPIDITDSRISQHSTGSLHIADLKKPDTGV7TCIAKNEDGESTWSASLTVEDHTSN 546 

:| : lh : Mill ::: I I hhlh 1 1 : 1 1 : : I :: 
63 RRDERWLQGDDSQFNLMDNGTLHIASIQEMDMGFYSCVAKSSIGEATWNSWLRRQEDW-G 521 

147 AQFVRMPDPSNFPSSPTQPHVNVTDTEVELHWNAPSTSGAGPITGYIIQYYSPDLGQTW 606 

III I Mlh II M I III I hi: :| I II 
122 ASPGPATGPSNPPGPPSQPIVTEVTANSITLTWRPNPQSGA-TATSYVIEAFSQAAGNTW 580 

607 FNIPDYVASTEYRIRGLRPSHSYMFVIRAENERGIGTPSVSSALVTTSRPAAQVALSDKN 666 

Ml II Ihl: hh:|| h II I I I : I 
181 RTVADGVQLETYTISGLQPNTIYLFLVRAVGAWGLSEPSPVSEPVQTQDSSLSRPAEDPW 640 

RMDMAIAEKRLTSEQLIKLEEVKTINSTAVRLFWRRRKLEELIDGYYIRWRGPPRTNDN- 725 
I :| I ::::| : ::: I :|: h Ml : 

141 RGQRGLA EVAVRMQEPTVLGPRTLQVSWTVDGPVQLVQGFRVSWRIAGLDQGSW 694 

'26 QYVNVTSPSTENYWSNLMPFINYEFFV-IPYHSGVHSIHGAPSNSMDVLTAEAPPSLPP 784 

::MI :M: I I : I : h III: I II II 
695 TMLDLQSPHRQSTVLRGLPPGAQIQIRVQVQGQEGL GAESPFVTRSIPEEAPSGPP 750 

'85 EDVRIim--LNLTTLRISWKfllPKADGINGiLKGFQIVIVGQAPNNNRNITTNERMSVTL 842 

M : : ::: MM lh: :|| :| M : I III 
51 QGVAVALGGDRNSSVTVSWEPPLPSQRNGVITEYQIKCLGNESRFHLNRSAAGWARSVTF 810 

843 FHLVTGMTYKIRVAARSNGGVGVSHGTSEVIMN QDTLEKHLAAQQEN 889 

h I h III :: Mlh :: h: M : II 

811 SGLLPGQIYRALVAAATSAGVGVA--SAPVLVQLPFPPAAEPGPEVSEGLAERLAKVLRK 868 

890 ESFLyGLINKSHVPVIVIVAILIIFWIIIAYCYWRNSRNSDGKDRSFIRINDGSVHMAS 949 

:ll I ::: :l I IM Ml 
869 PAFLAGSSAACG ALLLGFCAALYRRQKQRRELSH YTAS 906 



Qy 950 NNLWDVAQNPNQNPMYNTAGRMTMNNRNGQALY SLTPNAQDFFNNCDDYS 999 

I: : :: I I III :|:||: :| 

Db 907 FAYTPAVSFPHSEGLSGSSSRPPMG--LGPAAYPWLADSWPHPPRSPSAQEPRGSC-— 960 

Qy 1000 GTMHRPGSEHHYHY AQLTG'GP GNAMSTFYG — NQYH 1033 

I : ,h I :l II I : Ihl 

Db 961 -CPSNPDPDDRYYNEAGISLYLAQTARGANASGEGPVYSTIDPVGEELQTFHGGFPQHSS 1019 

Qy 1034 DDPSPYATTTLVLSNQQPAWL NDKMLRAPA-MPT- • - -NPVPPEPPARYAD 1079 

III :: II hi I lh M lh 

Db 1020 GDPSTWS— ---QYAPPEWSEGDSGARGGQGRLLGRPVQMPSLSWPEALPPPPPSCELS 1073 

Qy 1080 HTAGRRSRSSRASDGRGTLNGGLHHRTSGSQRSDSPPHTDVSYVQLHSSDGTGSSRERTG 1139 

I :ll II : h I I :h 

Db 1074 CPEGPEEELKGSSD LEEWCPPVPEKSH- -LVGSSSSGACMVAPA 1115 

Qy 1140 ERRTPP -NKTLMDFIPPPPSNPPPPGG- -HVYDTATRRQLNRG STPREDT 1186 

III :: Mill'!::!! | | : 

Db 1116 PRDTPSPTSSYGQQSTATLTPSPPDPPQPPTDIPHLHQMPRRVPLGPSSPLSVSQPALSS 1175 

Qy 1187 YDS - -VSDGAFARVDVNARPT SRNRNLGGR- - -PLKGRRDDDSQRSSLMM 1231 

: II : :M: I I : I III! :: : 

Db 1176 HDGRPVGLGAGPVLSYHASPSPVPSTASSAPGRTRQVTGEMTPPLHGHRARIRRRPRAL- 1234 

Qy 1232 DDDGGSSEADGENSEGDVP 1250 
hi Ihl 

Db 1235 PYRREHSPGDLP 1246 



RESULT 7 
158164 

BIG-1 protein - rat 

C; Species: Rattus norvegicus (Norway rat) 

C;Date: 26-M-1996 #sequence_revision 26-M-1996 #text_change 24-Sep-1999
C;Accession: 158164

R;Yoshihara, Y.; Kawasaki, M.; Tani, A.; Tamada, A.; Nagata, S.; Kagamiyama, H.; Mori
Neuron 13, 415-426, 1994

A;Title: BIG-1: a new TAG-1/F3-related member of the immunoglobulin superfamily with
A;Reference number: 158164; MUID: 94338697
A;Accession: 158164

A; Status: preliminary; translated from GB/EMBL/DDBJ 
A;Molecule type: mRNA 
A; Residues: 1-1028 <RES> 

A;Cross-references: EMBL:O11031; NID:g563132; PIDN:AAA63607.1; PID:g563133 
C; Genetics: 
A; Gene: BIG-1 

C;Superfamily: contactin; fibronectin type III repeat homology; immunoglobulin homology



Query Match 9.3%; Score 635; DB 2; Length 1028; 

Best Local Similarity 24.8%; Pred. No. 5. Be- 27; 

Matches 247; Conservative 130; Mismatches 404; Indels 216; Gaps 

< 

Qy 30 PVIIEHPIDVWSRGS— PATLNCGAKPS-TARITWYRDGQPVITNREQVNSHRIVLDT 85 

II :: I :■: II III! h : : I :| : I: : III: 
Db 26 PVFVREPSNSIFPVGSEDRKITLNCEARGNPSPHYRWQLNGSDIDTSLD- - - -HRYKLNG 81 

Qy 86 GSLFLLRVNSGRNGRDSDAGAYYCVASNEHGEVRSNEGSLKLAMLREDFRVRPRT-VQAL 144 

M :: I : : M I : I I h I I M I I M I hh I h I 
Db 82 GNLIVINPN RNWDTGS YQCFATNSLGT IVSREARLQFAYL - ENFKSRMRSRVSVR 135 

Qy 145 GGEMAVLECSPPRGFPEPWSWRKDDRELRI-QDMPRYTLHSDGNLIIDPVDRSDSGTYQ 203 

h II I 'II I :| :: : :| h hi I h II II 
Db 136 EGQGWLLCGPPPHSGELSYAWVFNEYPSFVEEDSRRFVSQETGHLYIARVEPSDVGNYT 195 

Qy 204 CVANNMV- -GSRVSNPARLSVFE RPRFE - QEPRDMT VDVGAAVLFDCRVTGDP 253 

II M ; : M I : M llh: h I :| hi 

Db 196 CWTSTVTNASVLGSPTPLVLRSDGVMGEYEPKIELQFPETLPAARGSTVRLECFALGNP 255 

Qy 254 QPQITWRRRN-EPMPVTRAYIAKDNRGLRIERVQPSDEGEYVCYARNPAGTLEASAHLRV 312 

III hi : I I I: M I I I I I I II I I I I I 
Db 256 VPQINWRRSDGMPFP-TKIRLRRFNGVLEIPNFQQEDTGSYECIAENSRGKNVARGRLTY 314 
Qy 313 QAPPSFQTKPADQSVPAGGTMFECTLVGQPSPAYFWSKEGQQDLLFPSYVSADGRTKVS 372

II: I : :ll 1:1 hi I I I :| : I :: 
Db 315 YAKPYWVQLLKDVETAVEDSLYWECRASGKPKPSYRWLKNGDALVL EERIQIE 367 



Qy 373 PTGTLTIEEVRQVDEGAYVCAGMNSAGSSLSKAALKATFETKGRVQKKKSKMGKQKQKNV 432 

MM : I I : I I I I I II : II : 
Db 368 -NGALTIANLNVSDSGMFQCIAENKHGLIYSSAELKVLASAPDFSRNPMKKM I 419 

Qy 433 QSIIKYLISAVTGNTPARPPPTIEHGHQNQTIiMVGSSAILPCQASGRPTPGISWLRDGLP 492 

I : III II I: I I :h : I 

Db 420 Q VQVGSLVILDCKPSASPR-ALSFWKKGDT 448 

Qy 493 IDITDSRISQHSTGSLHIADLKKPDTGVYTCIAKNEDGESTWSASLTVEDHT SN 546

: :IH : I I I :: I I |:|||||:|: |:: : I I : I II 
Db 449 WREQARISLLNS5GLKIMNVTKADAGIYTCIAENQFGKMGTTQLWTEPTRIILAPSN 508 

Qy 547 AQFVRMP DPSNF 558 

: : :| I hi 

Db 509 MDVAVGESIILPCQVQHDPLLDIMFAWYFNGTLTDFKKDGSHFEKVGGSSSGDLMIRNIQ 568 

Qy 559 PSSPTQPIIVNVTDTEVELHWNAPSTSGAGP 589
: :||| :| I I I
Db 569 LKHSGKYVCMVQTGVDSVSSAAELIVRGSPGPPENVKVDEITDTTAQLSW-TEGTDSHSP 627

Qy 590 ITGYIIQYYSPDLGQTWFN---IPDYV-ASTEYRIKGLKPSHSYMFVIRAENEKGIGIP 644 

: I :| :| I I :|: : : : I I | | : | |: | | | 
Db 628 VISYAVQARTP-FSVGWQNVRTVPEAIDGKTRTATWELNPWVEYEFRWASNKIGGGEP 686 

Qy 645 SVSSALVTTSKPAAQVALSDKNKMDMAIAEKRLTSEQLIKLEEVKTINSTAVRLFWKKRK 704 

I : I I I : I : I ! I : : || :| : 
Db 687 SLPSEKVRTEEAAPEVAPSEVS GGGGSRSELVITWDPVP 725 

Qy 705 LEELID — GYYIKWRGPPRTNDNQYVNVBSPSTENYWSN- -LMPFTNYEFFVIPYHS 758 

III : II : :l III I III I II I ::||: II I h 
Db 726 •EELQNGGGFGYWAFRPLGVTTWIQTV-VTISPDNPRYVFRNESIVPFSPYEVKVGVYN- 782 

Qy 759 GVHSIHGAPSNSMDVLTAEAPPSLPPEDVRIRMLNLTTLRISWKAPKADGINGILKGFQI 818 

: I I I :|| I:: I : h : : :|| II I |::: 

Db 783 "NKGEGPFSPVTTVFSAEEEPTVAPSHISAHSLSSSEIEVSWNTIPWKSSNGRLLGYEV 840 

Qy 819 VI - - VGQAPNNNRNITTNERAASVTLFHLVTGMTYKIRVAARSNGGVGVSHGT 869 

I : I I I : : I I I : |j| I 

Db 841 RYWNNGGEEESSSKVKVAGNQ1SAVLRGLKSMYTAMYNTAGAGPFSATVNATTKK 900 

Qy 870 SEVIMNQDTLEKHLAAQQ — EN! 

I: I : I :| II 
Db 901 TPPSQPPGNWWNATDTKVLLNWEQVRALEN1 SEVTG 937



RESULT 8
150600

neogenin - chicken (fragment)
C;Species: Gallus gallus (chicken)

C;Date: 13-Sep-1996 #sequence_revision 13-Sep-1996 #text_change 21-Jul-2000
C;Accession: 150600

R;Vielmetter, J.; Kayyem, J.F.; Roman, J.M.; Dreyer, W.J.
J. Cell Biol. 127, 2009-2020, 1994
A;Title: Neogenin, an avian cell surface protein expressed during terminal neuronal diff
A;Reference number: A55193; MUID: 95105243
A;Accession: 150600

A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: mRNA
A;Residues: 1-1443 <VIE>

A;Cross-references: EMBL:U07644; NID:g641965; PIDN:AAC59662.1; PID:g641966



Query Match 9.2%; Score 632.5; DB 2; Length 1443;

Best Local Similarity 21.1%; Pred. No. 1.3e-26;

Matches 323; Conservative 199; Mismatches 553; Indels 453; Gaps 64;

Qy 36 PIDWVSRGSPATLNCGAKPST-AKITWYKDGQPVITNKEQVNSHRIVLDTGSLFLLKVN 94 

hi:: lh :ll : I II I III : I : I :| III : I 



Db 25 PMDILSVRGASVIMNCSSYCETPPKIEWKKDG--TLLNLVS-DDRRQLLPDGSLLINSW 81 

Qy 95 SGRNGRDSDAGAYYCVASNEH - GEVRSNEGSLKLAMLREDFRVRPRTVQALGGEMAVLEC 153 

1:1 III III: I I : I I Ml I :| I |:| I 

Db 82 HSKHNK-PDEGYYQCVATVESLGSIVSRTAKLTVAGLPR-FTSQPELSSVYKGNSAILNC 139 

Qy 154 SPPRGFPEPVVSWRKDDKELRIQDMPRYTLHSDGNLIIDPVDRSDSGTYQCVANNMVGER 213 

I I I :l : I : I I I hi :| I hll : : 

Db 140 EVNVDL-APFVRWEQDRQPLSLDD--RVFKLPSGALLIGNATDTDGGFYRCVIESGGTPK 196 



Qy 214 VSNPARLSVFEKPK FEQEPKDMTVDVGAAVLFDCRVTGDPQPQITWRRKNEPMP 267 

Db 197 YSEEMLKILPDPEEPQSLVFVRQPSSLTKVTGQNA^ ' ' ' g' ' 256 

Qy 268 V-TRAYIAKDNRGLRIERVQPSDEGEYVCYARNPAGTLEASAHLRVQAPPSFQTKPADQ 325 

: : :; III I I I I I I hll I I II II I :lh 

Db 257 TEDSERFALRAGGSLLISDVTEEDVGTYTCIADNENETIEAQAELAVQVPPEFLKRPANI 316 

Qy 326 SVPAGGTATFECTLVGQPSPAYFWSREGQQDLLFPSYVSADGRTRVSPTGTLTIEEVRQV 385 

III : I : I : I I I I h: II h I : : : 

Db 317 YAHESMDIVFECEVTGRPTPTVRWVRNG- -DWIPS DYFKIVREHNLQVLGLVKS 369 

Qy 386 DEGAYVCAGMNSAGSSLSKAAL KAT 410 

Ml I I I h: : I I II 

Db 370 DEGFYQCIAENDVGNAQAGAQLIILDLDVAIPTLPPTSLTSATNDHLAPATTGPLPTAPR 429 

Qy 411 FETKGRVQKKR — SKMGRQRQKNVQS 434 

III:::: h I : I :|: 

Db 430 DWATLVSTRFIRLTWRTPVSDPQGDNLTYSIFYTREGINRERVENTSRPG-ETQVMIQN 488 

Qy 435 IIR— YLISAVTGNTPARPPPTIEHGHQNQTLMVGSSAILPCQASGRPTPGISWLRDGL 491 

:: I: I I :||| : :• : I I I I I I 

Db 489 LMPETVYVFRWAQN RHGHGESSAPLKVATQPEVQLPG-PAPNIR-AYAGS 537 

Qy 492 PIDIT— DSRIS QHSTGSLHIADLKRPDTGV YTCIA 525 

I :| :: :l I I : :| I lh : :| 

Db 538 PISVTVTWETPLSGNGEIQNYRLYYMERGQDSEQDVDVAGLSYTITGLKKYTEYSFRWA 597 

Qy 526 KNEDGESTWSASLTVEDHISNAQFVRMPDPSNFPSSPTQPIIVNVTDTE-VELHWN-APS 583 

hi : I | : I: ||: | : : ::; : ||| |: 

Db 598 YNRHGPGV-- 1 STQDVWRTLSDVPSAAPQNLTLEARNSRSIMLHWQPPPA 645 

Qy 584 TSGAGPITGYIIQY 597 

: :l llll-hl 

Db 646 GTHSGQITGYKIRYRKVSRRSDVTESVGGTQLFQLIEGLERGTEYNFRIAAMTVNGTGPA 705 

Qy 598 YSPDLGQTWFHIPDYVASTE 617 

: II :: :|: :| 

Db 706 TDWVSAETFESDLDES - -RVPEVPSSLHVRPLVT S I WSWTPPENQNI WRG YAIG YG IG 763 

Qy 618 YRIRGLKPSHSYMFVIRAENEKGIGTPSVSSALVTTSRPAAQVAL 662 

'< I h I II h ::M III lh ::| 

Db 764 SPHAQTIRVDYKQRYYT1ENLDPSSHYVITLRAFNNVGEGIPLYESAV- - -TRPH 815 

Qy 663 SDKNKMDMAIAERRLT SEQLIRL • EEYKT INSTAVRLFWRKRKL - - - EELIDG - - 711 

II :::|: :•■ I I : : : :: :|: I I ::: I 

Db 816 SDTSEVDLFVINAPYTPVPDPSPMMPPVGVQASILSHDTIRITWADNSLPRNQKITDARY 875 

Qy 712 YYIKWRGPPRTN--DNQYVNVTSPSTENYWSNLMPFTNYEFFVI---PYHSGVHSI-H 764 

I ::| I : :| :|:|: I I I III h I hi 

Db 876 YTVRW — KTNIPANTKYKTANATTLSYLVTGLKPNTLYEFSVMVTKGRRSSTWSMTAH 931 

Qy 765 GAPSNSMDVLTAEAPPSLPPEDVRI - -RMLNLTTLRISWRAPRADGINGILKGFQIVIVG 822 

I i| I h Ihll : : h ::h I II : h h 

Db 932 GT TFELVPTSPPKDVTWSKEGKPRT IIVNWQPPSE- -ANGKITGY - IIYYS 980 

Qy 823 QAPNNNRNITTNERAASVTLFHLVTGMT — YRIRVAARSNGGVG 864 

I : " I I I : :l I :: lh: hi 

Db 981 TDVNAEIHDWVIEPWGNRLTHQIQELTLDTPYYFKIQARNSKGMGPMSEAVQFRTPRAE 1040 

Qy 865 VSHGTSEVIMNQDTLEKHLAAQQENESFLYGLINKSHVPVIVIVAIL 911 

h I : I : : I I : : :|| I :: 

Db 1041 SSDKMPNDQASGSAGKGSRPVDVGPDYRPPLSGSNSPHGSPTSPLDSNMLLVIIVSVGVI 1100 
Qy 912 1 1 FWI I IAYCYWRNSRNSDGKDRSFIKINDGSVHMASN NLW-- 



I :hl:l I : 



I I: I 



Db 1101 TIVIWIVAVFCTRRTTSHQKKKRAACKSVN ISHKYKGNSKDVKPPDLWIHHERLELKPI 1160 



Qy 956 AQNPNQNPMYNTAGRMTMNNRNGQALYSLTP 
::|: II: I II I :|| 
Db 1161 DKSPDPNPIMTD TPIPRNSQ — 1 



Qy 1014 AQLTGGPGNAMSTFYGNQYHDDPSPYATTTLvLSNQQPAWLNDKMLRAPAMPTNPVPPEP 1073 

:IM I : : II: II : Ihl 

Db 1205 SE DSMSTLAGRR f GMRPKMM — MPFDSQPPQP 1233 



Qy 1074 PARYADHTAGRRSRSSRASDGRGTLNGGLHHRTSGSQRSDSPPHTD' 



II: I II I 

Db 1234 VISAHPIHSLDNPHHHFHSGSLASPTRSY • ■ 

• 1120 VSYVQLHSSDGT- -G 
: II I 
1290 NTPSSDTMPASSSQPCADHQDPDSSSGAYLG 



Qy 1157 PSNPPPPGGHVYD-TATRRQLNRGSTPREDT T)- 

I I II I I I 

Db 1349 - 



■ -AVPAAGSAYDPTLPSTPLLTQQAPSHPV ISVKTASIGTLGR' 

Qy 1214 RPLKGKRDDDSQRSSLMMDDDGGSSEAD 12 

I: lh: |::| I I I 
Db 1394 MPVWPSAPDVQETTRMLEDSESSYEPD 14 



RESULT 9 
A53449 

plasmacytoma-associated neuronal glycopro 
C;Species: Mus musculus (house mouse) 
C;Date: 25-Aug-1995 #sequence_revision 25 
C;Accession: A53449 

R;Connelly, M.A.; Grady, R.C.; Mushinski, 
Proc. Natl. Acad. Sci. U.S.A. 91, 1337-13 
A;Title: PANG, a gene encoding a neuronal 
A;Reference number: A53449; MUID:94151325 
A;Accession: A53449 
A;Status: preliminary 
A;Molecule type: mRNA 
A;Residues: 1-1028 <CON> 
A;Cross-references: GB:L01991; NID:g20005 

C;Superfamily: contactin; fibronectin type 
C;Keywords: glycoprotein 

Query Match 9.0%; Score 

Best Local Similarity 24.9%; Pred. 
Matches 244; Conservative 135; Mis 



Qy 30 PVIIEHP— IDVWSRGSPATLNCGAKPS- 

' II I: I I I I Mil I: : 
Db 26 



5 PVFIKEPSNSIEJJVDSEDKKITLNCEARGNP JPHYRWQLNGSDIDTSLD- 
Qy 86 GSLFLLKVNSGKNGRDSDAGAYYCVASNEHG3VKSNEGSLKLAMLREDFRVRPR-TVQAL 144 



Db 



1:1 :: I 
82 GNLIVINPN-- 



: I 1:1 I hi I 



I I 



:|| 



■NAQDFFNNCDDYSGTMHRPGSEHHYHY 1013 
1:1 :|: : : I 

JNSMD SMHQRRNSYRGHE 1204 



III: I I hi 
■LHHQVSPWPVGTSMSHSDRANSTESVR 1289 

ISKERTGERRTP PNKTLMDFIPPP 1156 

1:1 hill 
IAQEEDAAQSLPTAHVRPSHPLKSFAVP- 1348 



I - - SVSDGAFARVDVNARPTSRNRNLGG 1213 

III II 

: — TRPP 1393 



ein PANG - mouse 

Aug-1995 #text_change 24-Sep-1999 

J.F.; Marcu, K.B. 
1, 1994 

glycoprotein, is ectopically activated by inti 



PIDN:AAA17403.1; PID:g200057 
III repeat homology; immunoglobulin homology 



.6; DB 2; Length 1028; 
\3e-26; 

es 421; Indels 178; Gaps 35; 



PAKITWYKDGQPVITNKEQVNSHRIVLDT 85 
I :| : I: : II I: 

HRYKLNG 81 



II I: I I hh I I II 



RNWDTGSYQCFATNSLGTIVSREAKLQFAYL-ENFKIRMRSTVSVR 135 



Qy 145 GGEMAVLECSPPRGFPEPWSWRKDDKELRI-QDMPRYTLHSDGNLIIDPVDRSDSGTYQ 203 

1:11111 I :l :: : :| I hi I I: II I I 
Db 136 EGQGWLLCGPPPHSGELSYAWVFNEYPSFVEEDSRRLVSQETGHLYIAKVEPSDVGNYT 195 

Qy 204 CVANNMVGER- -VSNPARLSVFE KPKFE-QEPKDMTVDVGAAVLFDCRVTGDP 253 

II : I : :| I : :|| I I I: : I: I :l hi 

Db 196 CWTSTVTNTRVLGSPTPLVLRSDGVMGEYEPKIEVQFPETLPAARGSTVRLECFALGNP 255 

Qy 254 QPQITWKRKN-EPMPVTRAYIAKDNRGLRIERVQPSDEGEYVCYARNPAGTLEASAHLRV 312 

III 1:1 : I I : I I I I: I I I I Ml I I 
Db 256 VPQINWRRSDGMPFP-NKIKLRKFNGMLEIQRFQQEDTGSYEGIAENSRGKNVARGRLTY 314 



313 QAPPSFQTKPADQSVPAGGTATFECTLVGQPSPAYFWSKEGQQDLLFPSYVSADGRTKVS 372 

II: I : : :|| hi 1:1 I I I :| : I :: 
315 YAKPYWLQLLRDVEIAVEDSLYWECRASGKPKPSYRWLKNGDALVL EERIQIE 367 

373 PTGTLTIEEVRQVDEGAYVCAGMNSAGSSLSKAALKAT 410 

I III : 11:1 II I I II 
368 -NGALTITNLNVTDSGMFQCIAENKHGLIYSSAELKWASAPDFSRNPMKKMVQVQVGSL 426 

411 -FETKGRVQ KKKSKMGKQKQK NVQSIIK YLISAV — 443 

: I I III:::: : :: I I :| 

427 VILDCKPRASPRALSFWKKGDMMVREQARVSFLNDGGLKIMNVTKADAGTYTCTAENQFG 486 

444 — TGNTPAKPPPTIEHGHQNQTLMVGSSAILPCQASGKPTPGI--SWLRDGLPIDITD 497 

I : II I : II I lllll I I :| :l I 
487 KANGTTHLWIEPTRIILAPSNMDVAVGESVILPCQVQHDPLLDIMFAWYFNGALTDFKK 546 

498 SRISQHSIGSLHIADLKKPDTGVYTCIAKNEDGESTWSASLTVEDHTSNAQFVRM 552 

:: hi I I ::: :| I h : : :| I I 
547 DGSHFEKVGGSSSGDLMIRNIQLKHSGKYVCMVQTGVDSVSSAAELIVR 595 

553 PDPSNFPSSPTQPIIVNVTDTEVELHWNAPSTSGAGPITGYIIQYYSP-DLG-QTWFNIP 61C 

I I : :lll :| I I |: I :| :| :| h :l 
596 — GSPGPPENVKVDEITDTTAQLSW-TEGTDSHSPVISYAVQARTPFSVGWQSVRTVP 55 C 

611 DYVASTEY--RIKGLKPSHSYMFVIRAENEKGIGTPSVSSALVTTSKPAAQVALSDKNKM 668 

: : : •: I I I I I I I : I I 1 1 : I I I : I : : M : : 
651 EVIDGKTHTATWELNPWVEYEFRIVASNKIGGGEPSLPSEKVRIEEAAPEIAPSEVS" 708 

669 DMAIAEKRLTSEQLIKLEEVKTINSTAVRLFWKKRKLEELID— -GYYIKWRGPPRTND 724 

II :l : I III : II : :| I 

709 — -GGGGSRSELVITWDPVP EELQNGGGFGYWAFRPLGVTTW 748 

725 NQYVNVTSPSTENYWSN--LMPFTNYEFFVIPYHSGVHSIHGAPSNSM3VLTAEAPPSL 782 

I I llll II I ::lh II I 1= : I I I :|| I- 
749 IQT V - VT SPDNPR YVFRNES I VPF SP YEVK VGVYN - - -NKGEGPFSPVTTVFSAEEEPTV 804 

783 PPEDVRIRMLNLTILRISWKAPKADGINGILKGFQIVI - -VGQAPNNNRNITTNERAASV 840 

I : I': : : :|| II I |::: I "I : I 

805 APSHISAHSLSSSEIEVSWNTIPWKLSNGHLLGYEVRYWNNGGEEESSRKVKVAGNQISA 864 

841 TLFHLVTGMTYKIRVAARSNGGVGVSHGT SEVIMNQD 877 

I I : : I I I :: I I I ::|::l : 

865 VLRGLKSNLAYYTAVRAYNSAGAGPFSAIVNAITKKTPPSQPPGNWWNATDTKVLLNWE 924 

878 TLEKHLAAQQENESFLYG 895 

:: llll : I 
925 QVK AMENESEVTG 937 



RESULT 10 

T13822 

frazzled gene protein - fruit fly (Drosophila melanogaster) 
C;Species: Drosophila melanogaster 
C;Date: 20-Sep-1999 #sequence_revision 20-Sep-1999 #text_change 11-May-2000 
C;Accession: T13822 
R;Kolodziej, P.A.; Timpe, L; Mitchell, K.J.; Goodman, C.S.; Fried, S.; Jan, L.Y.; Ja 
Cell 87, 197-204, 1996 
A;Title: Frazzled encodes a Drosophila member of the DCC immunoglobulin subfamily and 
A;Reference number: Z17780; MUID:97015076 
A;Accession: T13822 
A;Status: preliminary; translated from GB/EMBL/DDBJ 
A;Molecule type: mRNA 
A;Residues: 1-1375 <SOL> 
A;Cross-references: EMBL:U71001; NID:g1621114; PID:g1621115; PIDN:AAC47314.1 
C;Genetics: 
A;Gene: frazzled 
A;Map position: 2 4 
C;Function: 
A;Description: may function in vivo as a receptor or component of a receptor mediatin 

Query Match 8.8%; Score 607; DB 2; Length 1375; 
Best Local Similarity 21.6%; Pred. No. 3e-25; 
Matches 304; Conservative 182; Mismatches 



Indels 434; Gaps 60; 



Qy 36 PIDWVSRGSPATLNC *-"GAK PST AKITWY - KDGQPVITNKEQVNSHRI 81 

Mill II ill II: I I III :: : : I 
Db 41 PQDAWPEGHSVLLQCAGTASIGRGGKSKSNLPSSVSIRWRGPDGQDLVIVGD— TFRT 97 

82 VLDTGSLFLLKVNSGKNGKDSDAGAYYCVASNEH-GEVKSNEGSLKLAM- - " LREDFRVR 137 

III:: I : j III h : I I : I : : I :|| 
8 QLKNGSLYISSVEENR — GflTGAYQCLLTAEGVGSILSRPALVAIVRQPDLNQDF- • - 150 



Qy 
Db 

Qy 
Db 
Qy 
Db 

Qy 

Db 

1 Qy 

Db 

Qy 

Db 

I 

Db 

Qy 
Db 

Qy 

Db 
Qy 

Db 

Qy 

Db 

Qy 
Db 



138 PRTVQALGGEMAVLECS • PPRj 

11:1 I 
151 -LETYLLPGQTAYFRCMLGEJ 



'PEPV- ■ -VSWRKDDRELRIQDMPRYTLHSDGNLIIDP 193 
II I I III I : I I : :| I II 
JEGVKHSVQWLKDDLPLPL-DRLRMWLPNGALEIDE 208 



194 VDRSDSGTYQCVANNMVGERVSNPARLSVFE KPKFEQEPRDMTVDVGAAV 243 

I II MM : |:|: |:: : I I I II I I 

209 VGPSDRGSYQCNVTSGSSSRLSSRTNLNIRKPSDPGVENSVAPSFLVGPSPKTVREGDTV 268 
I 

244 LFDCRVTGDPQPQITWRRKNEPMPVTRAYIAKDNR GLRIERVQPSDEGEYVCY 296 

II I Ml II 1 : : |:| |:| : I I I I 

269 TLDCVANGVPKPQ I KWLRNGMDLD — FNDLDSRFS I VGTGSLQ ISSAEDIDSGNYQCR 324 

297 ARNPAGTLEASAHLRVQAPPSFQTKPADQSVPAGGTATFECTLVGQPSPAYFWSKEGQQD 356 

I I :hl I ::ll III I I : =1 : |:| I I I I I 

3;25 ASNTVDSLDAQATVQVQEPPKFIRAPKDTTAHEKDEPELKCDIWGKPKPVIRWLKNG- -D 382 

3*57 LLFPS - YVS - ADGRTKVSPTGTLT IEEVRQVDEGAYVCAGMNSAGSSLSKAAL 407 

I: I: I: II I I : I I : I I Mil : I I 

383 LITPNDYMQLVDGH NLKILGLLNSDAGMFQCVGTNAAGSVHAAARLRWPQGD 435 

408 KATFET 413 

I: I I 

436 SPEQDPSVPHPGGKPLDSGLQARLPSQPRDLVAQIVKSRFVTLSWVEPLQNAGDWYYTV 495 

414 KGRVQKKKSKMGKQRQKNVQSIIKYLISAVTGNTPAKPPPTIEHGHQNQTLMVG 467 

I II :| :| Mh: I I : : I 

496 YYRMNNSEREQRMVTKSHDDQQVNIQSLL PGRTYQFRVEANTNFGS 541 

468 SSAILPCQASGKPTPGIS WLRDGLPI DITDSRI SQHST 505 

:: I : I :| I: : I I :|; I |:: : 

542 GASSAPLEVSTQPEVNIAGPPRNFEGYARSHKEIYVKWEEPTVTNGEILRYRVYYSENDS 601 

506 GSLHIADLKKPDTG VYTCIAKNEDGESTWSASLTVEDHTSNAQFVR 551 

I III I I:: hi 11:1: :l 
602 G — ADLYHDSTALEAVITELRPHTDYVISWPFNRNGMGDSSAEIRVKTFSST 652 

552 MPDPSNFPSSPTQPI IVNVT • DTEVELHWNAPSTSGA- GPITGY I IQYY — SPDLGQT 605 

III : Ml : : M |: I III |:| :| : I 
653 PSEPPNNVTLEVTSSSSITVHWEPPAEEDRNGQITGYKIRYRKFKDAPQVKST 705 

606 WFNIPDYVASTEYRI 620 

:| : : : II:: 

706 PANIRYFELSNLDRNAEYQVKIAAMTVNGSGPFTEWNRANTLENDLDETQVPGKPIWISI 765 

621 RGIRPSHSYMF 631 

J I: : I: 

766 HPGANNIALHWGPPQHPEIKIRNYVLGWGRGIPDENTIELKETERYHILRNLESNMDYVV 825 

632 VIRAENERGIGTKVSSALVTTSKPAAQVALSDKNKMDMAIAERRLTSEQLIKLEEVKTI 691 

:|| I II IT : :l : :::::: |: 
826 SLRARNVKGDGPPIYDNIKTRDEEP VDAPTPLEVPVGLRAI TM 868 

692 NSTAVRLFW- - 'KKRKLEELIDG-YYIKWRGPPRTNDNQYVNVTSPSTENYWSNLMPFT 747 

:|::: ::|. MM :| I :| :| I I I ::::| I I 
869 SSSSIWYWIDTMLNKNQHVTDNRHYTVSYGITGSNRYRYHNTTD---LNCMINDLRPNT 925 

748 NYEFFVIPYHSGVHSIHG- -APSNSMDVL- -TAEAPPSLPPEDVRIRM* -LNLTTLRISW 801 

III I : I I II II I : I II :| M :| I: : I 

926 QYEF AVKWKGRRESSWSMSVLNSTYQNVPVTPPREVTVRLDEMNPPTVIVQW 978 

802 RAPRADGINGILKGFQIVIVGQAPNNNRNITTNERAASVTLF— HLVTGMTYRIRVAAR 858 

II I : I: I :|: : I h :| II :| II 

979 IPPK--HTLGQITGYNIYYTTDTTKRDRDWSVEAFAGEETMLMLPNLRPYTTYYFRVQAR 1036 



Qy 859 SNGGVG VSHGTSEVIMNQ--DTLEKHLAAQQENESFLYGLINKSHVPVIVIVA 909 

: I II: II : I IIM : :|| II hi I 

Db 1037 TTRGANNAPFSALVSYTTSAAVTMQEPDT IAKGI — DNERLLY IIIAA 1082 

Qy 910 ILIIFWI"'"IIAYCYWRNSRNSDGRDRSFIRINDGSVHMASNNLWDVAQKPNQHPMYN 966 

:|: I : : : :|: I I I I :: 

Db 1083 TAWLLWLLGVLLLCRRRPQSSPEHTKKSYQRNNVGV PRPPDLWI 1128 

Qy 967 TAGRMTMNNRNGQALYSLTPNAQDFFNNCDDYSGTMHRPGSEHHYHYAQLTGGPGNAMST 1026 

:| M : : h-ll I 1 1 : I I I I I 1 1 : : : 

Db 1129 HHDQMELKNID ■ KGLHTVTPVCSDGASS SGALTLPRSWHSEYEVET PVPGHVTNS 1183 

Qy 1027 FYGNQYHDDPSPYATTTLVLSNQQPAWLNDKMLRAPAMPTNPVPPEPPARYADHTAGRRS 1086 

I III: :| I I IM I: 
Db 1184 LDKRSY— VPGYMTTS MNGTMER PQYP RT 1210 



Qy 1087 RSSRASDGRGTLNGGLHHRTSGSQRSDS PPHT — DVS YVQL - HSSDGTGSSRER 1137 

: I : I: II :: Ml II I :: ::: I I 
Db 1211 QYSHQtfRSHMTMEAGLSQQSLTQPQSNSMAQTPEHPYGGYDANFCNAGNAAAGNGCVSTI 1270 

Qy 1138 TGERRT PPNKTLMDF IPPPPSNPPPPGG 1165 

:| 1,111 III II 
Db 1271 ESSKRGHP---LRSFSVP— GPPPTGG 1292 



RESULT 11 
T08851 

Down syndrome cell adhesion protein 1 - human (fragment) 
N; Alternate names: Down syndrome cell adhesion molecule 
C; Species: Homo sapiens (man) 

C;Date: 11-Jun-1999 #sequence_revision 11-Jun-1999 #text_change 11-Jun-1999 
C;Accession: T08851 
R;Yamakawa, K.; Huo, Y.K.; Haendel, M.A.; Hubert, R.; Chen, X.N.; Lyons, G.E.; Ko 
submitted to the EMBL Data Library, September 1997 
A;Description: DSCAM; A novel member of the immunoglobulin superfamily maps in a cr 
A;Reference number: Z16495 
A;Accession: T08851 
A;Status: preliminary; translated from GB/EMBL/DDBJ 
A;Molecule type: mRNA 
A;Residues: 1-1896 <YAM> 
A;Cross-references: EMBL:AF023449; NID:g3169765; PID:g3169766 
A;Experimental source: brain; developmental stage: 14 weeks; fetal 
C;Genetics: 
A;Gene: DSCAM 
A;Map position: 21q22 
A;Note: derived from alternately-spliced mRNA 
C;Function: 
A;Description: involved in nervous system development 
C;Keywords: alternative splicing 

Query Match 8.7%; Score 598; DB 2; Length 1896; 

Best Local Similarity 25.2%; Pred. No. 1.4e-24; 

Matches 230; Conservative 131; Mismatches 365; Indels 188; Gaps 41; 
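
The "Best Local Similarity" percentages printed in these result headers appear to equal Matches divided by the sum of Matches, Conservative, Mismatches and Indels. This relation is inferred from the printed figures, not taken from GenCore documentation, so the Python sketch below is only a consistency check. The helper names `parse_counts` and `local_similarity` are hypothetical.

```python
import re


def parse_counts(line: str) -> dict[str, int]:
    """Extract 'Name value;' pairs from an alignment statistics line."""
    return {name: int(value)
            for name, value in re.findall(r"([A-Za-z]+)\s+(\d+);", line)}


def local_similarity(counts: dict[str, int]) -> float:
    """Percentage of matches over all alignment columns.

    Assumption: the column total is Matches + Conservative +
    Mismatches + Indels (the Gaps count is not included).
    """
    total = sum(counts[key] for key in
                ("Matches", "Conservative", "Mismatches", "Indels"))
    return 100.0 * counts["Matches"] / total


# Statistics line of RESULT 11 (T08851), copied from the report.
line = "Matches 230; Conservative 131; Mismatches 365; Indels 188; Gaps 41;"
counts = parse_counts(line)
print(f"{local_similarity(counts):.1f}%")
```

For RESULT 11 this gives 230 / 914, which rounds to the 25.2% shown in the report. Where a count is missing from the scan, as in RESULT 10, the check cannot be run.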

Qy 16 YI NFDK I PNAS NLAPVI I E HP I DVWSRGSPATLNCGAKPS -TAKITWYKDG 66 

:: II: :l-: Ml Mil I :| I I : III I 

Db 372 FVRKDKL-SAQDYVQWLEDGTPKIISAFSEKWSPAEPVSLMCNVKGTPLPTITWTLDD 430 

Qy 67 QPVITNREQVNSHRI— VLDTGSLF-LLKVNSGKNGKDSDAGAYYCVASNEHGEVRSNE 122 

M llll ; M I M : I I I I |:| I I 
Db 431 DPILKG— -GSHRISQMITSEGNWSYLKISS— SQVRDGGVYRCTANNSAGW---- 479 

Qy 123 GSLRLAMLREDFR — VRP-RTVQALGGEMAVLECSPPRGFPEPWSWRRDDRELRIQD 177 
IIM : 1 1 : : I : I M |:| M I: I 

Db 480 -— LYQARINVRGPASIRPMKNITAIAGRDTYIHCR-VIGYPYYSIKWYRNSNLLPFNH 534 

Qy 178 MPRYTLHSDGHLI IDPVDRS ■ DSGT YQCVANNMVGERVSN PARLSVFEKP 226 

: ::|'| : |: I I I I Ml ::| I : II I 

Db 535 R-QVAFENNGTLKLSDVQKEVDEGEYTC--NVLVQPQLSTSQSVHVTVRVPPFIQPFEFP 591 

Qy 227 RFEQEPKDMTVDVGAAVLFDC-RVTGDPQPQITWKRRNEPMPVTRAYIAKDN RGLR 281 




Mon Jan 22 13:04:34 2001 us-09-540-245a-17.rpr Page 9 



: I I |:|| III:: |:| : : II II 
■-SIGQRVFIPCVVVSGDLPITITWQKDGRPIPGSLG-VTIDNIDFTSSLR 641 



Qy 282 IERVQPSDEGEYVCYARNPAGTLEASAHLRVQAPPSFQTKPADQSVPAGGTATFECTLVG 341 

I : M I III I :| : I h II I :l II I I: I 
Db 642 ISNLSLMHNGNY1CIARNEAAAVEHQSQLIVRVPPKFWQPRDQDGIYGKAVIINCSAEG 701 

Qy 342 QPSPAYFW--SREGQQDLLFPSYVSADGRTRVSPTGTLTIEEVRQVDEGAYVCAGMNSAG 399 

Mill I :: :ll :l |:| |: I : II hi I I 

Db 702 YPVPTIVWKPSKGAGVPQFQP- - IALNGRIOVLSNGSLLIKHWEEDSGYYLCKVSNDVG 759 

Qy 400 SSLSKAALKATFETKGRVQKKKSKMGKQKQKNVQSIIKYLISAVTGNTPAKPPPTIEHGH 459 

: :||: II : I : : 

Db 760 ADVSKS MYLTVKI PAMITSY 779 



I 

Db 

Qy 



460 QNQTLMV-GSSAILPCQASGRPTPGISWLRDGLPID ITDSRISQHSTGSLHIA 511 

I II I : I I I: : I :: I: :: : : :| I ■ 
780 PNTTLATQGQKKEMSCTAHGEKPIIVRWEKEDRIINPEMARYLVSTKEVGEEVISTLQIL 839 

512 DLKKPDTGVYTCIMNEDGESTWSASLTVEDHTSNAQFVRMPDPSNFPSSPTQPIIVNVT 571 

: I: ::! I I II III:: I I : I :| 

1 840 PTVREDSGFFSCHAINSYGEDRGIIQLTVQE PPDPPEIEIKDVK 883 

572 DTEVELHWNAP$SGAGPITGYIIQYYSPDLGQTWFNI PDYVASTEYRIKGLK 624 

: I I I Ml I: : :| : I ::| I : 

884 ART ITLRWTM • GFDGNS PITG YD IE - -CKNRSDSTOSAQRTKDVSPQLNSAT - - -IIDIH 937 

625 PSHSYMFVIRAENEKGIGTPSVSSALVTTSKPAAQVALSDKNKMDMAIAEKRLTSEQLIK 684 

II :| : 1:1 II I: I I: II I : 

938 PSSTYSIRMYAKNR- - IGKSEPSNELTITADEAAPDG PPQEVH 97B 

685 LEEVKTINSTAVRLFWK-RRKLEE-LIDGYYIKWRGPPRTNDNQYVNV— -TSPSTEN 737 

II hi ::h II h h :| II I :| II h II :| 
979 LE— PISSQSIRVTWKAPKKHLQNGIIRGYQIGYR-EYSTGGNFQFNIISVDTSGDSEV 1034 



Qy 738 YWSNLMPFTNYEFFVIPYHSGVflSIHGAPSNSMDVLTA-EAPPSLPPEDVRIRMLNLT 795 

I = II II I I : I :| :::| I II M I : I : : 

Db 1035 YTLDNLNRFTQYGLWQACNRA GTGPSSQEIITTTLEDVPSYPPENVQAIATSPE 1089 

Qy 796 TLRISWRAPKADGINGILKGFQIVIVGQAPNNN — RNITTNERAASVTLFHLVTGMTY 851 

:: III : :llll:||::: : Hlfl : h I I I 
Db 1090 SISISWSTLSKEALNGILQGFRVIYWANLMDGELGEIKNITTTQ-PSLELDGLEKYTNY 1147 

Qy 852 KIRVAARSNGGVGV 865 

1:1 I : I II 

^b 1148 SIQVLAFTRAGDGV 1161 > 



RESULT 12 
JU0094 

F11 protein precursor - chicken 
C;Species: Gallus gallus (chicken) 
C;Date: 07-Jun-1990 #sequence_revision 07-Jun-1990 #text_change 31-Jan-2000 
C;Accession: JU0094 
R;Bruemmendorf, T.; Wolff, J.M.; Frank, R.; Rathjen, F.G. 
Neuron 2, 1351-1361, 1989 
A;Title: Neural cell recognition molecule F11; homology with fibronectin type III and in 
A;Reference number: JU0094; MUID:90180453 
A;Accession: JU0094 
A;Molecule type: mRNA 
A;Residues: 1-1010 <BRU> 
A;Cross-references: GB:X14877; NID:g1708784; PIDN:CAA33018.1; PID:g63385 
A;Note: the carboxy-end hydrophobic stretch is compatible with the consensus motif for 
A;Note: F11 comprises six domains related to the immunoglobulin domain type C and four 

CAM 

C;Comment: F11 is a chick neural cell surface-associated glycoprotein implicated in neui 
C;Superfamily: contactin; fibronectin type III repeat homology; immunoglobulin homology 
C;Keywords: blocked carboxyl end; glycoprotein; lipoprotein; phosphatidylinositol linkaq 
F;1-20/Domain: signal sequence #status predicted <SIG> 
F;21-984/Product: protein F11 #status predicted <MAT> 
F;247-303/Domain: immunoglobulin homology <IMM> 
F;985-1010/Domain: carboxyl-terminal propeptide #status predicted <CTP> 
F;200,249,329,448,464,485,512,582,621,924/Binding site: carbohydrate (Asn) (covalent) k 
F;984/Modified site: GPI-anchor ethanolamine amidated carboxyl end (Ser) (in cat'. 

Query Match 8.7%; Score 597; DB 2; Length 1010; 

Best Local Similarity 23.6%; Pred. No. 6.7e-25; 

Matches 268; Conservative 140; Mismatches 364; Indels 364; Gaps 

Qy 6 FYHTHTHTHTYI NFDKIPNASNLAPVIIEHPIDVW- - -SRGSPATLNCGAKP 55 

I: :| I :| :| : I II I III : I ::M h 
Db 3 FFISHLVTLCFIFCVADSTHFSEEGN-KGYGPVFEEQPIDTIYPEESSDGQVSMNCRAR- 60 

Qy 56 STAKnWYKDGQPVITNKEQVNSHRIVL-DTGSLFLLKVNSGKNGKDSDAGAYYCVASN 113 

' .1 II ::h MM::: | ||| | || 
Db 61 AVPFPTYKWKLNNWDIDLTKDRYSMVGGRLVISNPEKSRDAGKYVCWSN 110 

Qy 114 EHGEVKSNEGSLKLAML REDFRVRPRTVQALGGEMAVLECSPPRGFPEPV-VSWR 167 

I 1:1:1 ::l I Ml: I III I II :|: : I 

Db 111 IFGTVRSSEATLSFGYLDPFPPEEHYEVKVRE GVGAVLLCEPPYHYPDDLSYRWL 165 

Qy 168 KDDKELRIQ - DMPRYTLHSDGNLIIDPVDRSDSGTYQCVANNMVGERVSNPA - RLSVFEK 225 

:: : I i h -III I Ml I I IIM III I 

Db 166 LNEFPVFIALDRRRFVSQTNGNLYIANVEASDKGNYSCF VSSPSITKSVFSK 217 

Qy 226 PKFEQE — PKDMTVD VGAAVLFDCRVTGDPQPQITWKRRNEPMPVT- 269 

I: :: I IM :l I :| hi h: I : Ml I 

Db 218 FIPLIPQADRAKVYPADIKVKFKDTYALLGQNVTLECFALGNPVPELRWSKYLEPMPATA 277 

Qy 270 ': RAYI ARD— 276 

I h II 
Db 278 EISMSGAVLKIFNIQYEDEGLYECEAENYKGRDKHQARVYVQASPEWVEHINDTEKDIGS 337 

Qy 277 : NRG-LRIERVQPSDEGEYVCYARNPAGTLEASAHL 310 

: III: : I II I II I : I : I i 
Db 338 DLYWPCVATGKPIPTIRWLRNGVSFRKGELRIQGLTFEDAGMYQCIAENAHGIIYANAEL 397 

Qy 311 RVQA-PPSFQIKPADQSVPA-GGTATFECTLVGQPSPAYFWSKEGQQDLLFPSYVSADG 367 

:: I 11:1:'. I : M II II I M III I M: I 
Db 398 KIVASPPTFELNPMRKRILAAKGGRVIIECKPKAAPKPKFSWSK-GTELLVNGS 450 

Qy 368 RTRVSPTGTLTIEEVRQVDEGAYVCAGMNSAGSSLSRAALRATFETRGRVQRRKSKMGKQ 427 

I : 'hi' I I Ml || |M M h I I: 
Db 451 RIHIWDDGSLEIINVTRLDEGRYTCFAENNRGKANSTGVLEMTEATR 497 

Qy 428 RQRNVQSIIKYLISAVTGNTPARPPPTIEHGHQNQTLMVGSSAILPCQASGRPTPGIS-- 485 

' I I : II :| M II II :: 

Db 498 ITLAPLNVDVTVGENATMQCIASHDPTLDLTFI 530 

Qy 486 WLRDGLPIDIT DSRISQHSTGSLHIADLKKPDTGVYTCIAKNEDGESTWSASLT 539 

I :| IM : : | | | |::: I III h h II I 
Db 531 WSLNGFVIDFEREHEHYERNVMIKSNGELLIRNVQLRHAGRYTCTAQTIVDNSSASADLV 590 

Qy 540 VEDHTSNAQFVRMPDPSNFPSS PTQPII VNVTDT EVELHWNAPSTSGAGPITG Y I IQ YYS 599 

I! I II I : II I I I: I II: III 
Db 591 VRGP PGPPGGIRIEEIRDTAVALTWSR-GTDNHSPISKYTIQ-SR 633 

Qy 600 PDLGQTWFNIPDYVASTE YRIRGLRPSHSYMFVIRAENERGIGTPSVSSA 649 

Ml:'' Ml I: I I II I II I I II: I 

Db 634 TFLSEEWRD----AKTEPSDIEGNMESARVIDLIPWMEYEFRIIATNTLGTGEPSMPSQ 688 

Qy 650 LVTTSRPAAQVALSD KNKMDMAIAER RL 677 

: I MM I :| I h 

Db 689 RIRTEGAPPNVAPSDVGGGGGSNRELTITWMPLSREYHYGNNFGYIVAFRPFGEREWRRV 748 

Qy 678 T - SEQLIRLE EV 688 

Db 749 TVTNPEIGRYVHKDESMPPSTQYQVRVRAFNSRGDGPFSLTAVIYSAQDAPTEVPTDVSV 808 



689 KTINSTAVRLFWKRRRLEELIDGYYIK-WRGPPRTNDNQYVNVTSPSTENY--WSNLMP 745 

I ::h : : I h ::ll IM : III Ml MM 
809 KVLS S SEI SVSW - HHVTEKSVEGYQ IRYWAAHDKEAAAQRVQV - - - SNQEYSTKLENLKP 864 

746 FTNYEFFVIPYHSGVHSIHGAPSNSMDVLTAEAPPSLPPEDVRIRMLNLTTLR I 799 

I I I M : :| II ::|::| Mill I II ::::! I 






Db 865 NTRYHIDVSAFNS - - - AGYGPPSRTIDI ITRKAPPSQRP- - -RI — ISSVRSGSRYII 914 

Qy 800 SWKAPKADGINGILKGFQIVIVGQAPNNNRNITTNERAASVTLFHLVTGMTYKIRVAARS 859 

:| II ::|:::: : : :| : | : : I : I ! I 
Db 915 TWDHVKAMSNESAVEGYKVLYRPDGQHEGRLFSTGKHTIEVP— VPSDGEYWEVRAHS 971 

Qy 860 NGGVG VSHGTSEVIMNQDTLEKHLMQQENESFLYGLINKSHVPVIVIVA 909 

III :l hi : I ||: :| : ::| 

Db 972 EGGDGEVAQ I K I SGAT AGV PTLLLGLV — LPALGVLA 1006 



RESULT 13 
S05944 

neuronal cell surface protein F3 precursor - mouse 
C;Species: Mus musculus (house mouse) 
C;Date: 01-Dec-1989 #sequence_revision 01-Dec-1989 #text_change 21-Jan-2000 
C;Accession: S05944 
R;Gennarini, G.; Cibelli, G.; Rougon, G.; Mattei, M.G.; Goridis, C. 
J. Cell Biol. 109, 775-788, 1989 
A;Title: The mouse neuronal cell surface protein F3: a phosphatidylinositol-anchored mem 
A;Reference number: S05944; MUID:89340657 
A;Accession: S05944 
A;Molecule type: mRNA 
A;Residues: 1-1020 <GEN> 
A;Cross-references: EMBL:X14943; NID:g50937; PIDN:CAA33075.1; PID:g50938 
C;Genetics: 
A;Map position: 15F 
C;Superfamily: contactin; fibronectin type III repeat homology; immunoglobulin homology 
F;1-20/Domain: signal sequence #status predicted <SIG> 
F;256-312/Domain: immunoglobulin homology <IMM> 



Query Match 8.7%; Score 595; DB 2; Length 1020; 

Best Local Similarity 24.0%; Pred. No. 8.7e-25; 

Matches 263; Conservative 127; Mismatches 381; Indels 324; Gaps 43; 



4 LGFYHTH-THTHTYINFDKIPNASNLAPVIIEHPIDVWSRGS—PATLNCGAKPSTAK 59 

II = I : I II I: I II: : I :||| h I 

19 LGDFTWHRRYGHGVSEEDK GFGPIFEEQPINTIYPEESLEGKVSLNCRARASPFP 73 

60 ITWYK — DGQPVITNKEQVNSHRIVLDTGSLFLLKVNSGKNGKDSDAGAYYCVASNEH 115 

: II :l :l I : |:| : I ||| |||:||| : 

74 V- -YKWRMNNGDVDLTN DRYSMVGGNLVI NNPDKQRDAGVYYCLASNNY, 120 



116 GEVKSNEGSLKLAMLREDFRVRPR-TVQALGGEMAVLECSPPRGFPEPV-VSWRKDDKEI 

I hi I =1 1:1 I I: I: II I II II: : I : 
121 GMVRSTEATLSFGYL-DPFPPEERPEVKVKEGKGMVLLCDPPYHFPDDLSYRWLLNEFP^ 



Qy 174 RI - QDMPRYTLHSDGNLI IDPVDRSDSGTYQCVANNMVGERVSNPA" RLSVFER j 225 

* III: ::||| I |: II II I l|:|: III I 

m> 180 FITMDKRRFVSQTNGNLYIANVESSDRGNYSCF VSSPSITKSVFSKFIPLIP 231 

Qy 226 -PKFEQEP KDMTVDVGAAVLFDCRVTGDPQPQITWRRRNEPMPVTRAYIAK 275 

I: H II: :l I :l hi I I I:: III! I I I: 

Db 232 IPERTTKPYPADIWQFKDIYTMMGQNVTLECFALGNPVPDIRWRKVLEPMPST-AEIST 290 

Qy 276 DNRGLRIERVQPSDEGEYVCYARNPAGTLEASAHLRVQAPPSFQTRPADQSVPAGGTATF 335 

1:1 :| III I I I I I : I : III I : III : 
Db 291 SGAVLKIFNIQLEDEGLYECEAENIRGKDKHQARIYVQAFPEWVEHINDTEVDIGSDLYW 350 

Qy 336 ECTLVGQPSPAYFWSKEGQQDLLFPSYVSADGRTKVSPTGTLTIEEVRQVDEGAYVCAGM 395 

I 1:1 I II I II I I : :| : I I I 

Db 351 PCIATGRPIPTIRWLKNGY SY HKGELRLYDVTFENAGMYQCIAE 394 

Qy 396 NSAGSSLSKAALR ATFETKGRVQKRKSRMGKQKQKNVQSIIKYLISAVTGNTPAK 450 

I: II : I II III I |:| :::| 
Db 395 NAYGSIYANAELRILALAPTFE MNPMKKK ILAA 427 

Qy 451 PPPTIEHGHQNQTLMVGSSAILPCQASGKPTPGISWLRDGLPIDITDSRISQHSTGSLHI 510 

I I: I: I I II : I : III III I 
Db 428 KGGRVIIECKPRAAPRPRFSWSR-GTEWLVNSSRILIWEDGSLEI 471 



Qy, 511 ADLRKPDTGVYTCIARNEDGESTWSASLTVEDHT-- 



•-SNAQFVRMPD 554 



:: : I |:||| hi |:: : :| : : I : | | 

Db 472 NNITRNDGGIYTCFAENNRGKANSTGTLVITNPTRIILAPINADITVGENATMQCAASFD 531 

Qy 555 PS NF 558 

h ' II 

Db 532 PALDLTFVWSFNGYVIDFNKEITHIHYQRNFMLDANGELLIRNAQLKHAGRYTCTAQTIV 591 

Qy 559 PSS PTQPI IVNVTDTEVELHWNAPSTSGAGPITGY I IQ Y YSPDLGQ 604 

'.II I :: I I I |: I : : ||: I || 
Db 592 DNSSASADLWRGPPGPPGGLRIEDIRATSVALTWSRGSDNHS-PISRYTIQ-TKTILSD 649 

Qy 605 TWFNI--PDYVASTEYRIKG-LKPSHSYMFVIRAENEKGIGTPSVSSALVTTSKPAAQ 659 

I : I ; : I II I I : I I I I 1 1 : I : I I 
Db 650 DWKDARTDPPIIEGNMESARAVDLIPWMEYEFRWATNTLGTGEPSIPSNRIKTDGAAPK 709 

Qy 660 VALSD ! - KNKMDMAIAERRLTSEQLIKLE 686 

II II ; I III I: h 

Db 710 VAPSDVGGGGGTNRELTITWAPLSREYHYGNNFGYIVAFRPFDGEEWRKVTVTNPDTGRY 769 

Qy 687 • EVRTINSTAVRL 698 

S II ::h : : 

Db 770 VHKDETMTPSTAFQVKVKAFNNKGDGPYSLVAVINSAQDAPSEAPTEVGVKVLSSSEISV 829 

Qy 699 FWKKRKLEELIDGYYIR-WRGPPRTNDNQYVNVTSPSTEKYV--VSNLMPFTNYEFFVIP 755 

I I Ih::: I h I I : I III : I : Ihl I I I 
Db 830 HW-KHVLERIVESYQIRYWAGHDKEAAAHRVQVTS— QEYSARLENLLPDTQYFIEVGA 885 

Qy 756 YHSGVHSIHGAPSNSMDVLTAEAPPSLPPEDVRIRMLNLTTLR ISWRAPRADGI 809 

:| : I h :: I Hill II II ::::| hi I 
Db 886 CNS— AGCGPSSDVIETFTRRAPPSQPP— RI — -ISSVRSGSRYIITWDHWALSN 935 

Qy 810 NGILRGFQIVIVGQAPNNNRNITTHERAASVTLFHLVTGMTYRIRVAARSNGGVGVSHGT 869 

: h:h-; :: : :|:: : I : I : I I hll II 
Db 936 ESTVTGYKILV'RPDGQHDGKLFSTHRHSIEVP- ' -IPRDGEYWEVRAHSDGGDGV* - -V 989 

Qy 870 SEV-LMNQDTLEKHL 883 

hi I II I 
Db 990 SQVKISGVSTLSSSL 1004 



RESULT 14 
A57112 

contactin precursor - rat 
C;Species: Rattus norvegicus (Norway rat) 
C;Date: 03-Nov-1995 #sequence_revision 03-Nov-1995 #text_change 21-Jan-2000 
C;Accession: A57112 
R;Peles, E.; Nativ, M.; Campbell, P.L.; Sakurai, T.; Martinez, R.; Lev, S.; Clary, D. 
Cell 82, 251-260, 1995 
A;Title: The carbonic anhydrase domain of receptor tyrosine phosphatase beta is a fun 
A;Reference number: A57112; MUID:95354206 
A;Accession: A57112 
A;Status: preliminary; nucleic acid sequence not shown; not compared with conceptual 
A;Molecule type: mRNA 
A;Residues: 1-1021 <PEL> 
C;Superfamily: contactin; fibronectin type III repeat homology; immunoglobulin homolo 
C;Keywords: membrane; protein; phosphatidylinositol linkage 
F;256-312/Domain: immunoglobulin homology <IMM> 

Query Match 8.7%; Score 595; DB 2; Length 1021; 

Best Local Similarity 23.8%; Pred. No. 8.7e-25; 

Matches 257; Conservative 125; Mismatches 371; Indels 326; Gaps 

Qy 4 LGFYHTH-THTHTYINFDRIPNASNLAPVIIEHPIDVWSRGS— PATLNCGAKPSTAK 59 

II : I :• I II hi Ih : I Mil h I 

Db 19 LGEFTWHRRYGHGVSEEDR GFGPIFEEQPINTIYPEESLEGKVSLNCRARASPFP 73 

Qy 60 ITWYK- --DGQPVITNREQVNSHRIVLDTGSLFLLRVNSGRNGRDSDAGAYYCVASNEH 115 

: II :i :ll I : hi : I III llhlll : 

Db 74 V--YRWRMNNGDVDLTN DRYSMVGGNLVI NNPDRQKDAGIYYCLASNNY 120 



116 GEVKSNEGSLKLAML- 

I hi I :|- I 



■ -REDFRVRPRTVQALGGEMAVLECSPPRGFPEPV-VSWRKDD 17 0 

II II h h II I II Ih : I :: 
Db 121 GMVRSTEATLSFGYLDPFPPED- • -RPE-VKVKEGKGMVLLCDPPYHFPDDLSYRWLLNE 176 



.71 KELRI-ODMPRYTLHSDGHLIIDPVDRSDSGTYQCVANNMVGERVSNPA-RLSVFEK— 225 

: I I I: ::IH I I: II I I I ' I hi: III 
.77 FPVFITMDKRRFVSQTNGNLYIANVESSDRGNYSCF VSSPSITKSVFSKFIP 228 

26 PKFEQEP KDMTVDVGAAVLFDCRVTGDPQPQITWKRKNEPMPVTRAY 272 

: :| II: :| I :| hi I I !•:: MM I I 

29 LIPIPERITRPYPADIWQFKDIYTMMGQNVTLECFALGNPVPDIRWRKVLEPMPTT-AE 287 

!73 IAKDNRGLRIERVQPSDEGEYVCYARNPAGTLEASAHLRVQAPPSFQTKPADQSVPAGGT 332 

h hi :| III I I I I I : I : III I : '! I M 
!88 ISTSGAVLKIFNIQLEDEGLYECEAENIRGKDKHQARIYVOAFPEWVEHINDTEVDIGSD 347 

\ 

133 ATFECTLVGQPSPAYFWSKEGQQDLLFPSYVSADGRTKVSPTGTLTIEEVRQVDEGAYVC 392 

: I hi I I I I II :f:l : I I I 

148 LYWPCVATGKPIPTIRWLKNGY AYHKGELRLYDVTFENAGMYQC 391 

:93 AGMNSAGSSLSKAALK ATFETKGRVQKKKSKMGKQKQKNVQsllKYLISAVTGNT 447 

h h : I II III I hi |, :::| 
:92 IAENAYGTIYANAELKILALAPTFE MNPMKKK-— |— ILAA 427 

48 PAKPPPT1EHGHONQTLMVGSSAILPCQASGKPTPGISWLRDGLPIDITDSRISQHSTGS 507 

I h h I I II : I : III II 
■28 KGGRVIIECKPKAAPKPRFSWSK-GTEWLVNSSRILIWEDGS 468 

LHIADLKKPDTGVYTCIAKNEDGESTWSASLTVEDHT SNAQFVR 551 

I I : I hill hi h: : :| : : I : I 

■69 LEINNITRNDGGIYTCFAENNRGKANSTGTLVITNPTRIILAPINADITVGENATMQCAA 528 

152 MPDPS NF 558 

III II 
i29 SFDPSLDLTFVWSFNGYVIDFNKEITHIHYQRNFMLDAKGELLIRNAQLKHAGRYTCTAQ 588 

PSSPTQPIIVNVTDTEVELHWNAPSTSGAGPITGYIIQYYSPD 601 

II I I I I: I : : II: I I 

i89 IIVDNSSASADLWRGPPGPPGGLREDIRATSVALTWSRGSDNHS-PISKYTIQ-TKTI 646 

102 LGQTWFNI ■ - - PDYVASTEYRIKG - -LKPSHS YMFVIRAENEKGIGTPSVSSALVTTSKP 656 

II: I : I I I I I : I I I I II: I : I 
i47 LSDDWKDAKIDPPIIEGNMESAKAVDLIPWMEYEFRWATNTLGTGEPSIPSNRIKTDGA 706 

157 AAQVALSD KNKMDMAIAEKRLTSEQLIKLE 686 

I II II I :l I h I: 

'07 APNVAPSDVGGGGGTNRELTITWAPLSREYHYGNNFGYIVAFKPFDGEEWKKVTVTNPDT 766 

EVKTINSTA 695 

II ::h 

'67 GRYVHKDETMTPSTAFQVKVKAFNNKGDGPYSLIAVINSAQDAPSEAPTEVGVKVLSSSE 826 

VRLFWKKRKLEELIDGYYIK-WRGPPRINDNQYVNVTSPSTENYV- -VSNLMPFTNYEFF 752 
: : I I II:::: I h I I : I III : I : Ihl I I 
827 ISVlffl-KHVLEKIVESYQIRYWAGHDKEAAAHRVQVTS— QEYSARLENLLPDTQYFIE 882 



Qy 753 VIPYHSGVHSIHGAPSNSMDVLTAEAPPSLPPEDVRIRMLNLTTLR ISWKAPKA 806 

I :| : I -I: :: I Mill II II ::::| hi I 
Db 883 YGACNS- - - AGCG PSSDVIETFT RKAPPSQPP — RI — ISSVRSGSRYIITWDHWA 932 

Qy 807 DGINGILKGFQIVIVGQAPNNNRNITTNERAASVTLFHLVTGMTYKIRVAARSNGGVGV 865 

: h:h :: : :|:: : I : I : I I hll II 
Db .933 LSNESTVTGYKILYRPDGQHDGKLFSTHKHSIEVP ■ ■ ■ IPRDGEYWEVRAHSDGGDGV 988 



RESULT 15 
S01998 

contactin precursor - chicken 
N; Alternate names: 130K glycoprotein 
C;Species: Gallus gallus (chicken) 

C;Date: 30-Sep-1989 #sequence_revision 30-Sep-1989 #text_change 21-Jan-2000 
C;Accession: S01998 
R;Ranscht, B.; Dours, M.T. 
J. Cell Biol. 107, 1561-1573, 1988 
A;Title: Sequence of contactin, a 130-kD glycoprotein concentrated in areas of interneud 
A;Reference number: S01998; MUID:89008597 
A;Accession: S01998 
A;Molecule type: mRNA 
A;Residues: 1-1091 <RAN> 
A;Cross-references: EMBL:Y00813; NID:g63328; PIDN:CAA68753.1; PID:g63329 
A;Note: part of this sequence, including the amino end of the mature protein, was con 
C;Superfamily: contactin; fibronectin type III repeat homology; immunoglobulin homolo 
C;Keywords: glycoprotein; transmembrane protein 
F;1-20/Domain: signal sequence #status predicted <SIG> 
F;21-1091/Product: contactin #status predicted <MAT> 
F;21-982/Domain: extracellular #status predicted <EXT> 
F;247-303/Domain: immunoglobulin homology <IMM> 
F;983-1002/Domain: transmembrane #status predicted <TMM> 
F;1003-1091/Domain: intracellular #status predicted <INT> 

Query Match 8.7%; Score 593.5; DB 2; Length 1091; 

Best Local Similarity 23.9%; Pred. No. 1.2e-24; 

Matches 259; Conservative 133; Mismatches 356; Indels 337; Gaps 44; 

Qy 6 FYHTHTHTHTYI NFDKIPNASNLAPVIIEHPIDVW- - -SRGSPATLNCGAKP 55 

I; M I ■! :l : I II I III : I ::|| h 
Db 3 FFISHLVTLCFIFCVADSIHFSEEGN-KGYGPVFEEQPIDTIYPEESSDGQVSMNCRAR- 60 

Qy 56 STAKITWYKDGQPVITNKEQVNSHRIVL—DTGSLFLLKVNSGKNGKDSDAGAYYCVASN 113 

'III ::h MM::: Mill II II 
Db 61 AVPFPTYKWKLNNWDIDLTKDRYSMVGGRLVISNPEKSRDAGKYVCWSN 110 

Qy 114 EHGEVKSNEGSLKLAML REDFRVRPRTVQALGGEMAVLECSPPRGFPEPV-VSWR 1:7 

I hhl : I I : h I I III I II :h : I 
Db 111 IFGTVRSSEATLSFGYLDPFPPEEHYEVKVRE GVGAVLLCEPPYHYPDDLSYRWL 165 

Qy 168 KDDKELRIQ-DMPRYTLHSDGNLIIDPVDRSDSGTYQCVANNMVGERVSNPA-RLSVFEK 225 

:: : I I |: ::lll I h II I I I Ihh III I 

Db 166 LNEFPVFIALDRRRFVSQTHGNLYIANVEASDKGNYSCF VSSPSITKSVFSK 217 

Qy 226 PKFEQE — PKDMTVD VGAAVLFDCRVTGDPQPQITWKRKNEPMPVT ■ 269 

h ::■ Ihl :| I :| hi |:: I : llll I 

Db 218 FIPLIPQADRAKVYPADIKVKFKDTYALLGQNVTLECFALGNPVPELRWSKYLEPMPATA 277 

Qy 270 RAY I AKD— 276 

': II: II 

Db 278 EISMSGAVLKIFNIQYEDEGLYECEAENYRGKDKHQARVYVQASPEWVEHINDTEKDIGS 337 

Qy 277 : NRG-LRIERVQPSDEGEYVCYARNPAGTLEASAHL 310 

:| III: : I I I II II : hi I 
Db 338 DLYWPCVATGXPIPTIRWLKNGVSFRKGELRIQGLTFEDAGMYQCIAENAHGIIYANAEL 397 

Qy 311 RVQA- PPSFQTKPADQSVPA- -GGTATFECTLVGQPSPAYFWSKEGQQDLLFPSYVSADG 367 

:: I Ihh: I : : I II II I I : III I : |: I 
Db 398 KIVASPPTFELNPMKKKILAAKGGRVIIECKPKAAPKPKFSWSK-GTELLVNGS 450 

Qy 368 RTKVSPTGTLTIEEVRQVDEGAYVCAGMNSAGSSLSKAALKATFETKGRVQKKKSKMGKQ 427 

I : hi- I ::IH II h I : I h I h 
Db 451 RIHIWDDGSLEIINVTKLDEGRY1CFAENNRGKANSTGVLEMTEA1R 497 

Qy 428 KQKNVQSIIKYLISAVTGNTPAKPPPTIEHGHQNQTLMVGSSAILPCQASGKPTPGIS-- 485 

I I : II :| : Ml II :: 
Db 498 ■ ITLAPLNVDVTVGENATMQCIASHDPTLDLTFI 530 

Qy 485 WLRDGLPIDIT DSRISQHSTGSLHIADLKKPDTGVYTCIAKNEDGESTWSASLT 539 

I :| II ' : : I II I ::: I III |: h II I 
Db 531 WSLNGFVIDFEKEHEHYERNVMIKSNGELLIKNVQLRHAGRYTCTAQTIVDNSSASADLV 590 

Qy 540 VEDHTSNAQFVRMPDPSNFPSSPTQPIIVNVTDTEVELHWNAPSTSGAGPITGYIIQYYS 599 

III ' I I I MM I h I lh I II 
Db 591 VRGP PGPPGGIRIEEIRDTAVALTWSR-GTDNHSPISKYTIQ-SK 533 

Qy 600 PDLGQTWFNIPDYVASTE YRIKGLKPSHSYMFVIRAENEKGIGTPSVSSA 649 

IMM III hll II II I I III: I 

Db 634 TFLSEEWKD-;---AKTEPSDIEGNMESARVlDLIPWMEYEFRIIATNTLGTGEPSMPSQ 688 

Qy 650 LVTTSKPAAQVALSD KNKMDMAIAEK RL 677 
Mill! 1:1 I: 

689 RIRTEGAPPNVAPSDVGGGGGSNRELTITWMPLSREYHYGNNFGYIVAFRPFGEREWRRV 748 



678 T- 

I 

749 



■-SEQLIRLE-- 



TVTNPEIGRYVHKDESMPPSTQYQVKVKAFNi KGDGPFSLTAVIYSAQDAPTEVPTDVSV 



689 KTINSTAVRLFWKKRKLEELIDGYYIK' 
I ::|: : : I I: ::|| |: I 
809 KVLSSSEISVSW-HHVTEKSVEGYQIRYWAAtjDKEAMQRVQV- 



■WRGI PRTNDNQYVNVTSPSTEKY--WSNLMP 745 
II I I : I HI I 

SNQBYSTKLENLKP 864 



5 FT NYEFFVI PYHSGVHS IHGAPSNSMDVLTW APPSLPPEDVRIRMLNLTTLR- 



Oy 746 

II I ::| : :| II ::|::| 
Db 865 NTRYHIDVSAFNS • - - AG YGPPSRTIDI ITR^PPSQRP • 



Oy 800 SWKAPKADGI NG I lkgfqivivgqapnnnrn: TTNERAASVTLFHLVTGMTYKIRVAARS 859 

:| II ::|:::: : : :| : | : : I : I I 
Db 915 TWDHVKAMSNESAVEGYKVLYRPDGQHEGKLIJSTGKHTIEVP- - -VPSDGEYWEVRAHN 971 

Oy 860 NGGVG 864 
II I 
972 EGGDG 976 



Search completed: January 22, 2001, 12:26:10 
Job time: 2047 sec 



I 799 

III I II I 
'— RI— -ISSVRSGSRYII 914 
GenCore version 4.5 
Copyright (c) 1993 - 2000 Compugen Ltd. 

protein search, using sw model 

January 22, 2001, 12:27:59 ; Search time 162.41 Seconds 
(without alignments) 
257.899 Million cell updates/sec 

Title:          US-09-540-245A-17 
Perfect score:  5. GQCn 
Sequence:       1 MYYLGFYHTHTHTHTYINFD......TAQRFRSIPRNNGIVTQEQT 1297 

Scoring table:  BLOSUM62 
                Gapop 10.0 , Gapext 0.5 

Searched:       88757 seqs, 32294092 residues 

Total number of hits satisfying chosen parameters: 88757 

Minimum DB seq length: 0 
Maximum DB seq length: 2000000000 

Post-processing: Minimum Match 0% 
                 Maximum Match 100% 
                 Listing first 45 summaries 

Database :      SwissProt_39:* 

No.   Score   Query Match %   Length   ID           Description 
34    425.5    6.2            3707     PGBM_MOUSE   Q05793 mus musculu 
35    423.5    6.2             837     NCM2_HUMAN   O15394 homo sapien 
36    422      6.2             811     FS22_DROME   P34083 drosophila 
37    422      6.2            4393     PGBM_HUMAN   P98160 homo sapien 
38    409.5    6.0             837     NCM2_MOUSE   O35136 mus musculu 
39    407      5.9             873     FS21_DROME   P34082 drosophila 
40    398      5.8            1271     MYPC_CHICK   Q90688 gallus gall 
41    396      5.8            1913     KMLS_HUMAN   Q15746 homo sapien 
42    393      5.7             898     FAS2_SCHAM   P22648 schistocerc 
43    392.5    5.7            2481     UN52_CAEEL   Q06561 caenorhabdi 
44    381.5    5.6            1906     KMLS_CHICK   P11799 gallus gall 
45    370      5.4            1131     MYPF_CHICK   P16419 gallus gall 

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 



4 

Query 



NO. 


Score 


Match Length D 


3 ID 


Description 


CC 
CC 
CC 


1 


674 


9.8 1493 


NE01J0USE 


P97798 mus musculu 


2 


642 


9.4 1377 


NE01JAT 


P97603 rattus norv 


CC 


3 


639.5 


9.3 1461 


NEO1J0MAN 


Q92859 homo sapien 


CC 


4 


632.5 


9.2 1443 


NE01.CHICK 


Q90610 gallus gall 


CC 




598 


8.7 2012 


DSCAJOMAN 


060469 homo sapien 1 


CC 




597 


8.7 1010 


CONT..CHICK 


P14781 gallus gall 


CC 




595 


8.7 1020 


CONTJOUSE 


P12960 mus musculu 


• CC 




593 


8.6 1018 


CONTJUMAN 


012860 homo sapien 


CC 


9 


585 


8.5 1040 


AXOLRAT 


P22063 rattus norv 


CC 


10 


583.5 


8.5 1257 


CAMLJUMAN 


P32004 homo sapien 


CC 


11 


582.5 


8,5 1040 


AX01JUMAN 


Q02246 homo sapien 


CC 


12 


567.5 


8.3 1260 


CAMLJiOUSE 


P11627 mus musculu 


CC 


13 


562.5 


8.2 1036 


AXOl.CHICK 


P28685 gallus gall 


CC 


14 


561 


8.2 1284 


NRCA_CHICK 


P35331 gallus gall 


CC 


15 


557 


8,1 1259 


CAMLJAT 


Q05695 rattus norv 


CC 


16 


554 


8.1 1239 


NRG.DROME 


P20241 drosophila 


CC 


17 


554 


8.1 1447 


DCCJUMAN 


P43146 homo sapien 


CC 


18 


552 


8.0 1447 


DCCJOUSE 


P70211 mus musculu 


CC 


19 


526 


7.7 1897 


PTPFJUMAN 


P10586 homo sapien 


CC 


20 


509.5 


7.4 1912 


PTPDJOMAN 


P23468 homo sapien 


CC 


21 


467 


6.8 2029 


LARJ3ROME 


P16621 drosophila 


CC 


22 


462.5 


6.7 1092 


NCA2JENLA 


P36335 xenopus lae 


CC 


23 


459.5 


6.7 1088 


NCAljXENLA 


P16170 xenopus lae 


CC 


24 


455.5 


6.6 1091 


NCAl.CHICK 


P13590 gallus gall 


CC 


25 


455 


6.6 1266 


NGCA.CHICK 


Q03696 gallus gall 


CC 


26 


441.5 


6.4 1070 


PTK7JOMAN 


Q13308 homo sapien 


CC 


27 


435.5 


6.3 1115 


NCA1JOCSE 


P13595 mus musculu 


DR 


28 


433.5 


6.3 853 


NCA1JOVIN 


P31836 bos taurus 


DR 


29 


433 


6.3 858 


NCA1JAT 


P13596 rattus norv 


DR 


30 


432 


6.3 848 


NCA1J0MAN 


P13591 homo sapien 


DR 


31 


432 


6.3 1051 


PTK7_CHICK 


Q91048 gallus gall 


DR 


32 


429 


6.3 761 


NCA2JUMAN 


P13592 homo sapien 


DR 


33 


427.5 


6.2 725 


NCA2J0USE 


P13594 mus musculu 


DR 
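The throughput figure in the run header can be checked against the other header values. The sketch below assumes that a "cell update" is one cell of the Smith-Waterman scoring matrix, so the total work is the query length (1297) times the number of database residues searched (32294092). That reading is an assumption rather than something the report states, and the function name is illustrative only.

```python
def cell_updates_per_second(query_len: int, db_residues: int, seconds: float) -> float:
    """Dynamic-programming cells filled per second (assumed definition)."""
    if seconds <= 0:
        raise ValueError("search time must be positive")
    return query_len * db_residues / seconds


# Values taken from the run header: query length, residues searched, search time.
rate = cell_updates_per_second(1297, 32294092, 162.41)
print(f"{rate / 1e6:.3f} Million cell updates/sec")  # prints 257.899 Million cell updates/sec
```

Under this assumption the computed rate agrees with the 257.899 Million cell updates/sec printed in the header.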



RESULT 1
NEO1_MOUSE

ID   NEO1_MOUSE     STANDARD;      PRT;  1493 AA.
AC   P97798;
DT   01-OCT-2000 (Rel. 40, Created)
DT   01-OCT-2000 (Rel. 40, Last sequence update)
DT   01-OCT-2000 (Rel. 40, Last annotation update)
GN   NEO1 OR NGN.
OS   Mus musculus (Mouse).
OC   Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC   Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
RN   [1]
RP   SEQUENCE FROM N.A., AND ALTERNATIVE SPLICING.
RC   TISSUE=BRAIN;
RX   MEDLINE=97407661; PubMed=9264410;
RA   Keeling S.L., Gad J.M., Cooper H.M.;
RT   "Mouse neogenin, a DCC-like molecule, has four splice variants and is
RT   expressed widely in the adult mouse and during embryogenesis.";
RL   Oncogene 15:691-700(1997).
CC   -!- FUNCTION: MAY BE INVOLVED AS A REGULATORY PROTEIN IN THE
CC       TRANSITION OF UNDIFFERENTIATED PROLIFERATING CELLS TO THEIR
CC       DIFFERENTIATED STATE. MAY ALSO FUNCTION AS A CELL ADHESION
CC       MOLECULE IN A BROAD SPECTRUM OF EMBRYONIC AND ADULT TISSUES.
CC   -!- SUBCELLULAR LOCATION: TYPE I INTEGRAL MEMBRANE PROTEIN.
CC   -!- ALTERNATIVE PRODUCTS: AT LEAST 5 ISOFORMS; 1 (SHOWN HERE), 2, 3, 4
CC       AND 5; ARE PRODUCED BY ALTERNATIVE SPLICING. THE EXPRESSION OF
CC       ISOFORMS 3, 4 AND 5 ARE DEVELOPMENTALLY REGULATED.
CC   -!- TISSUE SPECIFICITY: WIDELY EXPRESSED.
CC   -!- DEVELOPMENTAL STAGE: EXPRESSED UBIQUITOUSLY THROUGHOUT THE MID TO
CC       LATE STAGES OF GESTATION AND IN ADULT TISSUES. STRONG EXPRESSION
CC       IS OBSERVED IN THE VENTRAL REGION OF THE VENTRICULAR ZONE OF THE
CC       E15.5 MOUSE NEURAL TUBE, AS WELL AS IN THE VENTRICULAR ZONES OF
CC       THE MESENCEPHALON AND RHOMBENCEPHALON. ISOFORMS 3 AND 4 ARE
CC       EXPRESSED AT HIGHER LEVEL COMPARED TO OTHER ISOFORMS BETWEEN E11.5
CC       AND E16.5.
CC   -!- SIMILARITY: CONTAINS 4 IMMUNOGLOBULIN-LIKE C2-TYPE DOMAINS.
CC   -!- SIMILARITY: CONTAINS 6 FIBRONECTIN TYPE III-LIKE DOMAINS.
CC   -!- SIMILARITY: BELONGS TO THE IMMUNOGLOBULIN SUPERFAMILY. STRONG, TO
CC       TUMOR SUPPRESSOR PROTEIN DCC.
CC
CC   This SWISS-PROT entry is copyright. It is produced through a collaboration
CC   between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC   the European Bioinformatics Institute. There are no restrictions on its
CC   use by non-profit institutions as long as its content is in no way
CC   modified and this statement is not removed. Usage by and for commercial
CC   entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC   or send an email to license@isb-sib.ch).
CC
DR   EMBL; Y09535; CAA70727.1; -.
DR   HSSP; P02751; 1TTG.
DR   MGD; MGI:1097159; NEO1.
DR   INTERPRO; IPR001777; -.
DR   INTERPRO; IPR003006; -.
DR   PFAM; PF00041; fn3; 6.
DR   PFAM; PF00047; ig; 4.
DR   PRINTS; PR00014; FNTYPEIII.
KW   Transmembrane; Immunoglobulin domain; Glycoprotein; Signal;
KW   Alternative splicing.
FT   SIGNAL          1     36       POTENTIAL.
FT   CHAIN          37   1493       NEOGENIN.
FT   DOMAIN         37   1136       EXTRACELLULAR (POTENTIAL).
FT   TRANSMEM     1137   1157       POTENTIAL.
FT   DOMAIN       1158   1493       CYTOPLASMIC (POTENTIAL).
FT   DOMAIN         78    147       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN        177    239       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN        274    338       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN        366    428       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN        467    564       FIBRONECTIN TYPE-III.
FT   DOMAIN        567    660       FIBRONECTIN TYPE-III.
FT   DOMAIN        661    760       FIBRONECTIN TYPE-III.
FT   DOMAIN        766    860       FIBRONECTIN TYPE-III.
FT   DOMAIN        881    981       FIBRONECTIN TYPE-III.
FT   DOMAIN        982   1083       FIBRONECTIN TYPE-III.
FT   DOMAIN       1149   1153       POLY-VAL.
FT   CARBOHYD       84     84       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD      221    221       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD      337    337       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD      501    501       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD      520    520       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD      670    670       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD      746    746       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD      940    940       N-LINKED (GLCNAC...) (POTENTIAL).
FT   VARSPLIC      442    461       MISSING (IN ISOFORM 2).
FT   VARSPLIC      863    878       MISSING (IN ISOFORM 3).
FT   VARSPLIC     1086   1096       MISSING (IN ISOFORM 4).
FT   VARSPLIC     1279   1331       MISSING (IN ISOFORM 5).
SQ   SEQUENCE   1493 AA;  163159 MW;  441DE919D5E17C0E CRC64;



Query Match 9.8%; Score 674; DB 1; Length 1493; 

Best Local Similarity 21.8%; Pred. No. 1.8e-29; 

Matches 334; Conservative 193; Mismatches 551; Indels 452; Gaps 64; 
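The "Best Local Similarity" percentage reported for this hit is numerically consistent with the match count divided by the total number of alignment columns (matches + conservative + mismatches + indels). The sketch below illustrates that relationship; it is inferred from the figures printed in this report, not taken from any GenCore documentation, and the function name is hypothetical.

```python
def best_local_similarity(matches: int, conservative: int,
                          mismatches: int, indels: int) -> float:
    """Percent of alignment columns that are exact matches (assumed definition)."""
    columns = matches + conservative + mismatches + indels
    if columns == 0:
        raise ValueError("alignment has no columns")
    return 100.0 * matches / columns


# Counts printed for this hit: Matches 334; Conservative 193;
# Mismatches 551; Indels 452.
print(f"{best_local_similarity(334, 193, 551, 452):.1f}%")  # prints 21.8%
```

With these counts the alignment spans 1530 columns, of which 334 are identical, which rounds to the reported 21.8%.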

Qy 36 PIDVWSRGSPATLNCGA--KPSTAKITWYKDGQPVITNKEQVNSHRIVLDTGSLFLLKV 93

1:1 : III III I :ll I I II I I : I :| llll: I 
Db 70 PVDTLSVRGSSV1LNCSAYSEPS-PNIEWKKDG--TFLNLES-DDRRQLLPDGSLFISNV 125 

Qy 94 NSGKNGKDSDAGAYYCVASNEH-GEVKSNEGSLKLAMLREDFRVRPRTVQALGGEMAVLE 152 

hi I I I III: :: I : I I :| I I :| I hi 
Db 126 VHSKHNK-PDEGFYQCVATVDNLGTIVSRTAKLTVAGLPR-FTSQPEPSSVYVGNSAILN 183 

Qy 153 CSPPRGFPEPWSWRKDDKELRIQDMPRYTLHSDGNLIIDPVDRSDSGTYQCVANNMVGE 212 

I I I I :: : I : I I I M II hi: : 

Db 184 CEVNADL-VPFVRWEQNRQPLLLDD--RIVKLPSGTLVISNATEGDGGLYRCIVESGGPP 240 

ft I 

Qy 213 RVSNPARLSVFEKPK FEQEPKDMTVDVGAAVLFDCRVTGDPQPQITWKRKNEPM 266

W : I: I I I : I: III I : : I I : I I I : I : I : 

Db 241 KFSDEAELKVIiQDPEEIVDLWLMRPSSMMKVTGQSAVLPCWSGLPAPWiMNEEVL 300 

Qy 267 PVTRA-YIAKDNRGLRIERVQPSDEGEYVCYARNPAGTLEASAHLRVQAPPSFQTKPAD 324 

: : III MUM 1:11 II II II I :|h 
Db 301 DTESSGRLVLLAGGCLEISDVTEDDAGTYFCIADNGNKTVEAQAELTVQVPPGFLKQPAN 360 

Qy 325 QSVPAGGTATFECTLVGQPSPAYFWSKEGQQDLLFPSYVSADGRTKVSPTGTLTIEEVRQ 384 

III : hhl III I:: II I: I : : : 

Db 361 IYAHESMDIVFECEVTGKPTPTVKWVKNG- -DWIPS DNFKIVKEHNLQVLGLVK 413 

Qy 385 VDEGAYVCAGMNSAGSSLSKAAL KATFETKGRVQKKKSKMGK 426 

III I I II:: : I I III : 

Db 414 SDEGFYQCIAENDVGNAQAGAQLIILEHDVAIPTLPPTSLTSAT - -TDHLAPATTGPLPS 471 

Qy 427 QKQKNVQS I IKYLISAVTGNTP AKPP PTIEHGHQN — 461 

: I I:: :| III I I : I 

Db 472 APRDWASLVSTRFIKLTWRTPASDPHGDETYSVFYTKEGVDRERVENTSQPGEMQVTI 531 

Qy 462 QTLMVGSSAILPCQASGK PTPGI-SWLRDGLPIDIT" 496 

111:111 III:: I :| 

Db 532 QNLMPATVYIFKVMAQNKHGSGESSAPLRVETQPEVQLPGPAPNIRAYATSPTSITVTWE 591 



Qy 497 DSRISQHSTGSLHIADLKKPDTGVYTCIAKNEDGE 531 

I :| I I I III : :| h I 
Db 592 TPLSGNGEIQNYKLYYMEKGTDKEQDIDVSSH- - -SYTINGLKKYTEYSFRWAYNKHGP 643 

Qy 532 STWSASLTVEDHTSNAQFVRMPDPSNFPSSPTQPIIVNVTDTE-VELHWNAP-STSGAGP 589 

' : I I : |: II: I : : I ::: : :|| I ||: I 
Db 649 GV STQDVAVRTLSDVPSAAPQNLSLEVRNSKSIVIHWQPPSSTTQNGQ 696 

Qy 590 ITGYIIQY-- 1 - 597 

llll 1:1 . 

Db 697 ITGYKIRYRKASRRSDVTETLVTGTQLSQLIEGLDRGTEYNFRVAALTVNGTGPATDWLS 756 

Qy 598 — YSPDLGQTWFNIPDYVASTE 617 

: II :| :h :| 

Db 757 AETFESDLDET - -RVPEVPSSLHVRPLVTSIWSWTPPENQNIWRGYAIGYGIGSPHAQ 814 

Qy 618 YRIKGLKPSHSYMFVIRAENEKGIGTPSVSSALVTTSKPAAQVALSDKNK 667 

; l I: III I: ::l I III II: ::| :| :: 
Db 815 TIKVDYKQRYYTIENLDPSSHYVITLKAFNNVGEGIPLYESAV---TRPH TDTSE 866 

Qy 668 MDMAIAEKRLT SEQLIKL-EEVKTINSTAVRLFWKKRKL* - -EELIDG* • YYIKW 716 

:|: : i :::::: :|: | | ::: | I ::| 
Db 867 VDLFVINAPYTPVPDPTPMMPPVGVQASILSHDTIRITWADNSLPKHQKITDSRYYTVRW 926 

Qy 717 RGPPRTN- - -DNQYVNVTSPSTENYWSNLMPFTNYEFFVI - - -PYHSGVHSI - -HGAPS 768 

ill .: :| I : :| :|:|: I I I III |: I I: III 
Db 927 — -KTNIPANTKYKNANA-TTLSYLVTGLKPNTLYEFSVMVTKGRRSSTWSMTAHGA- 979 

Qy 769 NSMDVLTAEAPPSLPPEDVRI--RMLNLTTLRISWKAPKADGINGILKGFQIVIVGQAPN 826 

I I I: 11:11 : : I: ::|: I II : h h I 
Db 980 TFELVPTSPPKDVT WSKEGKPRTI IVNWQPPSE - ■ ANGKITGY • 1 1 YYSTDVN 1030 

Qy 827 NNRNITTNERAASVTLFHLVTGMT — YKIRVAARSNGGVGVSHGTSEVIMNQDTLEKH 882 

: I- I I : :| I :: II:: |:| II : : I : 
Db 1031 AEIHDWVIEPVVGNRLTHQIQELTLDTPYYFKIQARNSKGMG- ■ -PMSEAVQFR-TPKAD 1086 

Qy 883 LAAQQENESFLYGLINKSHVP VIVIVAIL 911 

: : I: I I :| :|| I 

Db 1087 SSDKMPNDQALGSAGKGSRLPDLGSDYKPPMSGSNSPHGSPTSPLDSNMLLVIIVSVGVI 1146 

Qy 912 IIFWIIIAYCYWRNSRNSDGKDRSFIKINDGSVHMASN NLW DV 955 

I ll::ll I : : I I: I :ll I :|l : 
Db 1147 TIVWWIAWCTRRTTSHQKKKRAACKSVNGSHKYKGNCKDVKPPDLWIHHERLELKPI 1206 

Qy 956 AQNPNQNPMYNTAGRMTMNNRNGQALYSLTP--NAQDFFNNCDDYSGTMHRPGSEHHYHY 1013 

::|: II:. ', llll :|| I: I :|: : : I 

Db 1207 DKSPDPNPVMTD — TPIPRNSQ — DITPVDNSMD SNIHQRRNSYRGHE 1250 

Qy 1014 AQLTGGPGNAMSTFYGNQYHDDPSPYATTTLVLSNQQPAWLNDKMLRAPAMPTNPVPPEP 1073 

:: ::||| I : : II: II : Ihl 

Db 1251 SE DSMSTLAGRR GMRPKMM — MPFDSQPPQP 1279 



Qy 1074 PARYADHTAGRRSRSSRASDGRGTL NGGLHHRTSGSQRSDSPPH 1117 

II: II II I I ; | | : :: : | 

Db 1280 VISAHPIHSLDNPHHHFHSSSLASPARSHLYHPSSPWPIGTSMSLSDRANSTESVRNTPS 1339 

Qy 1118 TDVSYVQ : — LHSSDGTGSSKERTGERRTP PNKTLMDFIPPPPS 1158 

II II Ihl : I I: I I I 

Db 1340 TDTMPASSSQTCCTDHQDPEGATSSSYLASSQEEDSGQSLPTAHVRPSHPLKSFAVPA- 1397 

Qy 1159 NPPPPGGHVYDTATRRQLNRGSTP REDTYDSVSDGAFARVDVNARPTSRNRNL 1211 

llll :ll I III I: II : : :ll 
Db 1398 -IPPPGPPLYDPAL PSTPLLSQQALEPSTFHSVKTAS IGTLG - RSRP P 1443 

Qy 1212 GGRPLKGKRDDDSQRSSLMMDDDGGSSEAD 1241 

I: : I :: |::| I I I 
Db 1444 "MPVWPSAPEVQETTRMLEDSESSYEPD 1471 



RESULT 2
NEO1_RAT

ID   NEO1_RAT
AC   P97603;
DT   01-OCT-2000 (Rel. 40, Created)
DT   01-OCT-2000 (Rel. 40, Last sequence update)
DT   01-OCT-2000 (Rel. 40, Last annotation update)
DE   NEOGENIN PRECURSOR (FRAGMENT).
GN   NEO1 OR NGN.
OS   Rattus norvegicus (Rat).
OC   Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC   Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Rattus.
RN   [1]
RP   SEQUENCE FROM N.A.
RC   TISSUE=BRAIN;
RX   MEDLINE=97015074; PubMed=8861902;
RA   Keino-Masu K., Masu M., Hinck L., Leonardo E.D., Chan S.S.-Y.,
RA   Culotti J.G., Tessier-Lavigne M.;
RT   "Deleted in Colorectal Cancer (DCC) encodes a netrin receptor.";
RL   Cell 87:175-185(1996).
CC   -!- FUNCTION: MAY BE INVOLVED AS A REGULATORY PROTEIN IN THE
CC       TRANSITION OF UNDIFFERENTIATED PROLIFERATING CELLS TO THEIR
CC       DIFFERENTIATED STATE. MAY ALSO FUNCTION AS A CELL ADHESION
CC       MOLECULE IN A BROAD SPECTRUM OF EMBRYONIC AND ADULT TISSUES.
CC   -!- SUBCELLULAR LOCATION: TYPE I INTEGRAL MEMBRANE PROTEIN.
CC   -!- SIMILARITY: CONTAINS 4 IMMUNOGLOBULIN-LIKE C2-TYPE DOMAINS.
CC   -!- SIMILARITY: CONTAINS 6 FIBRONECTIN TYPE III-LIKE DOMAINS.
CC   -!- SIMILARITY: BELONGS TO THE IMMUNOGLOBULIN SUPERFAMILY. STRONG, TO
CC       TUMOR SUPPRESSOR PROTEIN DCC.
CC
CC   This SWISS-PROT entry is copyright. It is produced through a collaboration
CC   between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC   the European Bioinformatics Institute. There are no restrictions on its
CC   use by non-profit institutions as long as its content is in no way
CC   modified and this statement is not removed. Usage by and for commercial
CC   entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC   or send an email to license@isb-sib.ch).
CC
DR   EMBL; U68726; AAB41100.1; -.
DR   HSSP; P56276; 1TLK.
DR   INTERPRO; IPR001777; -.
DR   INTERPRO; IPR003006; -.
DR   PFAM; PF00041; fn3; 6.
DR   PFAM; PF00047; ig; 4.
KW   Transmembrane; Immunoglobulin domain; Glycoprotein; Signal.
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j IG-LIKE C2-TYPE DOMAIN, 
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296 


IG-LIKE C2-TYPE DOMAIN. 
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386 


IG-LIKE C2-TYPE DOMAIN. 
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DOMAIN 
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FIBRONECTIN TYPE-III. 
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DOMAIN 


704 


798 


FIBRONECTIN TYPE-III. 
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DOMAIN 


819 


919 


FIBRONECTIN TYPE-III. 
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DOMAIN 


920 


1021 


FIBRONECTIN TYPE-III. 
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POLY-VAL. 
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CARB0HYD 


42 


42 


N-LINKED (GLCNAC. . .) (POTENTIAL). 
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CARBOHYD 


179 


179 


N-LINKED (GLCNAC. . .) (POTENTIAL). 
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CARBOHYD 


295 


295 


N-LINKED (GLCNAC. . .) (POTENTIAL). 
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CARBOHYD 


439 


439 


N-LINKED (GLCNAC. . .) (POTENTIAL). 
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CARBOHYD 


458 


458 


N-LINKED (GLCNAC. . .) (POTENTIAL). 
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CARBOHYD 


608 


608 


N-LINKED (GLCNAC. . .) (POTENTIAL) . 
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CARBOHYD 


684 


684 


N-LINKED (GLCNAC. . .) (POTENTIAL). 
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CARBOHYD 


878 


878 


, N-LINKED (GLCNAC. . .) (POTENTIAL). 


SQ   SEQUENCE   1377 AA;  150637 MW;  E514ED8ABD1A63A9 CRC64;



Query Match 9.4%; Score 642; DB 1; Length 1377; 

Best Local Similarity 22.0%; Pred. No. 9.4e-28;

Matches 325; Conservative 194; Mismatches 535; Indels 426;



Qy 36 PIDVWSRGSPATLNCGA-KPSTAKITWYKDGQPVITNKEQV-NSHRIVLDTGSLFLLK 92 

hi : III III I :|| I I III I I : I :| lll|: 
Db 28 PVDTLSVRGSSVILNCSAYSEPS-PNIEWKKDG — TFLNLVSDDRRQLLPDGSLFISN 82 

Qy 93 VNSGKNGKDSDAGAYYCVASNEH-GEVKSNEGSLKLAMLREDFRVRPRTVQALGGEMAVL 151 

I 1:1111 III: :: I : I I :| I I :| I :| 
Db 83 WHSKHNK-PDEGFYQCVATVDNLGTIVSRTAKLAVAGLPR-FTSQPEPSSIYVGNSGIL 140 

Qy 152 ECSPPRGFPEPWSWRKDDKELRIQDMPRYTLHSDGNLIIDPVDRSDSGTYQCVANNMVG 211 

I I I I :: : I : I I 'I |:| II |:|: : 

Db 141 NCEVNADL-VPFVRWEQNRQPLLLDD--RIVKLPSGTLVISNATEGDEGLYRCIVESGGP 197 

Qy 212 ERVSNPARLSVFEKPK FEQEPKDMTVDVGAAVLFDCRVTGDPQPQITWKRKNEP 265 

: I: I I I :: : I II : I : : I : I I I I I 1 1 1 
Db 198 PKFSDEAELKVLQESEEMLDLVFLMRPSSMIKVIGQSAVLPCVASGLPAPVIRW-MKNED 256 

Qy 266 MPVTR AYIAKDNRGLRIERVQPSDEGEYVCYARNPAGTLEASAHLRVQAPPSFQT 320 

: I I :l I I I I I I II I hll II II II I 
Db 257 VLDTESSGRLALLA--GGSLEISDVTEDDAGTYFCVADNGNKTIEAQAELTVQVPPEFLK 314 

Qy 321 KPADQSVPAGGTATFECTLVGQPSPAYFWSKEGQQDLLFPSYVSADGRTKVSPTGTLTIE 380 

:lh ' III : |:|:| I I I |:: II h I : 

Db 315 QPANIYARESMDIVFECEVTGKPAPTVKWVKNG-DWIPS DYFKIVKEHNLQVL 367 

Qy 381 EVRQVDEGAYVCAGMNSAGSSLSKA- ■ -ALKATFETKGRVQKKKSKMGKQKQKNVQSIIK 437 

: : III I I I I" : I I: II: : II:: 
Db 368 GLVKSDEGFYQCIAENDVGNAQAGAQLIILEHAPATTGPLPSAPRDV VASLVS 420 

Qy 438 YLISAVTGNTPAKPP PTIEHGHQN- - -QTLMVGSSAIL 472 

:| III I 1:1 I II : I 

Db 421 TRFIKLTWRTPASDPHGDNLTYSVFYTKEGVARERVENTSQPGEMQVTIQNLMPATVYIF 480 

Qy 473 PCQASGK PTPGI-SWLRDGLPIDIT 49? 

II' III:: 1:1 

Db 481 KVMAQNKHGSGESSAPLRVETQPEVQLPGPAPNIRAYATSPTSITVTWETPLSGNGEIQN 540 

Qy 497 DSRISQHSTGSLHIADLKKPDTGVYTCIAKNEDGESTWSASLTVED 542 

I :| I I I III : :| I: I 
Db 541 YKLYYMEKGTDKEQDVDVSSH— SYTINGLKKYTEYSFRWAYNKHGPGV 558 

Qy 543 HTSNAQFVRMPDPSNFPSSPTQPIIVNVTDTE-VELHWNAPSTSGA-GPITGYIIQY-" 597 

: II : I: II: I : : I ::: : :|| II:: I Mil |:| 
Db 589 — STQDVAVRTLSDVPSAAPQNLSLEVRNSKSIVIHWQPPSSATQNGQITGYKIRYRKA 645 

Qy 598 YSPDLGQT 605 

: II :: 

Db 646 SRKSDVTETWTGTQLSQLIEGLDRGTEYNFRVAALTVNGTGPATDWLSAETFESDLDES 705 

Qy 606 WFNIPDYVASTE Y 618 

:h :T: I 
Db 706 --RVPEVPSSLHVRPLVTSIWSWTPPENQNIWRGYAIGYGIGSPHAQTIKVDYKQRYY 763 

Qy 619 RIKGLKPSHSYMFVIRAENEKGIGTPSVSSALVTTSKPAAQVALSDKNKMDMAIAEKRLT 678 

I: III h ::l I III II: ::| :l :::|: : I 
Db 764 TIENLDPSSHYVITLKAFNNVGEGIPLYESAV- • -TRPH TDTSEVDLFVINAPYT 815 

Qy 679 SEQLIKL-EEVKTINSTAVRLFWKKRKL— EELIDG-YYIKWRGPPRTN--D 724 

;::::: :|; I I ::: | I ::| :|| : 
Db 816 PVPDPTPMMPPVGVQASILSHDTIRITWADNSLPKHQKITDSRYYTVRW— -KTNIPAN 871 

Qy 725 NQYVNVTSPSIENYWSNLMPFTNYEFFVI - - -PYHSGVHSI - -HGAPSNSMDVLTAEAP 779 

:| I : if :|:|: I I I III |: I |: III I I 

Db 872 TKYKNANA-TILSYLVTGLKPNTLYEFSVMVTKGRRSSTWSMTAHGA TFELV 922 

Qy 780 PSLPPEDVRI - -RMLNLTTLRISWKAPKADGINGILKGFQIVIVGQAPNNNRNITTNERA 837 

I: 11:11 : : 1 = : = h I II : h h I : I 
Db 923 PTSPPKDVTVVSKEGKPRTIIVNWQPPSE--ANGKITGY-IIYYSTDVNAEIHDWVIEPV 979 

Qy 838 ASVTLFHLVTGMT— -YKIRVAARSNGGVG-VSHGTSEVIMNQDTLEKHLAAQQENESF 892 

I I : ':| I :: II:: |:| :| I: :| I : 

Db 980 VGNRLTHQIQELTLDTPYYFKIQARNSKGMGPMSEAVQFRTPKADSSDKMPNDQALGSAG 1039 



Mon Jan 22 13:04:35 2001 
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Qy •893 LYGLI NKSH VPYIVIVAILIIFWIIIAYCYWRN 926 

I : « I I : :|| : :: I 1 1 : 1 1 1 I 

Db 1040 RGGRLPDLGSDYKPPMSGSNSPHGSPTSPLDSNMLLVIIVSIGVITIWWIIAVFCTRR 1099 

Qy 927 SRNSDGKDRSFIKINDGSVHMASN NLW DVAQNPNQNPMYNTAGR 970 

: : I I: I :|| I :|l : ::|: II: 

Db 1100 TTSHQKKKRAACKSVNGSHKYKGNCKDVKPPDLWIHHERLELKPIDKSPDPNPVMTD— 1156 

Qy 971 MTMNNRNGQALYSLTP--NAQDFFNNCDDYSGTMHRPGSEHHYHYAQLTGGPGNAMSTFY 1028 

I II I :M I: I :|: : : I :: ::||| 
Db 1157 -TPIPRNSQ- ■ -DITPVDNSMD SNIHQRRNSYRGHESE DSMSTLA 1197 

Qy 1029 GNQYHDDPSPYATTTLVLSNQQPAWLNDKMLRAPAMPTNPVPPEPPARYADHTAGRRSRS 1088 

I : : II: II : II: I I : I 

Db 1198 GRR GMRPKMM— -MPFDSQPPQQSVRNTPSTDTMPASS 1232 

Qy 1089 SRASDGRGTLNGGLHHRTSGSQRSDSPPHTDVSYVQLHSSDGTGSSKERTGERRTPPNKT 1148

I I I II: 

f Db 1233 S QTCCTDHQDPEGATSSSYLASSQEEDSGQSLPTAHVR— PSHP 1274 

Qy 1149 LMDFIPPPPSNPPPPGGHVYDTA TRRQLNRGST PREDTYDSVSDGAFARVDVN 1201

J| III illl :|| I ::: II : I I I 

H 1275 LKSFAVPA — IPPPGPPIYDPALPSTPLLSQQALNH — HLHSVKTASIGTLGR — 1323 

Qy 1202 ARPTSRNRNLGGRPLKGKRDDDSQRSSLMMDDDGGSSEAD 1241 

:|| I: : I :: |::| I I I 

Db 1324 SRPP- ----- -MPVWPSAPEVQEATRMLEDSESSYEPD 1355 



RESULT 3
NEO1_HUMAN

ID   NEO1_HUMAN     STANDARD;      PRT;  1461 AA.
AC   Q92859; O00340;
DT   01-OCT-2000 (Rel. 40, Created)
DT   01-OCT-2000 (Rel. 40, Last sequence update)
DT   01-OCT-2000 (Rel. 40, Last annotation update)
DE   NEOGENIN PRECURSOR.
GN   NEO1 OR NGN.
OS   Homo sapiens (Human).
OC   Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC   Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
RN   [1]
RP   SEQUENCE FROM N.A. (ISOFORMS 1 AND 2).
RC   TISSUE=FETAL BRAIN;
RX   MEDLINE=97236653; PubMed=9121761;
RA   Meyerhardt J.A., Look A.T., Bigner S.H., Fearon E.R.;
RT   "Identification and characterization of neogenin, a DCC-related
RT   gene.";
RL   Oncogene 14:1129-1136(1997).
RN   [2]
RP   SEQUENCE FROM N.A. (ISOFORMS 1 AND 2).
RC   TISSUE=FETAL BRAIN;
RX   MEDLINE=97312699; PubMed=9169140;
RA   Vielmetter J., Chen X.-N., Miskevich F., Lane R.P., Yamakawa K.,
RA   Korenberg J.R., Dreyer W.J.;
RT   "Molecular characterization of human neogenin, a DCC-related protein,
RT   and the mapping of its gene (NEO1) to chromosomal position 15q22.3-
RT   q23.";
RL   Genomics 41:414-421(1997).
CC   -!- FUNCTION: MAY BE INVOLVED AS A REGULATORY PROTEIN IN THE
CC       TRANSITION OF UNDIFFERENTIATED PROLIFERATING CELLS TO THEIR
CC       DIFFERENTIATED STATE. MAY ALSO FUNCTION AS A CELL ADHESION
CC       MOLECULE IN A BROAD SPECTRUM OF EMBRYONIC AND ADULT TISSUES.
CC   -!- SUBCELLULAR LOCATION: TYPE I INTEGRAL MEMBRANE PROTEIN.
CC   -!- ALTERNATIVE PRODUCTS: AT LEAST 2 ISOFORMS; 1 (SHOWN HERE) AND 2;
CC       ARE PRODUCED BY ALTERNATIVE SPLICING.
CC   -!- TISSUE SPECIFICITY: WIDELY EXPRESSED AND ALSO IN CANCER CELL
CC       LINES.
CC   -!- SIMILARITY: CONTAINS 4 IMMUNOGLOBULIN-LIKE C2-TYPE DOMAINS.
CC   -!- SIMILARITY: CONTAINS 6 FIBRONECTIN TYPE III-LIKE DOMAINS.
CC   -!- SIMILARITY: BELONGS TO THE IMMUNOGLOBULIN SUPERFAMILY. STRONG, TO
CC       TUMOR SUPPRESSOR PROTEIN DCC.
CC
CC   This SWISS-PROT entry is copyright. It is produced through a collaboration
CC   between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC   the European Bioinformatics Institute. There are no restrictions on its
CC   use by non-profit institutions as long as its content is in no way
CC   modified and this statement is not removed. Usage by and for commercial
CC   entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC   or send an email to license@isb-sib.ch).
CC
DR   EMBL; U61262; AAB17263.1; -.
DR   EMBL; U72391; AAC51287.1; -.
DR   MIM; 601907; -.
DR   HSSP; P02751; 1TTG.
DR   INTERPRO; IPR001777; -.
DR   INTERPRO; IPR003006; -.
DR   PFAM; PF00041; fn3; 6.
DR   PFAM; PF00047; ig; 4.
DR   PRINTS; PR00014; FNTYPEIII.
KW   Transmembrane; Immunoglobulin domain; Glycoprotein; Signal;
KW   Alternative splicing.
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729 


FIBRONECTIN TYPE-III. 
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FIBRONECTIN TYPE-III. 
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270. 
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BY SIMILARITY. 
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BY SIMILARITY . 
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N-LINRED (GLCNAC. . .) 
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N-LINKED (GLCNAC. . .) 


(POTENTIAL). 
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326. 
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N-LINKED (GLCNAC. . .) 


(POTENTIAL). 
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CARBOHYD 


470 : 
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N-LINKED (GLCNAC. . ,) 


(POTENTIAL). 
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CARBOHYD 


489' 


• 489 


N-LINKED (GLCNAC. . .) 


(POTENTIAL). 
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CARBOHYD 
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639 


N-LINKED (GLCNAC. . .) 


(POTENTIAL). 
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CARBOHYD 
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N-LINKED (GLCNAC. . .) 


(POTENTIAL). 
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CARBOHYD 


909, 
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N-LINKED (GLCNAC. . .) 


(POTENTIAL) . 
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VARSPLIC 
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MISSING (IN ISOFORM 2) 
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CONFLICT 


168' 


168 


G -> N (IN REF. 2). 




SQ   SEQUENCE   1461 AA;  159958 MW;  7AAE897E69635A21 CRC64;



Query Match 9.3%; Score 639.5; DB 1; Length 1461;

Best Local Similarity 21.6%; Pred. No. 1.4e-27;

Conservative 202; Mismatches 546; Indels 437; Gaps 65; 



1:1 : III III I :ll II I III I I : I :| Mil: 



Matches 


Qy 


36 


Db 


59 


Qy 


93 


Db 


114 


Qy 


152 


Db 


172 


Qy 


212 


Db 


229 


Qy 


266 



I |: I Ml III: 11:1 I :| I I :l 



MM 



I 1:1 



I I hll 



■ - FEQEPKDMTVDVGAAVLFDCRVTGDPQPQITWKRKNEP 265 

I ::| : :| |: I :| I II I : I 



266 MPV- -TRAYIAKDNRGLRIERVQPSDEGEYVCYARNPAGTLEASAHLRVQAPPSFQTKPA 323 
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: : : I I I I I I I I I Ml I I III II :| 
Db 289 LDTESSERLVLLAGGSLEISDVTEDDAGTYFCIADNGNETIEAQAELTVQAQPEFLKQPT 348 

Qy 324 DQSVPAGGTATFECTLVGQPSPAYFWSKEGQQDLLFPSYVSADGRTKVSPTGTLTIEEVR 383 

: III : I:|:"l I I I |:: II |: | : : 

Db 349 NIYAHESMDIVFECEVTGKPljPTVKWVKNG- -DMVIPS DYFKIVKEHNLQVLGLV 401 

Qy 384 QVDEGAYVCAGMNSAGSSLSkW KATFET 413 

: III I I I [:: : I I | |: | 

Db 402 KSDEGFYQCIAENDVGNAQAGAQLIILEHAPATTGPLPSAPRDWASLVSTRFIKLTWRT 461 

Qy 414 KG RVQKKRSKMGKQKQKN VQSIIKYLISAVT GN 446 

I I : ::: :| :| I: I: I |: 
Db 462 PASDPHGDNLTYSVFYTKEGIARERVENTSHPGEMQVTIQNLMPATVYIFRVMAQNKHGS 521 

Qy 447 TPAKPPPTIEHGHQNQTLMVGSSAILPCQASGKPTPGISW LRD 489 

: I :| I : : I : I I: : ::| : 
Db 522 GESSAPLRVE--TQPEVQLPGPAPNLRAYAASPTSITVTWETPVSGNGEIQNYKLYYMEK 579

Qy 490 GLPIDITDSRISQHSTGSLHIADLRKPDTGVYTCIMNEDGESTWSASLTVEDHISNAQF 549

I : I :| I I I III : :| I: I : : I 

Db 580 GTDKE-QDVDVSSH- - -SYTINGLRKYTEYSFRWAYNKHGPGVSTPDVAVR 627 

Qy 550 VRHPDPSNFPSSPTQPIIVNVTDIE-VELHWN--APSTSGAGPITGYIIQY 597 

I: II: I : : I ::: : :ll Ihl I Mil |:| 
Db 628 TLSDVPSAAPQNLSLEVRNSRSIMIHWQPPAPATQN-GQITGYKIRYRKASRRSDV 682 

Qy 598 : YSPDLGQTWFNIPD 611

: II :| :|: 

Db 683 TETLVSGTQLSQLIEGLDRGTEYNFRVAALTINGTGPATDWLSAETFESDLDET-RVPE 740 

Qy 612 YVASTE YRIKGLK 624 

H i I I: I 

Db 741 VPSSLHVRPLVTSIWSWTPPENQNIWRGYAIGYGIGSPHAQTIKVDYKQRYYTIENLD 800 

Qy 625 PSHSYMFVIRAENEKGIGTPSVSSALVTTSKPAAQVALSDKNKMDMAIAEKRLT S 679

II I: ::|«f III ||: ::| :| :::|: } | : 

Db 801 PSSHYVITLKAFNNVGEGIPLYESAV— TRPH TDTSEVDLFV|NAPYTPVPDPT 852 

Qy 680 EQLIKL-EEVKTINSTAVRLFWKKRKL— EELIDG--YYIKWRGPPRTN— DNQYVNV 730 
: : : :: :|: I I ::: I I ::| Jl : :| I 
853 PMMPPVGVQASILSHDTIRITWADNSLPKHQKITDSRYYTVRW — RTNIPANTKYKNA 908 



731 TSPSTENYWSNLMPFTNYEFFVI— PYHSGVHSI- -HGAPSNSMDWiTAEAPPSLPPE 785 

: :l :|:|: I I I III I: I I: II M I |: ||: 
909 NA ■ TTLSYLVTGLKPNT LYEFSVMVTKGRRSSTWSMT AHGT TFELVPTSPPK 959 

786 DVRI ■ ■RMLNLTTLRISWKAPKADGINGILKGFQIVIVGQAPNNNRNITTNERAASVTLF 843 

.11 : : I: ::|: I II : |: h I : I I 
960 DVTWSKEGKPKTIIVNWQPPSE- -ANGKITGY- IIYYSTDVNAEIHDWVIEPWGNRLT 1016 



; — 1 — 



844 HLVTGMT-— YKIRVAARSNGGVGVSHGTSEVIM— J — NQDTLEKHLAAQQENESF 892 

I : :| I II:: hi' II : : I : |: : 

1017 HQIQELTLDTPYYFRIQARNSKGMG" • • PMSEAVQFRTPKADSSDKMPNDQASGSGGKGS 1073 



Qy 893 LYGLINKSHVP VIVI^AILIIFWIIIAYCYWRNSRN 929 

: : I =11 I'. :: I Ihlll I : : 

Db 1074 RLPDLGSDYKPPMSGSNSPHGSPTSPIiDSNMLLVIIVSVGVITIWWIIAVFCTRRTTS 1133 

Qy 930 SDGKDRSF I K I NDGSVHMASN NLW DVAQNPNQNPMYNTAGRMTM 973 

I I: I :ll I :|| ; : ::|: ||: I 

Db 1134 HQKKKRAACKSVNGSHKYKGNSKDVKPPDLWIHHERLE|KPIDKSPDPNPIMTD — TP 1189 

Qy 974 NNRNGQALYSLTP--NAQDFFNNCDDYSGTMHRPGSEHHYHYAQLTGGPGNAMSTFYGNQ 1031 

II I :|| I: I :|: : : I :: ::||| I : 
Db 1190 IPRNSQ — DITPVDNSMD SNIHQRRNSYRGHESE DSMSTLAGRR 1231 



Qy 1032 YHDDPSPYATTTLVLSNQQPAWLNDKMLRAPAMPTNPVPPEP PARYADHTAGRRS 1086 

: II: II : Ihl I |: 
Db 1232 GMRPKMM — MPFDSQPPQPVISAHPIHSLDNPHHHFH 1266 



1087 RSSRASDGRGTL" 

II II II 



■ -NGGLHHRTSGSQRSDSPPHTDVSYVQ- - 

: I I : :: : I II 



Db 1267 SSSLASPARSHLYHPGSPWPIGTSMSLSDRANSTESVRNTPSTDTMPASSSQTCCTDHQD 1326 

Qy 1125 ---LHSSDGTGSSKERTGERRTP PNRTLMDFIPPPPSNPPPPGGHVYDTA i:71 

II ', Ihl : I I: I I I Mil ' II I 
Db 1327 PEGATSSSYLASSQEEDSGQSLPTAHVRPSHPLKSFAVPA— IPPPGPPTYDPALPSTP 1383 

Qy 1172 --TRRQLNRGSTPREDTYDSVSDGAFARVDVNARPTSRNRNLGGRPLKGKRDDDSQRSSL 1229 

::: 1 1 • ' : : I I I :|| |: : I :: 

Db 1384 LLSQQALNH— -HIHSVKTASIGTLGR— -SRPP MPWVPSAPEVQETTR 1427 



1230 MMDDDGGSSEAD 1241 

I::| I I. I 
1428 MLEDSESSYEPD 1439 



RESULT 4
NEO1_CHICK

ID   NEO1_CHICK     STANDARD;      PRT;  1443 AA.
AC   Q90610;
DT   01-OCT-2000 (Rel. 40, Created)
DT   01-OCT-2000 (Rel. 40, Last sequence update)
DT   01-OCT-2000 (Rel. 40, Last annotation update)
DE   NEOGENIN (FRAGMENT).
OS   Gallus gallus (Chicken).
OC   Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC   Archosauria; Aves; Neognathae; Galliformes; Phasianidae; Phasianinae;
OC   Gallus.
RN   [1]
RP   SEQUENCE FROM N.A.
RC   STRAIN=WHITE LEGHORN; TISSUE=EMBRYONIC BRAIN;
RX   MEDLINE=95105243; PubMed=7806578;
RA   Vielmetter J., Roman J.M., Dreyer W.J.;
RT   "Neogenin, an avian cell surface protein expressed during terminal
RT   neuronal differentiation, is closely related to the human tumor
RT   suppressor molecule deleted in colorectal cancer.";
RL   J. Cell Biol. 127:2009-2020(1994).
CC   -!- FUNCTION: MAY BE INVOLVED AS A REGULATORY PROTEIN IN THE
CC       TRANSITION OF UNDIFFERENTIATED PROLIFERATING CELLS TO THEIR
CC       DIFFERENTIATED STATE. MAY ALSO FUNCTION AS A CELL ADHESION
CC       MOLECULE IN A BROAD SPECTRUM OF EMBRYONIC AND ADULT TISSUES.
CC   -!- SUBCELLULAR LOCATION: TYPE I INTEGRAL MEMBRANE PROTEIN.
CC   -!- DEVELOPMENTAL STAGE: IN RETINA, EXPRESSED ON GANGLION CELL FIBERS
CC       AS SOON AS THEY BEGIN TO EXTEND THEIR AXONS.
CC   -!- SIMILARITY: CONTAINS 4 IMMUNOGLOBULIN-LIKE C2-TYPE DOMAINS.
CC   -!- SIMILARITY: CONTAINS 6 FIBRONECTIN TYPE III-LIKE DOMAINS.
CC   -!- SIMILARITY: BELONGS TO THE IMMUNOGLOBULIN SUPERFAMILY. STRONG, TO
CC       TUMOR SUPPRESSOR PROTEIN DCC.
CC
CC   This SWISS-PROT entry is copyright. It is produced through a collaboration
CC   between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC   the European Bioinformatics Institute. There are no restrictions on its
CC   use by non-profit institutions as long as its content is in no way
CC   modified and this statement is not removed. Usage by and for commercial
CC   entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC   or send an email to license@isb-sib.ch).
CC
DR   EMBL; U07644; AAC59662.1; -.
DR   HSSP; P80362; 1WTL.
DR   INTERPRO; IPR001777; -.
DR   INTERPRO; IPR003006; -.
DR   PFAM; PF00041; fn3; 6.
DR   PFAM; PF00047; ig; 4.
KW   Transmembrane; Immunoglobulin domain; Glycoprotein.
FT   NON_TER         1      1
FT   DOMAIN         <1   1090       EXTRACELLULAR (POTENTIAL).
FT   TRANSMEM     1091   1111       POTENTIAL.
FT   DOMAIN       1112   1443       CYTOPLASMIC (POTENTIAL).
FT   DOMAIN         33    102       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN        132    194       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN        229    293       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN        321    383       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN        422    519       FIBRONECTIN TYPE-III.
FT   DOMAIN        522    615       FIBRONECTIN TYPE-III.
FT   DOMAIN        616    714       FIBRONECTIN TYPE-III.
FT   DOMAIN        720    814       FIBRONECTIN TYPE-III.
FT   DOMAIN        835    935       FIBRONECTIN TYPE-III.
FT   DOMAIN        936   1037       FIBRONECTIN TYPE-III.
FT   DISULFID       40     95       BY SIMILARITY.
FT   DISULFID      139    187       BY SIMILARITY.
FT   DISULFID      236    286       BY SIMILARITY.
FT   DISULFID      328    376       BY SIMILARITY.
FT   CARBOHYD       39     39       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD      176    176       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD      292    292       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD      456    456       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD      475    475       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD      625    625       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD      700    700       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD      894    894       N-LINKED (GLCNAC...) (POTENTIAL).
SQ   SEQUENCE   1443 AA;  158050 MW;  558C6795579C0E26 CRC64;



Query Match 9.2%; Score 632.5; DB 1; Length 1443;

Best Local Similarity 21.1%; Pred. No. 3.3e-27;
Matches 323; Conservative 199; Mismatches 553; Indels 453; Gaps 64; 

Qy 36 PIDVWSRGSPATLNCGAKPST-AKITWYKDGQPVITNKEQVNSHRIVLDTGSLFLLKVN 94 

1:1:: II: : 1 1 : I 1 1 I 1 1 1 : I : I :| II! : | 
Db 25 PMDILSVRGASVIMNCSSYCETPPKIEWKKDG--TLLNLVS-DDRRQLLPDGSLLINSW 81 

Qy 95 SGKNGKDSDAGAYYCVASNEH-GEVKSNEGSLKLAMLREDFRVRPRTVQALGGEMAVLEC 153 

I: I III III: 11:1 I :| I I :| I |:| I 
Db 82 HSKHNK-PDEGYYQCVATVESLGSIVSRTAKLTVAGLPR-FTSQPELSSVYKGNSAILNC 139 

Qy 154 SPPRGFPEPWSWRKDDKELRIQDMPRYTLHSDGNLIIDPVDRSDSGTYQCVANNMVGER 213 

I I I :l : I : I I MM :| I hll : : 
Db 140 EVNVDL - APFVRWEQDRQPLSLDD - - RVFKLPSGALLIGNATDTDGGFYRCVIESGGTPK 196 

Qy 214 VSNPARLSVFEKPK FEQEPKDMTVDVGAAVLFDCRVTGDPQPQITWKRKNEPMP 267 

III: I: I ::| :| I :| I I I I : I : I : 
Db 197 YSEEAELKILPDPEEPQSLVFVRQPSSLTKVTGQNAVFPCVAGGFPTPYVRWTKNGEELI 256 

Qy 268 V--TRAYIAKDNRGLRIERVQPSDEGEYVCYARNPAGTLEASAHLRVQAPPSFQTKPADQ 325 

■ III I I I I I I Ml I I II II I :lh 

Db 257 TEDSERFALRAGGSLLISDVTEEDVGTYTCIADNENETIEAQAELAVQVPPEFLKRPANI 316 

Qy 326 SVPAGGTATFECTLVGQPSPAYFWSKEGQQDLLFPSYVSADGRTKVSPTGTLTIEEVRQV 385

III : IMI I I I h: || |: I : : : 

Db 317 YAHESMDIVFECSVTGKPTPTVKWVKNG- -DWIPS DYFKIVKEHNLQVLGLVKS 369 

Qy 386 DEGAYVCAGMNSAGSSLSKAAL KAT 410 

III I I I I:: : II II 
Db 370 DEGFYQCIAENDVGNAQAGAQLIILDLDVAIPTLPPTSLTSATNDHLAPATTGPLPTAPR 429

Qy 411 FETKGRVQKKK--SKMGKQKQKNVQS 434

III:::: MM :|: 
Db 430 DWATLVSTRFIRLTWRTPVSDPQGDNLTYSIFYTKEGINRERVENTSRPG-ETQVMIQN 488 

Qy 435 UK— YLISAVTGNTPAKPPPTIEHGHQNQTLMVGSSAILPCQASGKPTPGISWLRDGL 491 

- . I: II :lll : : : I I I I I I 

Db 489 LMPETVYVFRWAQN KHGHGESSAPLKVATQPEVQLPG-PAPNIR-AYAGS 537 

Qy 492 PIDIT— DSRIS QHSTGSLHIADLKKPDTGV YTCIA 525 

I :| ■•: :| II : :| I II: : :| 

Db 538 PTSVTVTWETPLSGNGEIQNYKLYYMEKGQDSEQDVDVAGLSYTITGLKKYTEYSFRWA 597 

Qy 526 KNEDGESTWSASLTVEDHTSNAQFVRMPDPSNFPSSPTQPIIVNVTDTE-VELHWN-APS 583 

IM : I | : |: ||: | : : ::: : ||| |: 

Db 598 YNKHGPGV STQDVWRTLSDVPSAAPQNLTLEARNSKSIMLHWQPPPA 645 

Qy 584 TSGAGPITGYIIQY 597 

: :| Mil 1:1 

Db 646 GTHSGQITGYKIRYRKVSRKSDVTESVGGTQLFQLIEGLERGTEYNFRIAAMTVNGTGPA 705 

Qy 598 YSPDLGQTWFNIPDYVASTE 617 

: II :: :|: :| 



Db 706 TDWVSAETFESDLDES-RVPEVPSSLHVRPLVTSIWSWTPPENQNIWRGYAIGYGIG 763 



Qy 618 YRIKGLKPSHSYMFVIRAENEKGIGTPSVSSALVTTSKPAAQVAL 662
. I I: I II I: ::| I I I I ||: M 
Db 764 SPHAQTIKVDYKQRYYTIENLDPSSHYVITLKAFNNVGEGIPLYESAV— TRPH 815 

Qy 663 SDKNKMDMAIAEKRLT SEQLIKL-EEVKTINSTAVRLFWKKRKL— EELIDG-- 711 

II :::|:M I I : : : :: :|: I I ::: I 
Db 816 SDTSEVDLFVINAPYTPVPDPSPMMPPVGVQASILSHDTIRITWADNSLPKNQKITDARY 875 

Qy 712 YYIKWRGPPRTN--DNQYVNVTSPSTENYWSNLMPFTNYEFFVI— PYHSGVHSI-H 764 

I M Ml | : : Ml: II MM: I: 
Db 876 YTVRW----KTNIPANTKYKTANATTLSYLVTGLKPNTLYEFSVMVTKGRRSSTWSMTAH 931 

Qy 765 GAPSNSMDVLTAEAPPSLPPEDVRI--RMLNLTTLRISWKAPKADGINGILKGFQIVIVG 822 

I Mil: ||:|| : : |: ::|: I || : |: |: 
Db 932 GT TFELVPTSPPKDVT WSKEGKPRT I IVNWQPPSE - - ANGKITGY - 1 IYYS 980 

Qy 823 QAPNNNRNITTNERAASVTLFHLVTGMT — YKIRVAARSNGGVG 864 

I : I I M : I :: ||:: |:| 
Db 981 TDVNAEIHDWVIEPWGNRLTHQIQELTLDTPYYFKIQARNSKGMGPMSEAVQFRTPKAE 1040 

Qy 865 ; - - - -VSHGTSEVIMNQDTLEKHLAAQQENESFLYGLINKSHVPVIVIVAIL 911 

I: I : I : : I I : : :|| I :: 
Db 1041 SSDKMPNDQASGSAGKGSRPVDVGPDYKPPLSGSNSPHGSPTSPLDSNMLLVIIVSVGVI 1100 

Qy 912 1 1 FW 1 1 IAYCYWRNSRNSDG KDRSFI K I NDGSVHMASN NLW DV 955 

I :|:|:| ■' I : : I I: I :|| I :|| : 
Db 1101 TIVIWIVAVFCTRRTTSHQKKKRAACKSVNGSHKYKGNSKDVKPPDLWIHHERLELKPI 1160 

Qy 956 AQNPNQNPMYHTAGRMTMNNRNGQALYSLTP--NAQDFFNNCDDYSGTMHRPGSEHHYHY 1013 

::|: I.I: ' I II I Ml IM :|: : : I 

Db 1161 DKSPDPNPIMTD TPIPRNSQ---DITPVDNSMD SNIHQRRNSYRGHE 1204 

Qy 1014 AQLTGGPGNAMSTFYGNQYHDDPSPYATTTLVLSNQQPAWLNDKMLRAPAMPTNPVPPEP 1073 

:: •::IM I : : II: II : ||:| 

Db 1205 SE DSMSTLAGRR GMRPKMM — MPFDSQPPQP 1233 

Qy 1074 PARYADHTAGRRSRSSRASDGRGTLNGGLHHRTSGSQRSDSPPHTD 1119 

II: I II I III: I I M 

Db 1234 VISAHPIHSLDNPHHHFHSGSLASPTRSY— -LHHQVSPWPVGTSMSHSDRANSTESVR 1289 

Qy 1120 - VSYVQLHSSDGT- -GSSKERTGERRTP PNKTLMDFIPPP 1156 

: II I MM : | I: | | | 
Db 1290 NTPSSDTMPASSSQPCADHQDPDSSSGAYLGSAQEEDAAQSLPTAHVRPSHPLKSFAVP- 1348 

Qy 1157 PSNPPPPGGHVYD-TATRRQLNRGSTPREDTYD-SVSDGAFARVDVNARPTSRNRNLGG 1213 

Mil | | . : : | | | || 
Db 1349 "-AVPAAGSAYDPTLPSTPLLTQQAPSHPVHSVKTASIGTLGR---TRPP 1393 

Qy 1214 RPLKGKRDDDSQRSSLMMDDDGGSSEAD 1241 

I: I.I :: Ml I I I 
Db 1394 MPWVPSAPDVQETTRMLEDSESSYEPD 1421 



RESULT 5

DSCA_HUMAN

ID DSCA_HUMAN STANDARD; PRT; 2012 AA.

AC O60469; O60468;

DT 01-OCT-2000 (Rel. 40, Created) 

DT 01-OCT-2000 (Rel. 40, Last sequence update) 

DT 01-OCT-2000 (Rel. 40, Last annotation update) 

DE DOWN SYNDROME CELL ADHESION MOLECULE PRECURSOR (CHD2).

GN DSCAM.

OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.

RN [1]

RP SEQUENCE FROM N.A., AND ALTERNATIVE SPLICING.

RC TISSUE=BRAIN;

RX MEDLINE=98087574; PubMed=9426258;

RA Yamakawa K., Huot Y.-K., Haendelt M.A., Hubert R., Chen X.-N.,

RA Lyons G.E., Korenberg J.R.;


RT "DSCAM: a novel member of the immunoglobulin superfamily maps in a

RT Down syndrome region and is involved in the development of the

RT nervous system.";

RL Hum. Mol. Genet. 7:227-237(1998).

RN [2] 

RP SEQUENCE FROM N.A.

RX MEDLINE=20289799; PubMed=10830953;

RA Hattori M., Fujiyama A., Taylor T.D., Watanabe H., Yada T.,

RA Park H.-S., Toyoda A., Ishii K., Totoki Y., Choi D.-K., Soeda E.,

RA Ohki M., Takagi T., Sakaki Y., Taudien S., Blechschmidt K., Polley A.,

RA Menzel U., Delabar J., Kumpf K., Lehmann R., Patterson D.,

RA Reichwald K., Rump A., Schillhabel M., Schudy A., Zimmermann W.,

RA Rosenthal A., Kudoh J., Shibuya K., Kawasaki K., Asakawa S.,

RA Shintani A., Sasaki T., Nagamine K., Mitsuyama S., Antonarakis S.E.,

RA Minoshima S., Shimizu N., Nordsiek G., Hornischer K., Brandt P.,

RA Scharfe M., Schoen O., Desario A., Reichelt J., Kauer G., Bloecker B.,

RA Ramser J., Beck A., Klages S., Hennig S., Riesselmann L., Dagand E.,

RA Wehrmeyer S., Borzym K., Gardiner K., Nizetic D., Francis F.,
RA Lehrach H., Reinhardt R., Yaspo M.-L.;
RT "The DNA sequence of human chromosome 21.";

RL Nature 405:311-319(2000).

RN [3] 

RP FUNCTION.

RA Agarwala K.L., Nakamura S., Tsutsumi Y., Yamakawa K.;

RT "Down syndrome cell adhesion molecule DSCAM mediates homophilic

RT intercellular adhesion.";

RL Brain Res. Mol. Brain Res. 79:118-126(2000).

CC -!- FUNCTION: CELL ADHESION MOLECULE THAT CAN MEDIATE CATION-

CC INDEPENDENT HOMOPHILIC BINDING ACTIVITY. COULD BE INVOLVED IN

CC NERVOUS SYSTEM DEVELOPMENT.

CC -!- SUBCELLULAR LOCATION: TYPE I MEMBRANE PROTEIN (PROBABLE). THE
CC SHORT ISOFORM MAY BE SECRETED.

CC -!- ALTERNATIVE PRODUCTS: 2 ISOFORMS; A LONG FORM/CHD2-52 (SHOWN HERE)
CC AND A SHORT FORM/CHD2-42; ARE PRODUCED BY ALTERNATIVE SPLICING.

CC -!- TISSUE SPECIFICITY: PRIMARILY EXPRESSED IN BRAIN.

CC -!- SIMILARITY: CONTAINS IMMUNOGLOBULIN-LIKE C2-TYPE DOMAINS.

CC -!- SIMILARITY: CONTAINS 6 FIBRONECTIN TYPE III-LIKE DOMAINS.

CC --------------------------------------------------------------------------

CC This SWISS-PROT entry is copyright. It is produced through a collaboration

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its 

CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 
CC or send an email to license@isb-sib.ch).

DR EMBL; AF023450; AAC17967.1; -.
DR EMBL; AF023449; AAC17966.1; -.
DR EMBL; AL163283; CAB90464.1; -.
DR EMBL; AL163282; CAB90436.1; -.
DR EMBL; AL163281; CAB90444.1; -.
DR MIM; 602523; -.
DR INTERPRO; IPR001777; -.
DR INTERPRO; IPR003006; -.
DR PFAM; PF00041; fn3; 6.
DR PFAM; PF00047; ig; 9.
DR PRINTS; PR00014; FNTYPEIII.

KW Immunoglobulin domain; Glycoprotein; Signal; Cell adhesion; Repeat;
KW Transmembrane; Alternative splicing.



FT   SIGNAL        1     17       POTENTIAL.
FT   CHAIN        18   2012       DOWN SYNDROME CELL ADHESION MOLECULE.
FT   DOMAIN       18   1595       EXTRACELLULAR (POTENTIAL).
FT   TRANSMEM   1596   1616       POTENTIAL.
FT   DOMAIN     1617   2012       CYTOPLASMIC (POTENTIAL).
FT   DOMAIN       39    109       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      138    204       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      239    300       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      328    392       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      421    491       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      518    582       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      610    676       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      704    773       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      802    872       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      885    972       FIBRONECTIN TYPE-III.
FT   DOMAIN      984   1076       FIBRONECTIN TYPE-III.
FT   DOMAIN     1088   1177       FIBRONECTIN TYPE-III.
FT   DOMAIN     1189   1273       FIBRONECTIN TYPE-III.
FT   DOMAIN     1300   1366       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN     1380   1463       FIBRONECTIN TYPE-III.
FT   DOMAIN     1477   1562       FIBRONECTIN TYPE-III.
FT   DISULFID     46    102       BY SIMILARITY.
FT   DISULFID    145    197       BY SIMILARITY.
FT   DISULFID    246              BY SIMILARITY.
FT   DISULFID                     BY SIMILARITY.
FT   DISULFID    428    484       BY SIMILARITY.
FT   DISULFID    525    575       BY SIMILARITY.
FT   DISULFID    617    669       BY SIMILARITY.
FT   DISULFID    711    766       BY SIMILARITY.
FT   DISULFID    809    865       BY SIMILARITY.
FT   DISULFID                     BY SIMILARITY.
FT   CARBOHYD     28     28       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD                     N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    470    470       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    487    487       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    512    512       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    556    556       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    658    658       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    666    666       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    710    710       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    748    748       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    795    795       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD                     N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD   1142   1142       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD                     N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD                     N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD   1271   1271       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD                     N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD   1488   1488       N-LINKED (GLCNAC...) (POTENTIAL).
FT   VARSPLIC   1562   1571       NFATLNYDGS -> KEAARCKEFS (IN SHORT
FT                                ISOFORM).
FT   VARSPLIC   1572   2012       MISSING (IN SHORT ISOFORM).
FT   CONFLICT   1893   2012       HRPGDLIHLPPYLRMDFLLNRGGPGTSRDLSLGQACLEPQK
FT                                SRTLKRPTVLEPIPMEAASSASSTREGQSWQPGAVATLPQR
FT                                EGAELGQAAKMSSSQESLLDSRGHLKGNNPYAKSYTLV ->
FT                                IGQVTSYICLHTLEWTFC (IN REF. 1).
SQ   SEQUENCE   2012 AA;  222259 MW;  0E33CFB781A08334 CRC64;

Query Match 8.7%; Score 598; DB 1; Length 2012;

Best Local Similarity 25.2%; Pred. No. 4.3e-25;

Matches 230; Conservative 131; Mismatches 365; Indels 188; Gaps

Qy 16 YINFDKIPNASNLAPVIIEHPI DVWSRGSPATLNCGAKPS - TAKITWYKDG 66

II: :l : h:l : III I :| I I : III I

Db 386 FVRKDKL-SAQDYVQWLEDGTPKIISAFSEKWSPAEPVSLMCNVKGTPLPTITWTLDD 444

Qy 67 QPVITNKEQVNSHRI---VLDTGSLF-LLKVNSGKNGKDSDAGAYYCVASNEHGEVKSNE 122

I:: 'llll : I;; I :; : I I I I hi I I
Db 445 DPILKG-- -GSHRISQMITSEGNWSYLNISS---SQVRDGGVYRCTANNSAGW-- 493

Qy 123 GSLKLAMLREDFR-- -VRP-RTVQALGGEMAVLECSPPRGFPEPWSWRKDDKELRIQD 177
I hi =11 : = h I : I hi : I h I
Db 494 --LYQARINVRGPASIRPMKNITAIAGRDTYIHCR-VIGYPYYSIKWYKNSNLLPFNH 548

Qy 178 MPRYTLHSDGNLIIDPVDRS-DSGTYQCVANNMVGERVSN PARLSVFEKP 226

: ::| 1.: | : | | | | | :| ::| 1 : III
Db 549 R-QVAFENNGTLKLSDVQKEVDEGEYTC--NVLVQPQLSTSQSVHVTVKVPPFIQPFEFP 605

Qy 227 KFEQEPKDMTVDVGAAVLFDC-RVTGDPQPQITWKRKNEPMPVTRAYIAKDN- - - -RGLR 281

:| • :| 1 1 Ml llh: hi : : II II
Db 606 RF SIGQRVFI PCVWSGDLPIT ITWQKDGRPIPGSLG - VT I DNIDFTSSLR 655

Qy 282 IERVQPSDEGEYVCYARNPAGTLEASAHLRVQAPPSFQTKPADQSVPAGGTATFECTLVG 341

1 : 1' 1 1 III 1 :| : 1 1= II 1 :| II 1 hi
Db 656 ISNLSLMHNGNYTCI ARNEAAAVEHQSQLIVRVPPKFWQPRDQDG I YGKAVI LNCSAEG 715

Qy 342 QPSPAYFW - - SKEGQQDLLFPSYVSADGRTKVSPTGTLT IEEVRQVDEGAYVCAGMNSAG 399

I I I I! I :!! :| |:| I: I : I I |:| I I 

Db 716 YPVPTIVWKFSKGAGVPQFQP- - IALNGRIQVLSNGSLLIKHWEEDSGYYLCKVSNDVG 773 

Qy 400 SSLSKAALKAT FETKG RVQKKKSKMGKQKQKNVQS 1 1 KTLISAVTGNT PAKPPPTIEHGH 459 

: :l|: II : | : : 

Db 774 ADVSKS MYLTVKI PAMITSY 793 

Qy 460 QNQTLMV-GSSAILPCQASGKPTPGISWLRDGLPID ITDSRISQHSTGSLHIA 511 

. I II I : I I I: : I :: |: :: : : :| | 
Db 794 PNTTLATQGQKKEMSCTAHGEKP I IVRWEKEDRI INPEMARYLVSTKEVGEEVISTLQ IL 853 

Qy 512 DLKKPDTGVYTCIAKNEDGESTWSASLTVEDHTSNAQFVKMPDPSNFPSSPTQPIIVNVT 571 

: hi ::l I I II III:: I I : I :| 

Db 854 PTVREDSGFFSCHAI NSYGEDRG I IQLTVQE PPDPPEIEIKDYK 897 

Qy 572 DTEVELHWNAPSTSGAGPITGYIIQYYSPDLGQTWFNI PDYVASTEYRIKGLK 624 

: I I I Mill I: : :| : I ::| I : 

Db 898 ARTITLRWTM-GFDGNSPITGYDIE--CKNKSDSWDSAQRTKDVSPQLNSAT---IIDIH 951 

Qy 625 PSHSYMFVIRAENEKGIGTPSVSSALOTTSKPMQVALSDKNKMDMAIAEKRLTSEQLIK 684 
ft II :| = 1:1 II |: I |: II I : 

Db 952 PSSTYSIRMYAKNR--IGKSEPSNELTITADEAAPDG PPQEVH 992

Qy 685 LEEVKT I NSTAVRLFWK - -KRKLEE - LIDGY YIKWRGPPRTNDNQYVNV — TSPSTEN 737 

II hi ::h II I: I: :| II I :| II |: || :| 
Db 993 LE--PISSQSIRVTWKAPKKHLQNGIIRGYQIGYR-EYSTGGNFQFNIISVDTSGDSEV 1048 

Qy 738 YWSNLMPFTNYEFFVIPYHSGVHSIHGAPSNSMDVLTA-EAPPSLPPEDVRIRMLNLT 795 

I : M III I : ■ I :| :::| I II ll|:|: : 
Db 1049 YTLDNLNKFTQYGLWQACNRA GTGPSSQEIITTTLEDVPSYPPENVQAIATSPE 1103 

Qy 796 TLRI SWKAPKADG ING I LKGFQ I VI VGQAPNNN — RNITTNERAASVTLFHLVTGMTY 851 

:: III : :llll:l|::: : :||ll : |: I I I 
Db 1104 SISI SWSTLSKEALNG I LQGFRVI YWANLMDGELGEIKNITTTQ - - PSLELDGLEKYTNY 1161 

Qy 852 KIRVAARSNGGVGV 865 

1:1 I : I II 
Db 1162 SIQVLAFTRAGDGV 1175 



RESULT 6

CONT_CHICK

ID CONT_CHICK STANDARD; PRT; 1010 AA.

AC P14781; P10450;

DT 01-MAR-1989 (Rel. 10, Created)

DT 01-APR-1990 (Rel. 14, Last sequence update)

DT 15-JUL-1999 (Rel. 38, Last annotation update)

DE CONTACTIN PRECURSOR (NEURAL CELL RECOGNITION MOLECULE F11).
OS Gallus gallus (Chicken).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Archosauria; Aves; Neognathae; Galliformes; Phasianidae; Phasianinae;
OC Gallus.
RN [1]

RP SEQUENCE FROM N.A.

RX MEDLINE=90180453; PubMed=2627374;

RA Bruemmendorf T., Wolff J.M., Rainer F., Rathjen F.G.;

RT "Neural cell recognition molecule F11: homology with fibronectin type

RT III and immunoglobulin type C domains."; 

RL Neuron 2:1351-1361(1989). 

RN [2] 

RP SEQUENCE FROM N.A. 

RC STRAIN=WHITE LEGHORN;

RX MEDLINE=89008597; PubMed=3049624;

RA Ranscht B., Dours M.T.;

RT "Sequence of contactin, a 130-kD glycoprotein concentrated in areas
RT of interneuronal contact, defines a new member of the immunoglobulin
RT supergene family in the nervous system.";
RL J. Cell Biol. 107:1561-1573(1988).
RN [3]

RP GPI-ANCHOR.

RX MEDLINE=89286606; PubMed=2735929;

RA Wolff J.M., Bruemmendorf T., Rathjen F.G.;

RT "Neural cell recognition molecule F11: membrane interaction by

RT covalently attached phosphatidylinositol."; 

RL Biochem. Biophys. Res. Commun. 161:931-938(1989). 

CC -!- FUNCTION: MEDIATES CELL SURFACE INTERACTIONS DURING NERVOUS

CC SYSTEM DEVELOPMENT.

CC -!- SUBCELLULAR LOCATION: ATTACHED TO THE MEMBRANE BY A GPI-ANCHOR.

CC -!- SIMILARITY: CONTAINS 6 IMMUNOGLOBULIN-LIKE C2-TYPE DOMAINS.

CC -!- SIMILARITY: CONTAINS 4 FIBRONECTIN TYPE III-LIKE DOMAINS.

CC -!- CAUTION: REF.2 SEQUENCE DIFFERS FROM THAT SHOWN IN THE C-TERMINUS
CC AND IS LONGER DUE TO A FRAMESHIFT.

CC --------------------------------------------------------------------------

CC This SWISS-PROT entry is copyright. It is produced through a collaboration

CC between the Swiss Institute of Bioinformatics and the EMBL outstation - 

CC the European Bioinformatics Institute. There are no restrictions on its 

CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 

CC or send an email to license@isb-sib.ch). 

CC --------------------------------------------------------------------------

DR EMBL; X14877; CAA33018.1; -. 

DR EMBL; Y00813; CAA68753.1; ALT FRAME. 

DR PIR; JU0094; JU0094. 

DR PIR; S01998; S01998. 

DR HSSP; P20241; 1CFB. 

DR INTERPRO; IPR001777; -. 

DR INTERPRO; IPR003006; -. 

DR PFAM; PF00041; fn3; 4. 

DR PFAM; PF00047; ig; 6. 

KW Immunoglobulin domain; Glycoprotein; Signal; GPI-anchor;

KW Cell adhesion; Repeat. 



FT   SIGNAL        1     19
FT   CHAIN        20      ?       CONTACTIN.
FT   PROPEP        ?   1010       REMOVED IN MATURE FORM.
FT   DOMAIN       50    113       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      143    210       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      247    308       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      336    389       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      420    482       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      510    581       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      593    599       GLY/PRO-RICH.
FT   DOMAIN      600    701       FIBRONECTIN TYPE-III.
FT   DOMAIN      702    803       FIBRONECTIN TYPE-III.
FT   DOMAIN      804    900       FIBRONECTIN TYPE-III.
FT   DOMAIN      901    996       FIBRONECTIN TYPE-III.
FT   CARBOHYD    200    200       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    249    249       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    329    329       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    448    448       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    464    464       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    485    485       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    512    512       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    582    582       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    924    924       N-LINKED (GLCNAC...) (POTENTIAL).
SQ   SEQUENCE   1010 AA;  112507 MW;  2E38F071AF423AE1 CRC64;



Query Match 8.7%; Score 597; DB 1; Length 1010;

Best Local Similarity 23.6%; Pred. No. 1.8e-25; 

Matches 268; Conservative 140; Mismatches 364; Indels 364; Gaps 47; 

Qy 6 FYHTHTHTHTYI NFDK IPNASNIAPVI IEHPIDVW- - - SRGS PAT LNCGAKP 55 

h I :l :| : I II I III I ::|| |: 
Db 3 FFISHLVTLCFIFCVADSTHFSEEGN-KGYGPVFEEQPIDTIYPEESSDGQVSMNCRAR- 60 

Qy 56 STAKITWYKDGQPVITNKEQVNSHRIVL-DTGSLFLLKVNSGKNGKDSDAGAYYCVASN 113 

I I I ::h MM::: I III I II II 
Db 61 -AVPFPTYKWKLNNWDIDLTKDRYSMVGGRLVISNPEKSRDAGKYVCWSN 110 

Qy 114 EHGEVKSNEGSLKLAML REDFRVRPRTVQALGGEMAVLECSPPRGFPEPV-VSWR 167 

I hhl M I I : h I I III I II :|: : I 
Db 111 IFGTVRSSEATLSFGYLDPFPPEEHYEV1VRE GVGAVLLCEPPYHYPDDLSYRWL 165 

Qy 168 KDDKELRIQ-DMPRYTLHSDGNLIIDPVDRSDSGTYQCVANNMVGERVSNPA-RLSVFEK 225 



I I I: I Mill 1:1: III I

Db 166 LNEFPVFIALDRRRFVSQTNGNLYIANVEASDKGNYSCF--------VSSPSITKSVFSK 217



Qy 226 PKFEQE — PKDMTVD VGAAVLFDCRVTGDPQPQITWKRKNEPMPVT - 269 

I: :: I h I :| I :| |:| |:: I : llll I 

Db 218 FIPLIPQADRAKVYPADIKVKFKDTYALLGQNVTLECFALGNPVPELRWSKYLEPMPATA 277 

Qy 270 RAYI AKD— 276 

I h II 
Db 278 EISMSGAVLKIFNIQYEDEGLYECEAENYKGKDKHQARVYVQASPEWVEHINDTEKDIGS 337 

Qy 277 NRG-LRIERVQPSDEGEYVCYARNPAGTLEASAHL 310

:| III: : I I I I I I I : |:| I 
Db 338 DLYWPCVATGKPIPTIRWLKNGVSFRKGELRIQGLTFEDAGMYQCIAENAHGIIYANAEL 397 

Qy 311 RVQA-PPSFQTKPADQSVPA--GGTATFECTLVGQPSPAYFWSKEGQQDLLFPSYVSADG 367
:: I 11:1: I : : I II II I I : llll : |: I

Db 398 KIVASPPTFELNPMKKKILAAKGGRVIIECKPKAAPKPKFSWSR-GTELLVNGS 450

Qy 368 RTKVSPTGTLTIEEVRQVDEGAYVCAGMNSAGSSLSKAALKATFETKGRVQKKKSKMGKQ 427
I : hi I I "III II |: I : I Ml: 
Db 451 RIHIWDDGSLEIINVTKLDEGRYTCFAENNRGKANSTGVLEMTEATR 497 

Qy 428 KQKNVQSIIKYLISAVTGNTPAKPPPTIEHGHQNQTLMVGSSAILPCQASGKPTPGIS-- 485

, I M II :| : I II II :: 
Db 498 ITLAPLNVDVTVGENATMQCIASHDPTLDLTFI 530 

Qy 486 WLRDGLPIDIT DSRISQHSTGSLHIADLKKPDTGVYTCIAKNEDGESTWSASLT 539

I :| II : :| I I I I ::: I Ml |: |: I 

Db 531 WSLNGFVIDFEKEHEHYERNVMIKSNGELLIKNVQLRHAGRYTCTAQTIVDNSSASADLV 590 

Qy 540 VEDHTSNAQFVRMPDPSNPPSSPTQPIIVNVTDTEVELHWNAPSTSGAGPITGYIIQYYS 599 

III I jl M II I II: I 11:111 
Db 591 VRGP PGPPGGIRIEEIRDTAVALTWSR-GTDNHSPISKYTIQ-SK 633 

Qy 600 PDLGQTWFNIPDYVASTE— | YRIRGLKPSHSYMFVIRAENERGIGTPSVSSA 649 

1:1: III hi I I I I I I I I II: I 

Db 634 TFLSEEWKD AKTEPSDIEGNMESARVIDLIPWMEYEFRIIATNTLGTGEPSMPSQ 688 

Qy 650 LVTTSKPAAQVALSD [ KNRMDMAIAER RL 677 

.:l INI 1 I :l I I: 

Db 689 RIRTEGAPPNVAPSDVGGGGGSNRELTITWMPLSREYHYGNNFGYIVAFRPFGEKEWRRV 748 



Qy 678 T SEQLIKLE EV 688

I * :: :|:: I

Db 749 TVTNPEIGRYVHKDESMPPSTQYQVKVKAFNSKGDGPFSLTAVIYSAQDAPTEVPTDVSV 808

Qy 689 KTINSTAVRLFWRKRKLEELIDGYYIR-WRGPPRTNDNQYVNVTSPSTENY--WSNLMP 745

I "I: : : I |: :jll |: I : I I I I : I : III
Db 809 KVLSSSEISVSW-HHVTEKSVEGYQIRYWAAHDKEAAAQRVQV-- SNQEYSIKLENLRP 864

Qy 746 FTNYEFFVIPYHSGVHSIHGAPSNSMDVLTAEAPPSLPPEDVRIRMLNLTTLR I 799

II I ::| : :| II ::|::l Mill I II ::::l I
Db 865 NTRYHIDVSAFNS- ■ -AGYGPPSRTIDIIIRKAPPSQRP- --RI - - ■ -ISSVRSGSRYII 914



Qy 800 SWKAPKADGINGILRGFQIVIVGQAPNNNRNITTNERAASVTLFHLVTGMTYKIRVAARS 859 

:| II ::|:::: : : :| : I : : I : I I I 
Db 915 TWDHVKAMSNES AVEGYKVLYRPDGQHEGKLFSTGKHT I EVP - - -VPSDGEYWEVRAHS 971 

Qy 860 NGGVG VSHGTSEVIMNQDTLEKHLAAQQENESFLYGLINKSHVPVIVIVA 909 

III :| I: I : I II: :| : ::l 

Db 972 EGGDGEVAQIKISGATAGV PTLILGLV — LPALGVLA 1006 



RESULT 7
CONT_MOUSE
ID CONT_MOUSE STANDARD; PRT; 1020 AA.
AC P12960;
DT 01-JAN-1990 (Rel. 13, Created)
DT 01-JAN-1990 (Rel. 13, Last sequence update)
DT 15-JUL-1999 (Rel. 38, Last annotation update)
DE CONTACTIN PRECURSOR (NEURAL CELL SURFACE PROTEIN F3).
GN CNTN1.
OS Mus musculus (Mouse).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=C57BL/6; TISSUE=BRAIN;
RX MEDLINE=89340657; PubMed=2474555;
RA Gennarini G., Cibelli G., Rougon G., Mattei M.-G., Goridis C.;
RT "The mouse neuronal cell surface protein F3: a phosphatidylinositol-
RT anchored member of the immunoglobulin superfamily related to chicken
RT contactin.";
RL J. Cell Biol. 109:775-788(1989).
CC -!- FUNCTION: MEDIATES CELL SURFACE INTERACTIONS DURING NERVOUS
CC SYSTEM DEVELOPMENT.
CC -!- SUBCELLULAR LOCATION: ATTACHED TO THE MEMBRANE BY A GPI-ANCHOR.
CC -!- MISCELLANEOUS: F3 SHARES WITH L1, N-CAM, MAG, AND OTHER CELL
CC ADHESION MOLECULES FROM NERVOUS TISSUE THE L2/HNK-1 CARBOHYDRATE
CC EPITOPE.
CC -!- SIMILARITY: CONTAINS 6 IMMUNOGLOBULIN-LIKE C2-TYPE DOMAINS.
CC -!- SIMILARITY: CONTAINS 4 FIBRONECTIN TYPE III-LIKE DOMAINS.
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
DR EMBL; X14943; CAA33075.1; -.
DR PIR; S05944; S05944.
DR HSSP; P40189; 1BQU.
DR MGD; MGI:105980; CNTN1.
DR INTERPRO; IPR001777; -.
DR INTERPRO; IPR003006; -.
DR PFAM; PF00041; fn3; 4.
DR PFAM; PF00047; ig; 6.
KW Immunoglobulin domain; Glycoprotein; Signal; GPI-anchor;
KW Cell adhesion; Repeat.
FT   SIGNAL        1     20
FT   CHAIN        21      ?       CONTACTIN.
FT   PROPEP        ?   1020       REMOVED IN MATURE FORM.
FT   DOMAIN       58    121       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      151    218       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      256    317       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      345    398       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      429    491       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      519    592       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      604    611       GLY/PRO-RICH.
FT   DOMAIN      611    712       FIBRONECTIN TYPE-III.
FT   DOMAIN      713    814       FIBRONECTIN TYPE-III.
FT   DOMAIN      815    910       FIBRONECTIN TYPE-III.
FT   DOMAIN      911   1006       FIBRONECTIN TYPE-III.
FT   DISULFID     65    114       BY SIMILARITY.
FT   DISULFID    158    211       BY SIMILARITY.
FT   DISULFID    263    310       BY SIMILARITY.
FT   DISULFID    352    391       BY SIMILARITY.
FT   DISULFID    436    484       BY SIMILARITY.
FT   DISULFID    526    585       BY SIMILARITY.
FT   CARBOHYD    208    208       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    258    258       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    338    338       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    457    457       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    473    473       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    494    494       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    521    521       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    593    593       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    935    935       N-LINKED (GLCNAC...) (POTENTIAL).
SQ   SEQUENCE   1020 AA;  113388 MW;  9DCDAA40EAA4CBC7 CRC64;




Query Match 8.7%; Score 595; DB 1; Length 1020;

Best Local Similarity 24.0%; Pred. No. 2.3e-25;
Matches 263; Conservative 127; Mismatches 381; Indels 324; Gaps 43;

Qy 4 LGFYHTH-THTHTYINFDKIPNASNLAPVIIEHPIDVWSRGS—PATLNCGAKPSTM 59 

II : I : I II |: I ||: : I :||| |: I 

Db 19 LGDFTWHRRYGHGVSEEDK GFGPIFEEQPINTIYPEESLEGRVSLNCRARASPFP 73 

Qy 60 ITWYK — DGQPVITNKEQVNSHRIVLDTGSLFLLKVNSGKNGKDSDAGAYYCVASNEH 115 

: II :l HI I : hi : I III II hill : 

Db 74 V'-YKWRMNKGDVDLTN DRYSMVGGNLVI NNPDKQKDAGVYYCLASNNY 120 

Qy 116 GEVKSNEGSLKLAMLREDFRVRPR-TVQALGGEMAVLECSPPRGFPEPV-VSWRKDDKEL 173 

I hi I :| I : I I h h II I II lh : I :: : 
Db 121 GMVRSTEATLSFGYL-DPFPPEERPEVKVKEGRGMVLLCDPPYHFPDDLSYRWLLNEFPV 179 

Qy 174 RI -QDNPRYTLHSDGNLI IDPVDRSDSGTYQCVANNMVGERVSNPA- RLSVFEK 225 

I I h -III I I: II I I I l|:|: III I 

Db 180 FITMDKRRFVSQTNGNLYIANVESSDRGNYSCF VSSPSITKSVFSKFIPLIP 231 



Qy 226 -PKFEQEP---------KDMTVDVGAAVLFDCRVTGDPQPQITWKRKNEPMPVTRAYIAK 275
h :l lh :| I :| hi I I h: Ml I I |:

Db 232 IJERTTKPYPADIWQFKDIYTMMGQNVTLECFALGHPVPDIRWRKVLEPMPST-AEIST 290

Qy 276 DNRGLRIERVQPSDEGEYVCYARNPAGTLEASAHLRVQAPPSFQTKPADQSVPAGGTATF 335

hi :l III II I II : I : III I : III :
Db 291 SGAVLKIFNIQLEDEGLYECEAENIRGKDKHQARIYVQAFPEWVEHINDTEVDIGSDLYW 350

Qy 336 ECTLVGQPSPAYFWSKEGQQDLLFPSYVSADGRTKVSPTGTLTIEEVRQVDEGAYVCAGM 395

I' hi I I I I II I I : :| : I I I

Db 351 PCIATGRPIPTIRWLKNGY SY HKGELRLYDVTFENAGMYQCIAE 394

Qy 396 NSAGSSLSKAALK ATFETRGRVQRKKSRMGKQRQKNVQSIIKYLISAVTGNTPAK 450

I: II : I II III I hi :::
Db 395 NAYGSIYANAELKILAIAPIFE MNPMKKK ILAA 427

Qy 451 PPPTIEHGHQNQTLMVGSSAILPCQASGKPTPGISWLRDGLPIDITDSRISQHSTGSLHI 510

I I: I: I I II : I : III III I
Db 428 KGGRVIIECKPKAAPKPKFSWSK-GTEWLVNSSRILIWEDGSLEI 471

Qy 511 ADLKRPDTGVYTCIARNEDGESTWSASLTVEDHT SNAQFVRMPD 554

:: : I hill hi |:: : :| : : I : I I

Db 472 NNITRNDGGIYTCFAENNRGKANSIGTLVITNPTRIILAPINADITVGENATMQCAASFD 531

Qy 555 PS NF 558

h II

Db 532 PALDLTFVWSFNGYVIDFNKEITHIHYQRNFMLDANGELLIRNAQLKHAGRYTCTAQTIV 591

Qy 559 PSSPTQPIIVNVTDTEVELHWKAPSTSGAGPITGYIIQYYSPDLGQ 604

II I :: I I I I: I : : ||: I II
Db 592 DNSSASADLWRGPPGPPGGLRIEDIRATSVALTWSRGSDNHS-PISKYTIQ-TKTILSD 649

Qy 605 TWFNI - - -PDYVASTEYRIKG- -LKPSHSYMFVIRAENEKGIGTPSVSSALVTTSKPAAQ 659

I : I U I II I I : I I I I lh I : I I
Db 650 DWKDAKTDPPIIbGNMESAKAVDLIPWMEYEFRWATNTLGTGEPSIPSNRIKTDGAAPN 709

Qy 660 VALSD RNRMDMAIAEKRLTSEQLIRLE 686

II II I :| I h h

Db 710 VAPSDVGGGGGTNRELTITWAPLSREYHYGNNFGYIVAFKPFDGEEWKKVTVTNPDIGRY 769

Qy 687 EVKTINSTAVRL 698

II ::h : :

Db 770 VHKDETMTPSTAFQVKVKAFNNKGDGPYSLVAVINSAQDAPSFiPTEVGVKVLSSSEISV 829

Qy 699 FWRKRKLEELIDGYYIR-WRGPPRTNDNQYVNVTSPSTENYV--VSNLMPFTNYEFFVIP 755

I I II:::: I |: I I : I III : I : Ihl I I I
Db 830 HW-KHVLEKIVESYQIRYWAGHDREAAAHRVQVTS-- QEYSARLENLLPDTQYFIEVGA 885

Qy 756 YRSGVHSIHGAPSNSMDVLTAEAPPSLPPEDVRIRMLNLTTLR ISWKAPKADGI 809

:| : I h :: I :|||| II II ::::| hi I
Db 886 CNS- • -AGCGPSSDVIETFTRKAPPSQPP- - -RI- - "ISS.VRSGSRYIITWDHWALSN 935

Qy 810 NGILKGFQIVIVGQAPNNNRNITTNERAASVTLFHLWGMTYRIRVAARSNGGVGVSHGT 869

: h:h :: : :|:: : I : I : I I 1 : 1 1 II
Db 936 ESTVTGYKILYRPDGQHDGKLFSTHRHSIEVP-- IPRDGEYWEVRAHSDGGDGV-- V 989

Qy 870 SEV- IMNQDTLEKHL 883

hi I II I
Db 990 SQVKISGVSTLSSSL 1004



RESULT 8

CONT_HUMAN

ID CONT_HUMAN STANDARD; PRT; 1018 AA.

AC Q12860; Q12861; Q14030;

DT 01-NOV-1997 (Rel. 35, Created)

DT 01-NOV-1997 (Rel. 35, Last sequence update)

DT 01-OCT-2000 (Rel. 40, Last annotation update)

DE CONTACTIN PRECURSOR (GLYCOPROTEIN GP135).

GN CNTN1.

OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
RN [1]

RP SEQUENCE FROM N.A., AND PARTIAL SEQUENCE.
RC TISSUE=BRAIN;
RX MEDLINE=95048335; PubMed=7959734;
RA Berglund E.O., Ranscht B.;

RT "Molecular cloning and in situ localization of the human contactin
RT gene (CNTN1) on chromosome 12q11-q12.";
RL Genomics 21:571-582(1994).
RN [2]

RP SEQUENCE FROM N.A., AND CHARACTERIZATION.
RX MEDLINE=94217459; PubMed=8164510;
RA Reid R.A., Hemperly J.J.;

RT "Identification and characterization of the human cell adhesion

RT molecule contactin.";

RL Brain Res. Mol. Brain Res. 21:1-8(1994).

CC -!- FUNCTION: MEDIATES CELL SURFACE INTERACTIONS DURING NERVOUS
CC SYSTEM DEVELOPMENT.
CC -!- SUBCELLULAR LOCATION: ATTACHED TO THE MEMBRANE BY A GPI-ANCHOR.
CC -!- ALTERNATIVE PRODUCTS: 2 ISOFORMS; 1 (SHOWN HERE) AND 2; ARE
CC PRODUCED BY ALTERNATIVE SPLICING.
CC -!- SIMILARITY: CONTAINS 6 IMMUNOGLOBULIN-LIKE C2-TYPE DOMAINS.
CC -!- SIMILARITY: CONTAINS 4 FIBRONECTIN TYPE III-LIKE DOMAINS.
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
DR EMBL; U07819; AAA67920
DR EMBL; U07820; AAA67921
DR EMBL; Z21488; CAA79696
DR HSSP; P40189; 1BQU.
DR MIM; 600016; -.
DR INTERPRO; IPR001777; -
DR INTERPRO; IPR003006; -
DR PFAM; PF00041; fn3; 4.
DR PFAM; PF00047; ig; 6.
KW Immunoglobulin domain; Glycoprotein; Signal; GPI-anchor;
KW Cell adhesion; Repeat; Alternative splicing.




FT   SIGNAL        1     20
FT   CHAIN        21      ?       CONTACTIN.
FT   PROPEP        ?   1018       REMOVED IN MATURE FORM.
FT   DOMAIN       58    121       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      151    218       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      256    317       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      345    398       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      429    491       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      519    590       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      602    609       GLY/PRO-RICH.
FT   DOMAIN      609    710       FIBRONECTIN TYPE-III.
FT   DOMAIN      711    812       FIBRONECTIN TYPE-III.
FT   DOMAIN      813    908       FIBRONECTIN TYPE-III.
FT   DOMAIN      909   1004       FIBRONECTIN TYPE-III.
FT   DISULFID     65    114       BY SIMILARITY.
FT   DISULFID    158    211       BY SIMILARITY.
FT   DISULFID    263    310       BY SIMILARITY.
FT   DISULFID    352    391       BY SIMILARITY.
FT   DISULFID    436    484       BY SIMILARITY.
FT   DISULFID    526    583       BY SIMILARITY.
FT   CARBOHYD    208    208       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    258    258       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    338    338       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    457    457       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    473    473       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    494    494       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    521    521       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    591    591       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    933    933       N-LINKED (GLCNAC...) (POTENTIAL).
FT   VARSPLIC     21     31       MISSING (IN ISOFORM 2).
FT   CONFLICT    798    798       V -> L (IN REF. 2).
SQ   SEQUENCE   1018 AA;  113320 MW;  4B8FDC5BFD434ED5 CRC64;




Query Match 8.6%; Score 593; DB 1; Length 1018; 

Best Local Similarity 23.5%; Pred. No. 3e-25; 

Matches 246; Conservative 126; Mismatches 361; Indels 316; Gaps 36; 

Qy 30 PVIIEHPIDVVVSRGS--PATLNCGAKPSTAKI-TWYKDGQPVITNKEQVNSHRIVLDT 85

I: II!:: I :MI h I : I : I : | | : 
Db 41 PIFEEQPINTIYPEESLEGKVSLNCRARASPFPVYKWRMNNGDV DLTSDRYSMVG 95 



86 GSLFLLKVNSGKNGKDSDAGAYYCVASNEHGEVKSNEGSLKLAML REDFRVRP 138 

hi : I III 111:111 :| I : I I : f I : ||: 

96 GNLVI NNPDKQKDAGIYYCLASNNTGMVRSTEATLSFGYLDPFPPEERPEVRVKE 150 

139 RTVQALGGEMAVLECSPPRGFPEPV-VSWRKDDKELRI-QDMPRYTLHSDGNLIIDPVDR 196 

I: II I II II: : I :: : I I |: ::||| I |: 

151 GKGMVLLCDPPYHFPDDLSYRWLLNEFPVFITMDKRRFVSQTNGNLYIANVEA 203 



197 SDSGTYQCVANNMVGERVSNPA- RLSVFEK PKFEQEP KDMTVDV 239 

II I I I MM: III I I: :| ||: : 

204 SDKGNYSCF VSSPSITKSVFSKFIPLIPIPERTTKPYPADIWQFKDVYALM 255 

240 GAAVLFDCRVTGDPOPQITWKRKNEPMPVTRAYIAKDNRGLRIERVQPSDEGEYVCYARN 299 

I I H 1:1 I I h: Nil I I |: |:| :| II I I I 
256 GONVTLECFALGNPVPDIRWRKVLEPMPST-AEISTSGAVLKIFNIQLEDEGIYECEAEN 314 

Qy 300 PAGTLEASAHLRVQAPPSFQTKPADQSVPAGGTATFECTLVGQPSPAYFWSKEGQQDLLF 359

I : I : III I = III : I |:| I I I I 
315 IRGKDKHQARIYVOAFPEWVEHINDTEVDIGSDLYWPCVATGKPIPTIRWLKNGY 369 

360 PSYVSADGRTKVSPTGTLTIEEVRQVDEGAYVCAGMNSAGSSLSKAALK ATFETK 414 

II:: : I M I: |; : M III 
370 AYHKGELRLYDVTFENAGMYQCIAENTYGAIYANAELKILALAPTFE" 416 

415 GRVQKKKSKMGKQKQKNVQSIIKYLISAVTGNTPAKPPPTIEHGHQNQTLMVGSSAILPC 474 

I hi :::| I h I 

417 MNPMKKK ILAA KGGRVIIEC 436 

475 QASGKPTPGISWLRDGLPIDITDSRISQHSTGSLHIADLKKPDTGVYTCIAKNEDGESTW 534 

: I I II : I : III III::: hill |:| :: 
437 KPKAAPKPKFSWSK-GTEWLVNSSRILIWEDGSLEINNITRNDGGIYTCFAENNRGKANS 495 

535 SASLTVEDHT SNAQFVRMPDPS 556 

: :| : I I : I ||: 

496 TGTLVITDPTRIILAPINADITVGENATMQCAASFDPALDLTFVWSFNGYVIDFNKENIH 555 

557 — NF PSSPTQPIIVNV 570 

II II I 

556 YQRNFMLDSNGELLIRNAQLKHAGRYTCTAQTIVDNSSASADLWRGPPGPPGGLRIEDI 615 

571 TDTEVELHWNAPSTSGAGPITGYIIQ — YYSPDLGQTWFNIPDYVASTE-YRIKGLKPS 626 

I I I h I :: II: I II I | : Mill 
616 RATSVALTWSRGSDNHS-PISKYTIQTKTILSDDWKDAKTDPPIIEGNMEAARAVDLIPW 674 



Qy 627 HSYMFVIRAENEKGIGTPSVSSALVTTSKPAAQVALSD 664 

II : I ! II II: I : I I II 1 1 
Db 675 MEYEFRWATNTLGRGEPSIPSNRIKTDGAAPNVAPSDVGGGGGRNRELTITWAPLSREY 734 

Qy 665 --KNKMDMAIAEKRLTSEQLIKLE 686

I :| I h I: 
Db 735 HYGNNFGYIVAFKPFDGEEWKKVTVTNPDTGRYVHKDETMSPSTAFQVKVKAFNNKGDGP 794 

Qy 687 EVKTINSTAVRLFWKKRKLEELIDGYYIK-WRGPPRTND 724

II ::h : : I : II:::: I h I : 
Db 795 YSLVAVINSAQDAPSEAPTEVGVKVLSSSEISVHW-EHVLEKIVESYQIRYWAAHDKEEA 853 

Qy 725 NQYVNVTSPSIENYV-VSNLMPFTNYEFFVIPYHSGVHSIHGAPSNSMDVLTAEAPPSL 782 

I HI ■ I : IN II I : I 1 1 : : : I :|||| 

Db 854 ANRVQVTS- "QEYSARLENLLPDTQYFIEVGACNS- - -AGCGPPSDMIEAFTKKAPPSQ 907 

Qy 783 PPEDVRIRMLNLTTLR ISWKAPKADGINGILKGFQIVIVGQAPNNNRNITTNER 836 

II II hi I : |:::: :: : :|:: 

Db 908 PP---RI — ISSVRSGSRYIITWDHWALSNESTVTGYKVLYRPDGQHDGKLYSTHKH 960 

Qy 837 AASVT LF HLVTGMT YKIRVAARS NGG VGV 865 

: I : ' I : I I 1 : 1 1 II 
Db 961 SIEVP— IPRDGEYWEVRAHSDGGDGV 986 



RESULT 9
AXO1_RAT
ID AXO1_RAT STANDARD; PRT; 1040 AA.
AC P22063;
DT 01-AUG-1991 (Rel. 19, Created)
DT 01-AUG-1991 (Rel. 19, Last sequence update)
DT 15-JUL-1999 (Rel. 38, Last annotation update)
DE AXONIN-1 PRECURSOR (AXONAL GLYCOPROTEIN TAG-1).
GN TAX1.
OS Rattus norvegicus (Rat).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Rattus.
RN [1]
RP SEQUENCE FROM N.A., AND SEQUENCE OF 31-41.
RC TISSUE=SPINAL CORD;
RX MEDLINE=90199890; PubMed=2317872;
RA Furley A.J., Morton S.B., Manalo D., Karagogeos D., Dodd J.,
RA Jessell T.M.;
RT "The axonal glycoprotein TAG-1 is an immunoglobulin superfamily
RT member with neurite outgrowth-promoting activity.";
RL Cell 61:157-170(1990).

-!- FUNCTION: MAY PLAY A ROLE IN THE INITIAL GROWTH AND GUIDANCE OF
    AXONS. MAY BE INVOLVED IN CELL ADHESION.
-!- SUBCELLULAR LOCATION: ATTACHED TO THE NEURONAL MEMBRANE BY A
    GPI-ANCHOR AND IS ALSO RELEASED FROM NEURONS.
-!- TISSUE SPECIFICITY: IN NEURAL TISSUES IN EMBRYOS, AND IN ADULT
    BRAIN, SPINAL CORD AND CEREBELLUM.
-!- DEVELOPMENTAL STAGE: TRANSIENTLY EXPRESSED ON A SUBSET OF AXONS
    IN THE DEVELOPING RAT NERVOUS SYSTEM.
-!- SIMILARITY: CONTAINS 6 IMMUNOGLOBULIN-LIKE C2-TYPE DOMAINS.
-!- SIMILARITY: CONTAINS 4 FIBRONECTIN TYPE III-LIKE DOMAINS.

This SWISS-PROT entry is copyright. It is produced through a collaboration
between the Swiss Institute of Bioinformatics and the EMBL outstation -
the European Bioinformatics Institute. There are no restrictions on its
use by non-profit institutions as long as its content is in no way
modified and this statement is not removed. Usage by and for commercial
entities requires a license agreement (See http://www.isb-sib.ch/announce/
or send an email to license@isb-sib.ch).

EMBL; M31725; AAA42201.1; -.
PIR; A34695; A34695.
INTERPRO; IPR001777; -.
INTERPRO; IPR003006; -.
PFAM; PF00041; fn3; 4.
PFAM; PF00047; ig; 6.

Immunoglobulin domain; Glycoprotein; Signal; GPI-anchor;
Cell adhesion; Repeat.
FT   SIGNAL        1     30
FT   CHAIN        31  ?1015       AXONIN-1.
FT   PROPEP    ?1016   1040       REMOVED IN MATURE FORM.
FT   DOMAIN       56    120       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      150    218       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      256    315       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      343    404       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      435    497       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      525    596       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      608    614       GLY/PRO-RICH.
FT   DOMAIN      613    708       FIBRONECTIN TYPE-III.
FT   DOMAIN      716    811       FIBRONECTIN TYPE-III.
FT   DOMAIN      818    910       FIBRONECTIN TYPE-III.
FT   DOMAIN      911   1005       FIBRONECTIN TYPE-III.
FT   SITE        796              CELL ATTACHMENT SITE (POTENTIAL).
FT   CARBOHYD     78     78       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    200    200       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    206    206       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    463    463       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    479    479       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    500    500       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    527    527       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    777    777       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    832    832       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    920    920       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    942    942       N-LINKED (GLCNAC...) (POTENTIAL).
SQ   SEQUENCE   1040 AA;  113042 MW;  6E707EF6614CB4FB CRC64;




Query Match 8.5%; Score 585; DB 1; Length 1040;

Best Local Similarity 24.0%; Pred. No. 8.6e-25;

Matches 238; Conservative 113; Mismatches 381; Indels 258; Gaps 36; 

Qy 26 SNLAPVIIEHPIDVW— SRGSPATLNCGAKPS-TAKITWYKDGQPVITNKEQVNSHRI 81 

: !: I II :: I 1 1 I h I I I : I : I I : |: : 
Db 35 ATFGPIFEEQPIGLLFPEESAEDQVTLACRARASPPATYRWKMNGTDM--NLEPGSRHQL 92 

Qy 82 VLDTGSLFLLKVNSGKNGKDSDAGAYYCVASNEHGEVKSNEGSLKLAMLRE 132 

: 1:1 " I III I hill Mill: hi 

Db 93 M--GGNLVIMSPT KTQDAGVYQCLASNPVGTWSKEAVLRFGFLQEFSKEERDPV 145 



Qy 133 ■ 



Db 146 KTHEGWGVMLPCNPPAHYPGLSYRWLLNEFPNFIPTDGRHFVSQTTGNLYIARTNASDLG 205 
I 

Qy 133 DFRVR : PRTVQALGGEMAVLECSP 155 

II : t I II h III 

Db 206 NTSCIiATSHMDFSTKSWSKFAQLNLAAEDPRLFAPSIKARFPPETYALVGQQVTLECF- 264 

Qy 156 PRGFPEPVVSWRKDDKELRIQDMPRYTLHSDGNLIIDPVDRSDSGTYQCVANNMVGERVS 215
111:11111 I:: I I I I llhl I I II : 

265 AFGNPVPRIKWRKVDGSL----SPQWAT-AEPTLQIPSVSFEDEGTYECEAENSKG-RDT 318 

Qy 216 NPARLSVFEKPKFEQEPKDMTVDVGAAVLFDCRVTGDPQPQITWKRKNEPMPVTRAYIAK 275 

I: I :|:: : I l:|: : : I I hi : I I II :| 
Db 319 VQGRIIVQAQPEWLKVISDTEADIGSNLRWGCAAAGKPRPMVRWLRNGEP LAS 371 

Qy 276 DNR GLRIERVQPSDEGEYVCYARNPAGTLEASAHLRVQA-PPSFQTKPADQSVP 328 

II II :: I I I I I I lh III I III If: I : :| 
Db 372 QNRVEVLAGDLRFSKLSLEDSGMYQCVAENKHGT I YASAELAVQALAPDFRQNPVRRLI P 431 

Qy 329 A--GGTATFECTLVGQPSPAYFWSKEGQQDLLFPSYVSADGRTKVSPTGTLTIEEVRQVD 386 

I II : I I III : : I h III I : : I 

Db 432 AARGGEISILCQPRAAPKATILWSKGTE ILGNSTRVTVTSDGTLIIRNISRSD 484 

Qy 387 EGAYVCAGMNSAGSSLSKAALKATFETKGRVQKKKSKMGKQKQKNVQSIIKYLISAVTGN 446 

111111:11 II 
Db 485 EGKYTCFAENFMGKANSTGILSVRDATK 512 

Qy 447 TPAKPPPTIEHGHQNQTLMVGSSAILPCQASGKPTPGI--SWLRDGLPIDITD SR 499 

I : : II : I III II : :| I III I 
Db 513 ITLAPSSADINVGDNLTLQCHASHDPTMDLTFTWTLDDFPIDFDKPGGHYRR 564 



500 ISQHST-GSLHIADLKKPDTGVYTCIAKNEDGESTWSASLTVEDHTSNAQFVRMPDPSNF 558 

I I I I.I : I 1 1 1 : 1 : II I II I : I 

565 ASAKETIGDLTILNAHVRHGGKYTCMAQ TWDGTSKEATVLVRGP- * - 609 

559 PSSPTQPIIVNVTDTEVELHWNAPSTSGAGPITGYIIQYYSPDLGQTWFNIPDYVASTE- 617 

I I :: ;: II hi h II I :| : I 

610 PGPPGGVWRDIGDTTVQLSWSR-GFDNHSPIAKYTLQARTPPSGK-WKQVRTNPVNIEG 667 

618 — -YRIKGLKPSHSYMFVIRAENEKGIGTPSVSSALVTTSKPAAQVALSDKNKMDMAIA 673 

:: IN I I : I I I I II I: : I : II I : I 
668 NAETAQVLGIMPWMDYEFRVSASNILGTGEPSGPSSKIRTKEAVPSVAPSGLSGGGGAPG 727 

674 EKRLTSEQLIKLEEVKTINSTAVRLFWKKRKLEELID-GYYIKWRGPPRTNDNQYVNVT 731 

I II: llll:: I II : :| I : : 

728 E LI INWTPVSREYQNG DGFGYLLSFR- - -RQGSSSWQTAR 764 

732 SPSTE-NYW--SNLMPFTNYEFFVIPYHSGVHSIHGAPSNSMDVLTAEAPPSLPPEDV 787 

I : I I' :: hi :| : h I I : I :|l I : I I 
765 VPGADAQYFVYGNDSIQPYTPFEVKIRSYN— RRGDGPESLTALVYSAEEEPRVAPAKV 821 

788 RIRMLNLTTLRISWKAPKADGINGILKGFQIVIVGQAPNNNRNITTNERAASVTLFHLVT 847 

: : : ::':lh I Mill |::| :| I I : II 

822 WAKGSSSSEMNVSWE-PVLQDMNGILLGYEIRY-WKAGDNEAAADRVRTAGLDTSARVT 878 

848 GMT — YKIRVAARSNGGVGYSHGTSEVI 873 

h I :'"l I : II: ::: : 
879 GLNPNTKYHVTVRAYNRAGTGPASPSADAM 908 



RESULT 10
CAML_HUMAN
ID CAML_HUMAN STANDARD; PRT; 1257 AA.
AC P32004;
DT 01-JUL-1993 (Rel. 26, Created)
DT 01-OCT-1996 (Rel. 34, Last sequence update)
DT 01-OCT-2000 (Rel. 40, Last annotation update)
DE NEURAL CELL ADHESION MOLECULE L1 PRECURSOR (N-CAM L1).
GN L1CAM OR CAML1 OR MIC5.
OS Homo sapiens (Human).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
RN [1]
RP SEQUENCE FROM N.A.
RX MEDLINE=92031698; PubMed=1932117;
RA Kobayashi M., Miura M., Asou H., Uyemura K.;
RT "Molecular cloning of cell adhesion molecule L1 from human nervous
RT tissue: a comparison of the primary sequences of L1 molecules of
RT different origin.";
RL Biochim. Biophys. Acta 1090:238-240(1991).
RN [2]
RP SEQUENCE FROM N.A.
RA Rosenthal A., Coutelle O., Drescher B.;
RL Submitted (APR-1994) to the EMBL/GenBank/DDBJ databases.
RN [3]
RP SEQUENCE FROM N.A.
RX MEDLINE=92329299; PubMed=1627459;
RA Reid R.A., Hemperly J.J.;
RT "Variants of human L1 cell adhesion molecule arise through alternate
RT splicing of RNA.";
RL J. Mol. Neurosci. 3:127-135(1992).
RN [4]
RP SEQUENCE OF 353-1176 FROM N.A.
RX MEDLINE=92020233; PubMed=1923824;
RA Rosenthal A., Mackinnon R.N., Jones D.S.C.;
RT "PCR walking from microdissection clone M54 identifies three exons
RT from the human gene for the neural cell adhesion molecule L1
RT (CAM-L1).";
RL Nucleic Acids Res. 19:5395-5401(1991).
RN [5]
RP SEQUENCE OF 332-371 FROM N.A.
RX MEDLINE=90353957; PubMed=2387585;
RA Djabali M., Mattei M.-G., Nguyen C., Roux D., Demengeot J.,
RA Denizot F., Moos M., Schachner M., Goridis C., Jordan B.R.;
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# 



RT "The gene encoding L1, a neural adhesion molecule of the
RT immunoglobulin family, is located on the X chromosome in mouse and
RT man.";
RL Genomics 7:587-593(1990).
RN [6]
RP SEQUENCE OF 1030-1257 FROM N.A.
RX MEDLINE=91132183; PubMed=1993895;
RA Harper J.R., Prince J.T., Healy P.A., Stuart J.K., Nauman S.J.,
RA Stallcup W.B.;
RT "Isolation and sequence of partial cDNA clones of human L1: homology
RT of human and rodent L1 in the cytoplasmic region.";
RL J. Neurochem. 56:797-804(1991).
RN [7]
RP SEQUENCE OF 20-36.
RX MEDLINE=88298876; PubMed=3136168;
RA Wolff J.M., Frank R., Mujoo K., Spiro R.C., Reisfeld R.A.,
RA Rathjen F.G.;
RT "A human brain glycoprotein related to the mouse cell adhesion
RT molecule L1.";
RL J. Biol. Chem. 263:11943-11947(1988).
RN [8]
RP VARIANT HSAS TYR-264.
RX MEDLINE=94004956; PubMed=8401576;
RA Jouet M., Rosenthal A., Macfarlane J., Kenwrick S., Donnai D.;
RT "A missense mutation confirms the L1 defect in X-linked hydrocephalus
RT (HSAS).";
RL Nat. Genet. 4:331-331(1993).
RN [9]
RP VARIANT HSAS/MASA LEU-1194.
RX MEDLINE=95187172; PubMed=7881431;
RA Fransen E., Schrander-Stumpel C., Vits L., Coucke P., van Camp G.,
RA Willems P.J.;
RT "X-linked hydrocephalus and MASA syndrome present in one family are
RT due to a single missense mutation in exon 28 of the L1CAM gene.";
RL Hum. Mol. Genet. 3:2255-2256(1994).
RN [10]
RP VARIANTS HSAS GLN-184 AND ARG-452, AND VARIANT MASA GLN-210.
RX MEDLINE=95004608; PubMed=7920659;
RA Jouet M., Rosenthal A., Armstrong G., Macfarlane J., Stevenson R.,
RA Paterson J., Metzenberg A., Ionasescu V., Temple K., Kenwrick S.;
RT "X-linked spastic paraplegia (SPG1), MASA syndrome and X-linked
RT hydrocephalus result from mutations in the L1 gene.";
RL Nat. Genet. 7:402-407(1994).
RN [11]
RP VARIANTS MASA GLN-210 AND ASN-598.
RX MEDLINE=95004609; PubMed=7920660;
RA Vits L., van Camp G., Coucke P., Fransen E., de Boulle K.,
RA Reyniers E., Korn B., Poustka A., Wilson G., Schrander-Stumpel C.,
RA Winter R.M., Schwartz C., Willems P.J.;
RT "MASA syndrome is due to mutations in the neural cell adhesion gene
RT L1CAM.";
RL Nat. Genet. 7:408-413(1994).
RN [12]
RP VARIANTS HSAS/MASA S-9; S-121; K-309; F-768; L-941 AND C-1070.
RX MEDLINE=95282776; PubMed=7762552;
RA Jouet M., Moncla A., Paterson J., McKeown C., Fryer A., Carpenter N.,
RA Holmberg E., Wadelius C., Kenwrick S.;
RT "New domains of neural cell-adhesion molecule L1 implicated in
RT X-linked hydrocephalus and MASA syndrome.";
RL Am. J. Hum. Genet. 56:1304-1314(1995).
RN [13]
RP VARIANTS HSAS/MASA Q-184; Q-210; Y-264; R-452; N-598 AND L-1194.
RX MEDLINE=96153146; PubMed=8556302;
RA Fransen E., Lemmon V., van Camp G., Vits L., Coucke P., Willems P.J.;
RT "CRASH syndrome: clinical spectrum of corpus callosum hypoplasia,
RT retardation, adducted thumbs, spastic paraparesis and hydrocephalus
RT due to mutations in one single gene, L1.";
RL Eur. J. Hum. Genet. 3:273-284(1995).
RN [14]
RP ERRATUM.
RA Fransen E., Lemmon V., van Camp G., Vits L., Coucke P., Willems P.J.;
RL Eur. J. Hum. Genet. 4:126-126(1996).
RN [15]



RP VARIANTS HSAS/MASA/SPG1 SER-179 AND ARG-370.
RX MEDLINE=96057511; PubMed=7562969;
RA Ruiz J.C., Cuppens H., Legius E., Fryns J.-P., Glover T., Marynen P.,
RA Cassiman J.-J.;
RT "Mutations in L1-CAM in two families with X linked complicated
RT spastic paraplegia, MASA syndrome, and HSAS.";
RL J. Med. Genet. 32:549-552(1995).
RN [16]
RP VARIANTS HSAS CYS-194 AND LEU-240.
RX MEDLINE=97083370; PubMed=8929944;
RA Gu S.-M., Orth U., Veske A., Enders H., Kluender K., Schloesser M.,
RA Engel W., Schwinger E., Gal A.;
RT "Five novel mutations in the L1CAM gene in families with X linked
RT hydrocephalus.";
RL J. Med. Genet. 33:103-106(1996).
RN [17]
RP VARIANTS HSAS Q-184; V-439--T-443 DEL; C-784 AND L-936--L-948 DEL.
RX MEDLINE=97338664; PubMed=9195224;
RA Macfarlane J.R., Du J.-S., Pepys M.E., Ramsden S., Donnai D.,
RA Charlton R., Garrett C., Tolmie J., Yates J.R.W., Berry C., Goudie D.,
RA Moncla A., Lunt P., Hodgson S., Jouet M., Kenwrick S.;
RT "Nine novel L1 CAM mutations in families with X-linked
RT hydrocephalus.";
RL Hum. Mutat. 9:512-518(1997).
RN [18]
RP VARIANTS HSAS/MASA ASP-691; ARG-698 AND PRO-935.
RX MEDLINE=98180721; PubMed=9521424;
RA Du Y.-Z., Srivastava A.K., Schwartz C.E.;
RT "Multiple exon screening using restriction endonuclease
RT fingerprinting (REF): detection of six novel mutations in the L1 cell
RT adhesion molecule (L1CAM) gene.";
RL Hum. Mutat. 11:222-230(1998).
RN [19]
RP VARIANT CRASH PRO-632.
RX MEDLINE=98112489; PubMed=9452110;
RA Vits L., Chitayat D., van Camp G., Holden J.J.A., Fransen E.,
RA Willems P.J.;
RT "Evidence for somatic and germline mosaicism in CRASH syndrome.";
RL Hum. Mutat. Suppl. 1:S284-S287(1998).
RN [20]
RP VARIANTS HSAS/MASA THR-219; ARG-335; CYS-386; CYS-473 AND LEU-1224.
RX MEDLINE=98415726; PubMed=9744477;
RA Saugier-Veber P., Martin C., le Meur N., Lyonnet S., Munnich A.,
RA David A., Henocq A., Heron D., Jonveaux P., Odent S., Manouvrier S.,
RA Moncla A., Morichon N., Philip N., Satge D., Tosi M., Frebourg T.;
RT "Identification of novel L1CAM mutations using fluorescence-assisted
RT mismatch analysis.";
RL Hum. Mutat. 12:259-266(1998).

CC -!- FUNCTION: CELL ADHESION MOLECULE WITH AN IMPORTANT ROLE IN THE
CC     DEVELOPMENT OF THE NERVOUS SYSTEM. INVOLVED IN NEURON-NEURON
CC     ADHESION, NEURITE FASCICULATION, OUTGROWTH OF NEURITES, ETC. BINDS
CC     TO AXONIN ON NEURONS.
CC -!- SUBCELLULAR LOCATION: TYPE I MEMBRANE PROTEIN.
CC -!- ALTERNATIVE PRODUCTS: TWO ISOFORMS IN THE CYTOPLASMIC REGION ARE
CC     PRODUCED BY DIFFERENTIAL SPLICING.
CC -!- DISEASE: DEFECTS IN L1CAM ARE THE CAUSE OF THREE X-LINKED
CC     SYNDROMES. 1: HYDROCEPHALUS OWING TO STENOSIS OF THE AQUEDUCT OF
CC     SYLVIUS (HSAS) CHARACTERIZED BY MENTAL RETARDATION AND ENLARGED
CC     BRAIN VENTRICLES. 2: MASA SYNDROME WHICH IS CHARACTERIZED BY
CC     MENTAL RETARDATION, APHASIA, SHUFFLING GAIT, AND ADDUCTED THUMBS.
CC     HAS AN OVERLAPPING PROFILE OF CLINICAL SIGNS WITH HSAS, BUT WITH A
CC     MILDER PRESENTATION AND A LONGER LIFE EXPECTANCY. 3: SPASTIC
CC     PARAPLEGIA TYPE 1 (SPG1). COLLECTIVELY THESE SYNDROMES ARE ALSO
CC     KNOWN AS CRASH SYNDROME, AN ACRONYM WHICH STANDS FOR CORPUS
CC     CALLOSUM HYPOPLASIA, PSYCHOMOTOR RETARDATION, ADDUCTED THUMBS,
CC     SPASTIC PARAPARESIS, AND HYDROCEPHALUS.
CC -!- DISEASE: DEFECTS IN L1CAM ARE THE CAUSE OF HIRSCHSPRUNG DISEASE
CC     (HSCR).
CC -!- SIMILARITY: CONTAINS 6 IMMUNOGLOBULIN-LIKE C2-TYPE DOMAINS.
CC -!- SIMILARITY: CONTAINS 5 FIBRONECTIN TYPE III-LIKE DOMAINS.
CC -!- DATABASE: NAME=L1CAM; NOTE=L1CAM mutation Web Page;
CC WWW- "http : /^gins . uia . ac .be/dnalab/11 " , 

cc - 
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CC 
CC 
CC 
CC 
CC 
CC 
CC 

DR EMBL; X59847; CAA42508.1, 
DR EMBL; Z29373; CAA82564 .1, 
DR EMBL; M74387; AAA59476.1, 
DR EMBL; X58775; CAA41576 .1, 



Query Match 8.5%; Score 583.5; DB 1; Length 1257; 

Best Local Similarity 22.9%; Pred. No. 1.4e-24; 

Matches 243; Conservative 141; Mismatches 357; Indels 321; Gaps 42; 

•-N 77 



30 PVIIEH-PIDVWSRGSPATLNCGA--KPSTAKITWYKDG— QPVITNREQV" 

III 1:11 :| I I II : I :ll :| l|:: : 
35 PVITEQSPRRLWFPTDDISLKCEASGKPE-VQFRWTRDGVHFKP — REELGVTVYQS 89 

78 SHRIVLDTGSLFLLKVNSGRNGKDSDAGAYYCVASNEHGEVKSNEGSLRLAMLREDFRVR 137 

I :ll : II I III III: I 1:1 : :: I 
90 PH SGSFTITGNNS - - NFAQRFQGIYRCFASNRLGTAMSHE — IRLMAEGAPRW 138 



Qy 138 P RTVQALGGEMAVLECSPPRGFPEPWSWRRDDRELRIQDMPRYTLHSDGNLIIDP 193 

I : I: II II hi ||: : : I I |: | |: :||| 
Db 139 PKETVKPVEVEEGESWLPCNPPPS-AEPLRIYWMNSKILHIKQDERVTMGQNGNLYFAN 197 



I 

Qy 
Db 
Qy 
Db 
Qy 

Db 
Qy 
Db 
Qy 

Db 



194 VDRSDS-GTYQCVANNMVGERVSNPARLSVFEKPKFEQEPKDMTVDV 239 

I II: I II: I :: :l II |: I 
198 VLTSDNHSDYICHAH FPGTRTIIQK EP IDLRVKATNSMI DRKPRLLF 244 

240 GAAVLFDCRVTGDPQPQITWKRRNEPMPVTRAYIAKDNRGLRIERVQP 287 

I :: : I I I I I I : III I |: |:: :l 
245 PTNSSSHLVALQGQPLVLECIAEGFPTPTIRWLRPSGPMPADRVTYQNHNKTLQLLKVGE 304 

288 SDEGEYVCYARNPAGTLEASAHLRVQAPPSFQTRPADQSVPAGGTATFECTLVGQPSPAY 347 

hill I I I I: : :: |:| I : II III :| : hi I 
305 EDDGEYRCLAENSLGSARHAYYVTVEAAPYWLHKPQSHLYGPGETARLDCQVQGRPQPEV 364 

348 FWSKEGQQDLLFPSYVSADGRTKVSPTGTLTIEEVRQVDEGAYVCAGMNSAGSSLSRAAL 407 

II : :: | : :: | | : h I 

355 TWRING — IPVEEIARDQRYRIQ- RGALILSNVQPSDTMVTQCEARNRHGLLLANAYI 419 

408 KATFETKGRVQKKKSKMGKQKQKNVQSIIRYLISAVTGNTPAKPPPTIEHGHQNQTLMV- 466 

h: III Ml I 

420 YW QLPARILTA DNQTYMAV 439 

467 -GSSAILPCQASGKPTPGISWL-RDGLPIDITDSRISQHSTGSLHIADLRKPDTGVYTCI 524 

hi I hi I I I : II II : : I I :: hi I lh III I h 
440 QGSTAYLLCKAFGAPVPSVQWLDEDGTTV-LQDERFFPYANGTLGIRDLQANDTGRYFCL 498 

525 ARNEDGESTWSASLTVEDHTSNAQFVRMP DPSNFPS 560 

I h I hi hi I I I III II 

499 AANDQNNVTIMAHLRVRDATQITQGPRSTIERRGSRVTFTCQASFDPSLQPSITWRGDGR 558 



-SPTQPIIVN" 
I I ::! 



■ 569 



570 VTDTEVELHWNAPSTSGAGPITGYIIQYYSPDLG-QTWFN— IPDYVASTEY 618 

:| ::l : I :h II I h: :: : |:: :| II 
619 VLSDLHLLTQSQVRVSW-SPAEDHNAPIERYDIEFEDREMAPERWYSLGRVPGNQTSTTL 677 

619 RIRGLRPSHSYMFVIRAENERGIGTPSVSSALVTTSRPAAQVALSDRNRMD 669 

: I I I I : I I: II II I I I : | ;|| :| 
678 R - ' 'LSPYVHYTFRVTAINKYGPGEPSPVSETWTPE AAPERNPVDVRGEGNETT 729 

670 -MAIAEKRLT SEQLI 683 

I I I I II:: 
730 NMVITWKPLRWMDWNAPQVQYRVQWRPQGTRGPWQEQIVSDPFLWSNTSTFVPYEIKVQ 789 



Qy 684 RLEEVKTINSTAVRLFWRKRRLEEL- - - IDGYY 713 

:|| :: :lhll : I: I :: : lh 
Db 790 AVNSQGRGPEPQVTIGYSGEDYPQAIPEIEGIEILNSSAVLVRWRPVDLAQVRGHLRGYN 849 

Qy 714 IK-WR-GPPRTNDNQYVN-— VTSPSTENYWSNLMPFTNYEFFVIPYHSGVHSIHGAP 767 

: II I I : :::: I :| : ::| I I::: II I : :| 
Db 850 VTYWREGSQRKHSRRHIHRDHVWPANTTSVILSGLRPYSS YHLEVQAFNGRG 902 

Qy 768 SNSMDVLTAEAPPSLP--PEDVRIRMLNLTTLRISWRAPRADGINGILRGFQIVIVGQAP 825 

I | ;| :| || ; : : hi : h I : Ihl h : 
Db 903 SGPASEFTFSTPEGVPGHPEALHLECQSNTSLLLRWQPPLSH-NGVLTGYVLSYHPLDE 960 

Qy 826 NNNRNITTNERAASV— TLFHLVTGMTYKIRVAARSNGGVG 864 

:: I I' : II : h :: I : I I 
Db 961 GGKGQLSFNLRDPELRTHNLTDLSPHLRYRFQLQATTKEGPG 1002 



RESULT 11
AXO1_HUMAN
ID AXO1_HUMAN STANDARD; PRT; 1040 AA.
AC Q02246;
DT 01-JUL-1993 (Rel. 26, Created)
DT 01-JUL-1993 (Rel. 26, Last sequence update)
DT 15-JUL-1999 (Rel. 38, Last annotation update)
DE AXONIN-1 PRECURSOR (AXONAL GLYCOPROTEIN TAG-1) (TRANSIENT AXONAL
DE GLYCOPROTEIN 1).
GN TAX1 OR TAG1.
OS Homo sapiens (Human).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
RN [1]
RP SEQUENCE FROM N.A.
RC TISSUE=BRAIN;
RX MEDLINE=93145965; PubMed=8425542;
RA Hasler T.H., Rader C., Stoeckli E.T., Zuellig R.A., Sonderegger P.;
RT "cDNA cloning, structural features, and eucaryotic expression of
RT human TAG-1/axonin-1.";
RL Eur. J. Biochem. 211:329-339(1993).
RN [2]
RP SEQUENCE FROM N.A.
RC TISSUE=BRAIN;
RX MEDLINE=94140354; PubMed=8307567;
RA Tsiotra C.P., Karagogeos D., Theodorakis K., Michaelidis M.T.,
RA Modi W.S., Furley J.A., Jessel M.T., Papamatheakis J.;
RT "Isolation of the cDNA and chromosomal localization of the gene
RT (TAX1) encoding the human axonal glycoprotein TAG-1.";
RL Genomics 18:562-567(1993).
CC -!- FUNCTION: MAY PLAY A ROLE IN THE INITIAL GROWTH AND GUIDANCE OF
CC     AXONS. MAY BE INVOLVED IN CELL ADHESION.
CC -!- SUBCELLULAR LOCATION: ATTACHED TO THE NEURONAL MEMBRANE BY A
CC     GPI-ANCHOR AND IS ALSO RELEASED FROM NEURONS.
CC -!- SIMILARITY: CONTAINS 6 IMMUNOGLOBULIN-LIKE C2-TYPE DOMAINS.
CC -!- SIMILARITY: CONTAINS 4 FIBRONECTIN TYPE III-LIKE DOMAINS.
CC
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC
DR EMBL; X68274; CAA48335.1; -.
DR EMBL; X67734; CAA47963.1; -.
DR PIR; S28830; S28830.
DR MIM; 190197; -.
DR INTERPRO; IPR001777; -.
DR INTERPRO; IPR003006; -.
DR PFAM; PF00041; fn3; 4.
DR PFAM; PF00047; ig; 6.
KW Immunoglobulin domain; Glycoprotein; Signal; GPI-anchor;
KW Cell adhesion; Repeat.

FT   SIGNAL        1     28
FT   CHAIN        29   1012       AXONIN-1.
FT   PROPEP     1013   1040       REMOVED IN MATURE FORM (POTENTIAL).
FT   DOMAIN       54    118       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      148    216       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      254    313       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      341    402       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      433    495       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      523    594       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      606    612       GLY/PRO-RICH.
FT   DOMAIN      611    706       FIBRONECTIN TYPE-III.
FT   DOMAIN      714    809       FIBRONECTIN TYPE-III.
FT   DOMAIN      816    908       FIBRONECTIN TYPE-III.
FT   DOMAIN      917   1003       FIBRONECTIN TYPE-III.
FT   SITE        794    796       CELL ATTACHMENT SITE (BY SIMILARITY).
FT   CARBOHYD     76     76       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    198    198       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    204    204       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    461    461       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    477    477       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    498    498       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    525    525       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    830    830       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    918    918       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    940    940       N-LINKED (GLCNAC...) (POTENTIAL).
FT   LIPID      1012   1012       GPI-ANCHOR (POTENTIAL).
SQ   SEQUENCE   1040 AA;  113393 MW;  254E78DD3C28EFB6 CRC64;



Query Match 8.5%; Score 582.5; DB 1; Length 1040; 

Best Local Similarity 26.7%; Pred. No. 1.2e-24;

Matches 220; Conservative 107; Mismatches 349; Indels 149; Gaps 32; 

Qy 79 HRIVLDTGSLFLLKVNSGKNGKDSDAGAYYCVASNEHGEVKSNEGSLKLAMLR— EDFR 135 

I : IN:: : |: ! I | |:|:: : : | | | | 

Db 183 HFVSQTTGNLY I ARTNA SDLGNYSCLATS - HMDFSTKSVFSKFAQLNLAAEDTR 235 

Qy 136 V RPRTVQALGGEMAVLECSPPRGFPEPWSWRKDDKELRIQDMPRYTLHSDG 187 

: I II I: III MIMIIII |::| :: 

Db 236 LFAPS IKARFPAET YALVGQQVTLECF ■ AFGNPVPRIKWRKVDGSL — SPQWTT - AEP 289 

Qy 188 NLIIDPVDRSDSGTYQCVANNMVGERVSNPARLSVFEKPKFEQEPKDMTVDVGAAVLFDC 247 

I I I I II: MM: |M :h: : I hh : : I 

Db 290 TLQIPSVSFEDEGTYECEAENSKG-RDTVQGRIIVQAQPEWLKVISDTEADIGSNLRWGC 348 

Qy 248 RVTGDPQPQITWKRKNEPMPVTRAY IAKDNR GLRIERVQPSDEGEYVCYARNPA 301 

L I 1:1 : II II :| II II :: I I I II I 

|Db 349 AAAGKPRPTVRWLRNGEP LASQNRVEVLAGDLRFSKLSLEDSGMYQCVAENKH 401 

Qy 302 GTLEASAHLRVQA-PPSFQTKPADQSVPA-GGTATFECTLVGQPSPAYFWSKEGQQDLL 358 

II: III I III II: I : :ll II I I III I : I: 
Db 402 GTIYASAELAVQALAPDFRLNPVRRLIPAARGGEILIPCQPRAAPKAWLWSK-GTEILV 460 

Qy 359 FPSYVSADGRTKVSPTGTLTIEEVRQVDEGAYVCAGMNSAGSSLSKAALKATFETKGRVQ 418 

I I 1:1 MM :: Ml I I M I I II 
Db 461 NSS RVTVTPDGTLIIRNISRSDEGKYTCFAENFMGKANSTGILSVRDATK— - 510 

Qy 419 KKKSKMGKQKQKNVQSIIKYLISAVTGNTPAKPPPTIEHGHQNQTLMVGSSAILPCQASG 478 

I : : :| : II || 
Db 511 ITLAPSSADINLGDNLTLQCHASH 534 

Qy 479 KPTPGI-SWLRDGLPIDITD SRISQHST - GS LHI ADLKKPDTGVYTC IAKNEDG 530 

II : :| I 'III I : III:: I |||:|: 

Db 535 DPTMDLTFTWTLDDFPIDFDKPGGHYRRTNVKETIGDLTILNAQLRHGGKYTCMAQ 590 



Qy 531 ESTWSASLTVEDHTSNAQFVRMPDPSNFPSSPTQPIIVNVTDTEVELHWNAPSTSGAGPI 590 





IN 1 1 : 1 1 1 :: :: || ::| |: 1 


FT 


SIGNAL 


1 


19 




Db 


591 TWDSASKEATVLVRGP— PGPPGGVWRDIGDTTIQLSWSR-GFDNHSPI 638 


FT 


CHAIN 


20 


1260 


NEURAL CELL ADHESION MOLECULE Ll 






FT 


DOMAIN 


20 


1123 


EXTRACELLULAR (POTENTIAL). 


Qy 


591 TGYIIQYYSPDLGQTWFNIPDYVASTE YRIKGLKPSHSYMFVIRAENEKGIGTPS 645 


FT 


TRANSMEM 


1124 


1146 


POTENTIAL, 




1 :| :! 1: 1 : |: I :: III II M 1 1 1 II 


FT 


DOMAIN 


1147 


1260 


CYTOPLASMIC (POTENTIAL) , 


Db 


639 AKYTtQARTPPAGK-WKQVRTNPANIEGNAETAQVLGLTPWMDYEFRVIASNILGTGEPS 697 


FT 


DOMAIN 


50 


120 


IG-LIKE C2-TYPE DOMAIN. 






FT 


DOMAIN 


15.0 


215 


IG-LIKE C2-TYPE DOMAIN. 


Qy 


646 VSSALVTTSKPAAQVALSDKNKMDMAIAEKRLTSEQLIKLEEVKTINSTAVRLFWKKRKL 705 


FT 


DOMAIN 


256 


318 


IG-LIKE C2-TYPE DOMAIN. 



I: : I : I II I : II 
Db 698 GPSSKIRTRE AAPSVAPSGLSGGGGAPGE 



: I I : :: 
--VNWTPMSREYQNG- 741 



Qy 706 EELID- -GYYIKWRGPPRTNDNQYVNVTSPSTENYWSN- -IjMPFTNYEFFVIPYHSGVH 761 

I \\ : :| I: I I : :| II : M :| : |: 
Db 742 — -DGFGYLLSFRRQGSTH-WQTARVPGADAQYFVYSNESVRPYTPFEVKIRSYN--R 793 

Qy 762 SIHGAPSNSMDVLTAEAPPSLPPEDVRIRMLNLTTLRISWKAPKADGINGILKGFQIVI- 820 

I I : I : 1 1 Ml I : :: : : ::|: I :|||l |::| 
Db 794 RGDGPESLTALVYSAEEEPRVAPTKVWAKGVSSSEMNVTWE-PVQQDMNGILLGYEIRYW 852 

Qy 821 -VGQAPNNNRNITTNERAASVTLFHLVTGMTYKIRVAARSNGGVG 864 

I : I 1:1 I : I M I I 

Db 853 KAGDKEAAADRVRTAGLDTSARVSGLHPNTKYHVTVRAYNRAGTG 897 



RESULT 12
CAML_MOUSE
ID CAML_MOUSE STANDARD; PRT; 1260 AA.
AC P11627;
DT 01-OCT-1989 (Rel. 12, Created)
DT 01-OCT-1989 (Rel. 12, Last sequence update)
DT 01-OCT-2000 (Rel. 40, Last annotation update)
DE NEURAL CELL ADHESION MOLECULE L1 PRECURSOR (N-CAM L1).
GN L1CAM OR CAML1.
OS Mus musculus (Mouse).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
RN [1]
RP SEQUENCE FROM N.A., AND PARTIAL SEQUENCE.
RC TISSUE=BRAIN;
RX MEDLINE=88318924; PubMed=3412448;
RA Moos M., Tacke R., Scherer H., Teplow D., Frueh K., Schachner M.;
RT "Neural adhesion molecule L1 as a member of the immunoglobulin
RT superfamily with binding domains similar to fibronectin.";
RL Nature 334:701-703(1988).

-!- FUNCTION: CELL ADHESION MOLECULE WITH AN IMPORTANT ROLE IN THE 
DEVELOPMENT OF THE NERVOUS SYSTEM, INVOLVED IN NEURON -NEURON 
ADHESION, NEURITE FASCICULATION, OUTGROWTH OF NEURITES, ETC. BINDS 
TO AXONIN ON NEURONS. 
-!- SUBCELLULAR LOCATION: TYPE I MEMBRANE PROTEIN. 
-I- ALTERNATIVE . PRODUCTS : TWO ISOFORMS IN THE CYTOPLASMIC REGION ARE 

PRODUCED BY DIFFERENTIAL SPLICING (BY SIMILARITY) . 
-!- SIMILARITY: CONTAINS 6 IMMUNOGLOBULIN-LIKE C2-TYPE DOMAINS. 
-!- SIMILARITY: CONTAINS 5 FIBRONECTIN TYPE III-LIKE DOMAINS. 



This SWISS-PROT entry is copyright. It is produced through a collaboration 
between the Swiss Institute of Bioinformatics and the EMBL outstation - 
the European Bioinformatics Institute. There are no restrictions on its 
use by non-profit institutions as long as its content is in no way 
modified and this statement is not removed. Usage by and for commercial 
entities requires a license agreement (See http://wwv.isb-sib.ch/announce/ 
or send an email to licenseSisb-sib.ch), 

EMBL; X12875; CAA31368.1; -.
PIR; S05479; S05479.
HSSP; P20241; 1CFB.
MGD; MGI:96721; L1CAM.
INTERPRO; IPR001777; -.
INTERPRO; IPR003006; -.
PFAM; PF00041; fn3; 4.
PFAM; PF00047; ig; 6.
PRINTS; PR00014; FNTYPEIII. 

Cell adhesion; Glycoprotein; Transmembrane; Repeat; Brain;
FT   DOMAIN      346    410       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      440    503       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      531    599       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      827    896       FIBRONECTIN TYPE-III.
FT   DOMAIN      932    994       FIBRONECTIN TYPE-III.
FT   DOMAIN     1032   1094       FIBRONECTIN TYPE-III.
FT   SITE        553    555       CELL ATTACHMENT SITE (POTENTIAL).
FT   SITE        562    564       CELL ATTACHMENT SITE (POTENTIAL).
FT   CARBOHYD    100    100       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    202    202       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    246    246       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    293    293       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    432    432       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    478    478       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    489    489       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    504    504       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    587    587       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    670    670       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    725    725       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    776    776       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    824    824       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    848    848       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    875    875       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    968    968       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    978    978       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD   1022   1022       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD   1030   1030       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD   1073   1073       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD   1107   1107       N-LINKED (GLCNAC...) (POTENTIAL).
FT   VARSPLIC   1180   1183       MISSING (IN SHORT ISOFORM)
FT                                (BY SIMILARITY).
SQ   SEQUENCE   1260 AA;  140968 MW;  22BE57001CB2A538 CRC64;



Query Match 8.3%; Score 567.5; DB 1; Length 1260;
Best Local Similarity 21.8%; Pred. No. 1e-23;
Matches 271; Conservative 173; Mismatches 470; Indels 327; Gaps 51;

Qy 30 PVIIEH-PIDWVSRGSPftTLNCGAK-PSTAKITWYKDG — QPVITNKEQVNSHRIVLD 84 

Ml I I :li :l II: I Ml. :| ||:: :|: 

Db 35 PVITEQSPRRLWFPTDDISLKCEARGRPQVEFRWTKDGIHFKP-— KEELG--VWH 87 

Qy 85 ---TGSLFLLKVNSGKNGKDSDA----GAYYCVASNEHGEVKSNEGSLKLAMLREDFRV 136

:!l : I :| I III III: if |:| : :: I 
Db 88 EAPYSGSFTI EGNNSFAQRFQG I YRC YASNKLGTAMSHE IQLVAEGAPK 136 

Qy 137 RP— -RTVQALGGEMAVLECSPPRGFPEPWSWRKDDKELRIQDMPRYTLHSDGNLIID 192 

I : I: II II 1:11 1:1:1'!: I :: :|:| 
Db 137 WPKETVKPVEVEEGESWLPCNPPPSAAPPRIYW-MNSKIFDIKQDERVSMGQNGDLYFA 195 

Qy 193 PVDRSDS GT YQCVANNMYGERVSNPARLSVFEKPK -- FEQEPKDMTVDV 239

f III: || ; : || | : ||: | | : 

Db 196 NVLTSDNHSDYICNAHFPGTRTIIQKEPIDLRV-KPTNSMIDRKPRLLFPTNSSSRLVAL 254 

Qy 240 -GAAVLFDCRVTGDPQPQITWKRKNEPMPVTRAYIAKDNRGLRIERVQPSDEGEYVCYAR 298 

I ::: :| I I I I I ::||l I |: |:: I MM I I 
Db 255 QGQSLILECIAEGFPTPTIKWLHPSDPMPTDRVIYQNHNKTLQLLNVGEEDDGEYTCLAE 314 

Qy 299 NPAGTLEASAHLRVQAPPSFQTKPADQSVPAGGTATFECTLVGQPSPAYFWSKEGQQDLL 358 

II: : :: hi I : II I II :| : |:| I I I : 
Db 315 NSLGSARHAYYVTVEAAPYWLQKPQSHLYGPGETARLDCQVQGRPQPEITWRING— -M 370 

Qy 359 FPSYVSADGRTKVSPTGTLTIEEVRQVDEGAYVCAGMNSAGSSLSKAALKATFETKGRVQ 418 

1:1::: hi : |: I I I I |: I : : h 
Db 371 SMETVNKDQKYRIE-QGSLILSNVQPTDTMVTQCEARNQHGLLLANAYIYW-QLPARIL 428 

Qy 419 KKKSKMGKQKQKNVQSIIKYLISAVTGNTPAKPPPTIEHGHQNQTLMV--GSSAILPCQA 476 

I III I Ihl I 'hi 

Db 429 TK DNQTYMAVEGSTAYLLCKA 449 

Qy 477 SGKPTPGISWLRDGLPIDITDSRISQHSTGSLHIADLKKPDTGVYTCIAKNEDGESTWSA 536 

I I I : II : : I I : : h I I 1 1 : 1 1 1 I I I I : II 
Db 450 FGAPVPSVQWLDEEGTTVLQDERFFPYANGTLSIRDLQANDTGRYFCQAANDQNNVTILA 509 



Qy 537 SLTVEDHTSNAQFVRMP DPS 556 

:! h: I- ;'l I III 
Db 510 NLQVKEATQITQGPRSAIEKKGARVTFTCQASFDPSLQASITWRGDGRDLQERGDSDKYF 569 

Qy 557 NFP SSPTQPIIVN VTDTE 574 

I: II :;l : :: 

Db 570 IEDGKLVIQSLDYSDQGNYSCVASTELDEVESRAQLLWGSPGPVPHLELSDRHLLKQSQ 629 

Qy 575 VELHWNAPSTSGAGPITGYIIQYYSPDLG-QTWFN— IPDYVASTEYRIKGLKPSHSYM 630 

I I I :h. II I h: '•: : lh :| II : I I I 
Db 630 VHLSW-SPAEDHNSPIEKYDIEFEDKEMAPEKWFSLGKVPGNQTSTTLK--LSPYVHYT 685 

Qy 631 FVIRAENEKGIGTPS-VSSALVTTSKPAAQVALSDKNKMDMAIAEKRLTSEQLIKLEEVK 689 

I : I h I I II II ::ll I :l! :|: I h :| : :: 

Db 686 FRVTAINKYGPGEPSPVSESWTPE AAPEKNPVDVR - GEGNETNNMVITWKPLR 738 

Qy 690 TINSTAVRLMKRKLEELIDGYYIKWRGPPRTNDNQYVNVTSPSTENYWSNLMPFTNY 749 

:: 'I ::: I "II : : h I Mil I I 
Db 739 WMD yWNAPQIQ YRVQWRPQGKQETWRKQTVSDPF- - -LWSNTSTFVPY 783 

Qy 750 EFFVIPYHSGVHSIHGAPSNSMDV-LTAEAPPSLPPEDVRIRMLNLTTLRISWKAPKADG 808 

II h: I : : : I I : II I : I :h : I: 

Db 784 EIKV — QAVNNQGKGPEPQVTIGYSGEDYPQVSPELEDITIFNSSTVLVRWRPVDLAQ 839 

Qy 809 INGILKGFQIVI— VGQAPNNNRNITTNE RAASVTLFHLVTGMTYKIRVAARSN 860 

: I III: :: I :: hi : I I I :| : I I : 

Db 840 VKGHLKGYNVTYWWKGSQRKHSKRHIHKSHIWPANTTSAILSGLRPYSSYHVEVQAFNG 899 

Qy 861 GGVGVSH"- GTSEVIMNQDTLEKHLAAQQENESFLYGLINKSHVPVIVIVA 909 

1:1 : • ' II: I! I : h II h 

Db 900 RGLGPASEWTFSTPEGVPGHPEAL HLECQSDTSLLLHWQPPLSHNGVLT - • ■ 948 

Qy 910 ILIIFWIIIAYCYWFNSRNSDGKDRSFIKINDGSVHMASNNLWDVAQNPNQNPMYNTAG 969 

■I : : h: I ::| : ::|| I II 
Db 949 \ -GYLLSYHPVEGESKEQLFFNLSD- -PELRTHNL TNLNP 984 

Qy 970 RMTMNNRNGQALYSLTPNAQDFFNNCDDYSGTMHRPGSEHHYHYAQLTGGPGNA 1023 

: I : I I MM I 

Db 985 DLQYRFQLQATTQQ GGPGEAIVREGG 1010 

Qy 1024 -MSTF: - - - YGNQYHDDPSPYATTTLVLSNQQPAWLNDKMLRAPAMPTNPVPP ■ - EPPAR 1076 

I: I -:ll I: : I . I : : : hi I I :| : 

Db 1011 TMALFGRPDFGNISATAGENYSWSWVPRRGQCNFRFHILFR-ALPEGKVSPDHQPQPQ 1068 

Qy 1077 YADHTAGRRSRSSRASDGRGTLN GGLHH— RTSGS 1109 

I : ,:: : I : :: III ;|:|: 
Db 1069 YVSYNQSSYTQWNLQPDTKYEIHLIKEKVLLHHLDVKTNGT 1109 
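
The percentage on the "Best Local Similarity" line appears to be the number of exact matches divided by the total number of aligned columns (matches, conservative substitutions, mismatches and indels). This relationship is not stated in the report itself; it is an inference that happens to reproduce the printed figures. A minimal Python sketch under that assumption, with a hypothetical helper name:

```python
def best_local_similarity(matches, conservative, mismatches, indels):
    """Percent of aligned columns that are exact matches.

    Assumed definition (not documented in the report): the denominator is
    the sum of all four column counts printed on the "Matches" line.
    """
    columns = matches + conservative + mismatches + indels
    return round(100.0 * matches / columns, 1)


# Counts copied from the Result 12 (CAML_MOUSE) summary lines.
print(best_local_similarity(271, 173, 470, 327))
```

The same arithmetic can be used as a sanity check when a digit in one of the counts has been lost to scanning errors.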



RESULT 13 
AXO1_CHICK
ID AXO1_CHICK STANDARD; PRT; 1036 AA.
AC
DT 01-DEC-1992 (Rel. 24, Created)
DT 01-DEC-1992 (Rel. 24, Last sequence update)
DT 15-JUL-1999 (Rel. 38, Last annotation update)
DE AXONIN-1 PRECURSOR.
OS Gallus gallus (Chicken).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Archosauria; Aves; Neognathae; Galliformes; Phasianidae; Phasianinae;
OC Gallus.
RN [1]
RP SEQUENCE FROM N.A., AND PARTIAL SEQUENCE.
RC TISSUE=BRAIN;
RX MEDLINE=92174898; PubMed=1311675;
RA Zuellig R.A., Rader C., Schroeder A., Kalousek M.B.,
RA von Bohlen Und Halbach F., Osterwalder T., Inan C., Stoeckli E.T.,
RA Affolter H.-U., Fritz A., Hafen E., Sonderegger P.;
RT "The axonally secreted cell adhesion molecule, axonin-1. Primary
RT structure, immunoglobulin-like and fibronectin-type-III-like domains
RT and glycosyl-phosphatidylinositol anchorage.";
RL Eur. J. Biochem. 204:453-463(1992).
CC -!- FUNCTION: AXON-ASSOCIATED CELL ADHESION MOLECULE (AXCAM) WHICH
CC PROMOTES NEURITE OUTGROWTH BY INTERACTION WITH THE AXCAM L1 (G4)
CC OF NEURITIC MEMBRANE.
CC -!- SUBCELLULAR LOCATION: ATTACHED TO THE NEURONAL MEMBRANE BY A
CC GPI-ANCHOR.
CC -!- SIMILARITY: CONTAINS 6 IMMUNOGLOBULIN-LIKE C2-TYPE DOMAINS.
CC -!- SIMILARITY: CONTAINS 4 FIBRONECTIN TYPE III-LIKE DOMAINS.
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
DR EMBL; X63101; CAA44815.1
DR PIR; S22128; S22128.
DR PIR; S22383; S22383.
DR HSSP; P56276; 1TLK.
DR INTERPRO; IPR001777; -.
DR INTERPRO; IPR003006; -.
DR PFAM; PF00041; fn3; 3.
DR PFAM; PF00047; ig; 6.
KW Immunoglobulin domain; Glycoprotein; Signal; GPI-anchor;
KW Cell adhesion; Repeat.




FT   SIGNAL        1     23       OR 25 (POTENTIAL).
FT   CHAIN        24   1036       AXONIN-1.
FT   PROPEP        ?   1036       REMOVED IN MATURE FORM.
FT   DOMAIN       49    113       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      143    211       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      249    308       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      336    397       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      428    490       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      518    589       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      599    608       HINGE (POTENTIAL).
FT   DOMAIN      601    607       GLY/PRO-RICH.
FT   DOMAIN      608    709       FIBRONECTIN TYPE-III.
FT   DOMAIN      710    811       FIBRONECTIN TYPE-III.
FT   DOMAIN      812    912       FIBRONECTIN TYPE-III.
FT   DOMAIN      913   1009       FIBRONECTIN TYPE-III.
FT   MOD_RES     724    724       BLOCKED.
FT   CARBOHYD     71     71       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    199    199       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    456    456       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    472    472       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    493    493       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    520    520       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    770    770       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    900    900       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    914    914       N-LINKED (GLCNAC...) (POTENTIAL).
SQ   SEQUENCE   1036 AA;  113301 MW;  08B80143BE779794 CRC64;



Query Match 8.2%; Score 562.5; DB 1; Length 1036; 

Best Local Similarity 23.3%; Pred. No. 1.5e-23; 

Matches 229; Conservative 113; Mismatches 371; Indels 271; Gaps 34; 

Qy 30 PVI IEHPIDVWSRGS • - -PATLNCGAK • - - PSTAKITWYKDGQPVITNKEQVNSHRIVL 83 

II I I : II III I: hi : I :| I : I I 
Db 32 PVFEEQPAHTLFPEGSAEEKVTLTCRARANPPATYR-WKMNG-— TELKMGPDSRYRL 85 

Qy 84 DTGSLFLLKVNSGKNGKDSDAGAYYCVASNEHGEVKSNEGSLKLAMLRE 132 

II: I llhl H:l I I I I II: |:| 

Db 86 VAGDLVI SNPVKAKDAGSYQCVATNARGTWSREASLRFGFLQEFSAEERDPVKI 140 

Qy 133 132 

Db 141 TEGWGVMFTCSPPPHYPALSYRWLLNEFPNFIPADGRRFVSQTTGNLYIAKTEASDLGNY 200 

Qy 133 DFRVR PRTVQALGGEMAVLECSPPR 157 

II : III hi III 

Db 201 SCFATSHIDFITKSVFSKFSQLSLAAEDARQYAPSIKAKFPADTYALTGQMVTLECF-AF 259 



Qy 158 GFPEPWSWRKDDKELRIQDMPRYTLHSDGNLIIDPVDRSDSGTYQCVANNMVGERVSNP 217 

111:1111 I I: I I II I llhl | |: I I : 

Db 260 GNPVPQIKWRKLD GSQTSKWLSSEPLLHIQNVDFEDEGTYECEAENIKG-RDTYQ 313 

Qy 218 ARLSVFEKPKFEQEPKDMTVDVGAAVLFDCRVTGDPQPQITWKRKNEPMPVTRAYIAKDN 277 

I: : :l : I hh : : I : I I : I : I I : I :| I 

Db 314 GRIIIHAQPDWLDVITDTEADIGSDLRWSCVASGKPRPAVRWLRDGQP LASQN 366 

Qy 278 R GLRIERVQPSDEGEYVCYARNPAGTLEASAHLRVQA-PPSFQTKPADQSVPA- 329 

I II' :: I I I I I I II: III I III I |: I : :|| 

Db 367 RIEVSGGELRFSKLVLEDSGMYQCVAENKHGTVYASAELTVQALAPDFRLNPVKRLIPAA 426 

Qy 330 -GGTATFECTLVGQPSPAYFWSKEGQQDLLFPSYVSADGRTKVSPTGTLTIEEVRQVDEG 388 

I I ■ I 1:1 : :: I :: III :: : : III 

Db 427 RSGKVI IPCQPRAAPKATVLWTKGTE LLTNSSRVTITADGTLILQNISKSDEG 479 

Qy 389 AYVCAGMNSAGSSLSKAALKATFETKGRVQKKKSKMGKQKQKNVQSIIKYLISAVTGNTP 448 

I I M : I I II 

Db 480 KYTCFAENFMGKANSTGILSVRDATK 505 

Qy 449 AKPPPTIEHGHQNQTLMVGSSAILPCQASGKPTPGI--SWLRDGLPIDITDS RIS 501 

I • : II : I I II II : :| I III: I I I 

Db 506 ITLAPSSADINVGENLTLQCHASHDPTMDLTFTWSLDDFPIDLDKSEGHYRRAS 559 

Qy 502 -QHSTGSLHIADLKKPDTGVYTCIAKNEDGESTWSASLTVEDHTSNAQFVRMPDPSNFPS 560 

: : I I I- : :| III |: II Ml : : : I I 

Db 560 VKEAVGDLAIVNAQLKHSGRYTCTAQ TWDSTSESATLTVRGP- - -PG 604 

Qy 561 SPTQPIIVNVTDTEVELHWNAPSTSGAGPITGYIIQYYSPDLGQTWFNIPDYVASTE- - - 617 

I :: :: II hi |: II I h : I I : : I 

Db 605 PPGGVWRDIGDTTVQLSWSR-GFDNHSPIARYSIEARTL-LSNKWKQMRTNPVNIEGNA 662 

Qy 618 "YRIKGLKPSHSYMFVIRAENEKGIGTPSVSSALVTTSKPAAQVALSDKNKMDMAIAEK 675 

:: I I -I I : I I hi II: h : I : III I : 

Db 663 ETAQWNLIPWMDYEFRVLASNILGVGEPSLPSSKIRTKEAAPTVAPS GLGGG 715 



Qy 676 RLTSEQLIKLEEVKTINSTAVRLFWKKRKLEELID GYYIKWRGPPRTNDNQYV 728

: 1 1 III II II : :| : :: 

Db 716 GGAPNELI- -- -- INWTPT LRDYQNGDGFGYILSFR-- KKGTQGWL 754

Qy 729 NVTSPSTE- -NYWSN- -LMPFTNYEFFVIPYHSGVHSIHGAPSNSMDVLTAEAPPSLPP 784

I I :ll I : hi :| : h I I : I :|| I : I 
Db 755 TARVPHAESLHYVYRNESIGPYTPFEVKIKAYN--RKGEGPESLTAIVYSAEEEPKVAP 811

Qy 785 EDVRIRMLNLTTLRISWKAPKADGINGILKGFQIVIVGQAPNNNRNITTNERAASVTLFH 844

I : : : : :lh : : hi |::| : I I : 

Db 812 FRVTAKAVLSSEMDVSWEPVEQGDMTGVLLGYEIRY - -WKDGDKEEAADRVRT AGLVTSA 869

Qy 845 LVTGMT -- YKIRVAARSNGGVG 864

III: I 1 : II : II 
Db 870 HVTGLNPNT K YHVSVRAYNRAGAG 893



RESULT 14 
NRCA_CHICK
ID NRCA_CHICK STANDARD; PRT; 1284 AA.
AC P35331;
DT 01-FEB-1994 (Rel. 28, Created)
DT 01-FEB-1994 (Rel. 28, Last sequence update)
DT 15-JUL-1999 (Rel. 38, Last annotation update)
DE NG-CAM RELATED CELL ADHESION MOLECULE PRECURSOR (NR-CAM) (BRAVO).
OS Gallus gallus (Chicken).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Archosauria; Aves; Neognathae; Galliformes; Phasianidae; Phasianinae;
OC Gallus.

[1]

SEQUENCE FROM N.A., AND SEQUENCE OF 25-52; 178-184 AND 581-594.
STRAIN=WHITE LEGHORN; TISSUE=EMBRYONIC BRAIN;
MEDLINE=91258407; PubMed=2045418;

Grumet M., Mauro V., Burgoon M.P., Edelman G.M., Cunningham B.A.;
"Structure of a new nervous system glycoprotein, Nr-CAM, and its
relationship to subgroups of neural cell adhesion molecules.";



Mon Jan 22 13:04:35 2001

us-09-540-245a-17.rsp

J. Cell Biol. 113:1399-1412(1991).
[2] 

SEQUENCE OF 25-1284 FROM N.A., AND PARTIAL SEQUENCE.
TISSUE=EMBRYONIC BRAIN, AND RETINA;
MEDLINE=92381110; PubMed=1512296;

Kayyem J.F., Roman J.M., de la Rosa E.J., Schwarz U., Dreyer W.J.;
"Bravo/Nr-CAM is closely related to the cell adhesion molecules L1
and Ng-CAM and has a similar heterodimer structure.";
J. Cell Biol. 118:1259-1270(1992).

-!- FUNCTION: THIS PROTEIN IS A CELL ADHESION MOLECULE INVOLVED IN
NEURON-NEURON ADHESION, NEURITE FASCICULATION, OUTGROWTH OF
NEURITES, ETC. SPECIFICALLY INVOLVED IN THE DEVELOPMENT OF OPTIC
FIBRES IN THE RETINA.

-!- SUBUNIT: HETERODIMER, COMPOSED OF AN ALPHA AND A BETA CHAIN.

-!- SUBCELLULAR LOCATION: TYPE I MEMBRANE PROTEIN.

-!- ALTERNATIVE PRODUCTS: AT LEAST 5 ISOFORMS ARE PRODUCED BY
ALTERNATIVE SPLICING.

-!- TISSUE SPECIFICITY: RETINA AND DEVELOPING BRAIN.

-!- DEVELOPMENTAL STAGE: EXPRESSED IN DEVELOPING NEURAL RETINA AND
EMBRYONIC BRAIN TISSUE.

-!- SIMILARITY: CONTAINS 6 IMMUNOGLOBULIN-LIKE C2-TYPE DOMAINS.

-!- SIMILARITY: CONTAINS 5 FIBRONECTIN TYPE III-LIKE DOMAINS.

This SWISS-PROT entry is copyright. It is produced through a collaboration 
between the Swiss Institute of Bioinformatics and the EMBL outstation - 
the European Bioinformatics Institute. There are no restrictions on its
use by non-profit institutions as long as its content is in no way
modified and this statement is not removed. Usage by and for commercial
entities requires a license agreement (See http://www.isb-sib.ch/announce/
or send an email to license@isb-sib.ch) . 

EMBL; X58482; CAA41391.1; -.

EMBL; L08960; AAA48632.1; -.

HSSP; P20241; 1CFB.

INTERPRO; IPR001777; -.

INTERPRO; IPR003006; -.

PFAM; PF00041; fn3; 5.

PFAM; PF00047; ig; 6.

PRINTS; PR00014; FNTYPEIII.

Immunoglobulin domain; Glycoprotein; Signal; Cell adhesion; Repeat; 
Transmembrane; Alternative splicing. 



FT   SIGNAL        1     24
FT   CHAIN        25   1284       NG-CAM RELATED CELL ADHESION MOLECULE.
FT   DOMAIN       25   1143       EXTRACELLULAR (POTENTIAL).
FT   TRANSMEM   1144   1166       POTENTIAL.
FT   DOMAIN     1167   1284       CYTOPLASMIC (POTENTIAL).
FT   DOMAIN       56    125       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      155    220       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      261    323       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      351    415       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      445    508       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      536    599       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      638    699       FIBRONECTIN TYPE-III.
FT   DOMAIN      738    799       FIBRONECTIN TYPE-III.
FT   DOMAIN      837    906       FIBRONECTIN TYPE-III.
FT   DOMAIN      943   1006       FIBRONECTIN TYPE-III.
FT   DOMAIN     1057   1114       FIBRONECTIN TYPE-III.
FT   DISULFID     63    118       POTENTIAL.
FT   DISULFID    162    213       POTENTIAL.
FT   DISULFID    268    316       POTENTIAL.
FT   DISULFID    358    408       POTENTIAL.
FT   DISULFID    452    501       POTENTIAL.
FT   DISULFID    543    592       POTENTIAL.
FT   CARBOHYD     78     78       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    218    218       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    290    290       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    409    409       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    483    483       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    576    576       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    581    581       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    595    595       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    692    692       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    778    778       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    834    834       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    885    885       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    969    969       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    985    985       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    995    995       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD   1048   1048       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD   1059   1059       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD   1091   1091       N-LINKED (GLCNAC...) (POTENTIAL).
FT   VARSPLIC    612    621       MISSING (IN ISOFORM AS10).
FT   VARSPLIC   1027   1038       MISSING (IN ISOFORM AS12).
FT   VARSPLIC   1039   1131       MISSING (IN ISOFORM AS93).
FT   VARSPLIC   1202   1205       MISSING (IN ISOFORM AS-CYT2).
FT   CONFLICT    209    209       V -> E (IN REF. 2).
FT   CONFLICT    680    680       H -> Q (IN REF. 2).
SQ   SEQUENCE   1284 AA;  141851 MW;  A3570BF9C3D47A0F CRC64;



Query Match 8.2%; Score 561; DB 1; Length 1284;
Best Local Similarity 22.1%; Pred. No. 2.4e-23;
Matches 214; Conservative 138; Mismatches 389; Indels 226; Gaps 33;



Qy 30 PVIIEH-PIDVWSRGSPATLNCGAK PSTAKITWYKDGQPVITNKEQVNSHRIVLD 84 

I ! : I 1:1 : Ml II :l ::l :l 

Db 41 PTITQQSPKDYIVDPRENIVIQCEAKGKPPPS--FSWTRNGT HFDID 85 

Qy 85 TGSLFLLKVNSG KNGKDSDA- -GAYYCVASNEHGEVKSNEGSLK - -LAMLREDF 134 

: :| Ml II ::| I I I I II I II :: : I 
Db 86 KDAQVTMKPNSGTLWNIMNGVKAEAYEGVYQCTARNERGAAISNNIVIRPSRSPLWTKE 145 

Qy 135 RVRPRTVQALGGEMAVLECSPPRGFPEPWSWRKDDKELRIQDMPRYTLHSDGNLIIDPV 194 

:: I I: , |: II I II I I |:: I h h M :hl I 
Db 146 KLEPNHVRE--GDSLVLNCRPPVGLPPPIIFW-MDNAFQRLPQSERVSQGLNGDLYFSNV 202 

Qy 195 DRSDSGT -YQCVA- -NNMVGERVSNPARLSVFEKPKFEQEPKDMT VDV-GA 241

I: I f I I: : I : II il I: I :l I:: I 

Db 203 QPEDTRVDYICYARFNHTQTIQQKQPISVKVFSTKPVTERPPVLLTPMGSTSNKVELRGN 262 

Qy 242 AVLFDCRVTGDPQPQITWKRKNEPMPVTRAYIAKDNRGLRIERVQPSDEGEYVCYARNPA 301 

:| :| Mill:: :|| : : |:| | :| I I Mil 
Db 263 VLLLECIAAGLPTPVIRWIKEGGELPANRTFFENFKKTLKIIDVSEADSGNYKCTARNTL 322 

Qy 302 GTLEASAHLRVQAPPSFQTKPADQSVPAGGTATFECTLVGQPSPAYFWSKEGQQDLLFPS 361 

I: : |:|| : M : : I II III: I I 
Db 323 GSTHHVISVTVKAAPYWITAPRNLVLSPGEDGTLICRANGNPKPSISWLTNG VPI 377 

Qy 362 YVSADGRTKVSPTGTLTIEEVRQVDEGAYVCAGMNSAGSSLSKAALKATFETKGRVQKKK 421 

:: : ::' h M II I I 
Db 378 AIAPEDPSRKVDGDTIIFSAVQERSSAVYQCNASNEYG 415 

Qy 422 SKMGKQKQKNVQSIIKYLISAVTGNTPAKPPPTIEHGHQNQTLMVGSSAILPCQASGKPT 481

II:: I |:|| : :: :: I I" I II 
Db 416 : YLLANAFVNVLAEPPRILTPANKLYQVIADSPALIDCAYFGSPK 459 

Qy 482 PGISWLRDGLPIDITDSRISQHSTGSLHIADLKKPDTGVYTCIAKNEDGESTWSASLTVE 541 

I M : : I |:| I :| II |||:|:i: |:: I |: 
Db 460 PEIEWFRGVKGSILRGNEYVFHDNGTLEIPVAQKDSTGTYTCVARNKLGKTQNEVQLEVK 519 

Qy 542 DHT - SNAQFV RMPDPSNFPSSPTQPII 567 

II Ml :M I I 

Db 520 DPTMIIKQPQYKVIQRSAQASFECVIKHDPTLIPTVIWLKDNNELPDDERFLVGKDNLTI 579 

Qy 568 VNVTDTE— - V 575 

Mil : ■ = 
Db 580 MNVTDKDDGTYTCIVNTTLDSVSASAVLTWAAPPTPAIIYARPNPPLDLELTGQLERSI 639 

Qy 576 ELHWNAPSTSGAGPITGYIIQY-— YSPDLGQTWFNIPDYVASTEYRIKGLKPSHSYMF 631 

II I I III ::hl M : :| M M I I :| I 
Db 640 ELSW-VPGEEMNSPITNFVIEYEDGLHEPGVWHYQTEVPG--SHTTVQLK-LSPYVNYSF 695 

Qy 632 VIRAENEKGIGTPSVSSALVTTSKPAAQVALSDKNKMDMAIAEKRLTSEQ- - -LIKLEEV 688 

: I II I . : I |:h I I : : Ml :| M 

Db 696 RVIAVNEIGRSQP SEPSEQYLTKSANPDENPSNVQGIGSEPDNLVITWESL 746 



Qy 689 KTINSTAVRLFWKKRKLEELIDGYYIKWRGPPRTNDNQYVNVTSPSTENYVVSNLMPFTN 748

I I ] :| : II : |::: :| : hll I 

Db 747 KGFQSNGPGLQYK VSWR- -QKDVDDEWTSVWANVSKYIVSGTPTFVP 792



Qy 749 YEFFVIPYHSGVHSIHGAPSNSMDVL--TAEAPPSLPPEDVRIRMLNLTTLRISWKAPKA 806 

III :: : || | :|: : | | : | :|:: ::| | :: | 
Db 793 YEIKV -- QALNDIjGYAPEPS-EVIGHSGEDLPMVAPGNVQVHVINSTIAKVHWDPVPL 847

Qy 807 DGINGILKGFQIVIVGQAPNNNRNITTNERAASVTL FHLVTGM -- TYKIRVAA 857

: I 1:1:: I : : :| : :| | :: |: :||: | 
Db 848 KSVRGHLQGYK-VYYWKVQSLSRRSRRHVEKKILTFRGNKTFGMLPGLEPYSSYKLNVRV 906 

Qy 858 RSNGGVG 864 

: I I 
Db 907 VNGKGEG 913 



CAML_RAT

CAML_RAT STANDARD; PRT; 1259 AA.
Q05695;

01-FEB-1994 (Rel. 28, Created)

01-OCT-1994 (Rel. 30, Last sequence update)

01-OCT-2000 (Rel. 40, Last annotation update)

NEURAL CELL ADHESION MOLECULE L1 PRECURSOR (N-CAM L1).

L1CAM OR CAML1.

Rattus norvegicus (Rat).

Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Rattus.

[1]

SEQUENCE FROM N.A.

MEDLINE=91372414; PubMed=1894011;

Miura M., Kobayashi M., Asou H., Uyemura K.;

"Molecular cloning of cDNA encoding the rat neural cell adhesion

molecule L1. Two L1 isoforms in the cytoplasmic region are produced

by differential splicing.";

FEBS Lett. 289:91-95(1991).

-!- FUNCTION: CELL ADHESION MOLECULE WITH AN IMPORTANT ROLE IN THE
DEVELOPMENT OF THE NERVOUS SYSTEM, INVOLVED IN NEURON-NEURON
ADHESION, NEURITE FASCICULATION, OUTGROWTH OF NEURITES, ETC. BINDS
TO AXONIN ON NEURONS.

-!- SUBCELLULAR LOCATION: TYPE I MEMBRANE PROTEIN.

-!- ALTERNATIVE PRODUCTS: TWO ISOFORMS IN THE CYTOPLASMIC REGION ARE
PRODUCED BY DIFFERENTIAL SPLICING.

-!- TISSUE SPECIFICITY: THE SHORTER ISOFORM IS PREDOMINANTLY FOUND IN
THE BRAIN, WHILE THE LONGER ISOFORM IS FOUND IN THE PERIPHERAL
NERVOUS SYSTEM.

-!- SIMILARITY: CONTAINS 6 IMMUNOGLOBULIN-LIKE C2-TYPE DOMAINS.

-!- SIMILARITY: CONTAINS 5 FIBRONECTIN TYPE III-LIKE DOMAINS.

This SWISS-PROT entry is copyright. It is produced through a collaboration
between the Swiss Institute of Bioinformatics and the EMBL outstation -
the European Bioinformatics Institute. There are no restrictions on its 
use by non-profit institutions as long as its content is in no way 
modified and this statement is not removed. Usage by and for commercial 
entities requires a license agreement (See http://www.isb-sib.ch/announce/ 
or send an email to license@isb-sib.ch) . 

EMBL; X59149; CAA41860.1; -. 
PIR; S17655; S17655. 
HSSP; P20241; 1CFB. 
INTERPRO; IPR001777; -. 
INTERPRO; IPR003006; -. 
PFAM; PF00041; fn3; 4. 
PFAM; PF00047; ig; 6. 
PRINTS; PR00014; FNTYPEIII. 

Cell adhesion; Glycoprotein; Transmembrane; Repeat; Brain; 

Immunoglobulin domain; Signal; Alternative splicing. 

FT   SIGNAL        1     19       BY SIMILARITY.
FT   CHAIN        20   1259       NEURAL CELL ADHESION MOLECULE L1.
FT   DOMAIN       20   1122       EXTRACELLULAR (POTENTIAL).
FT   TRANSMEM   1123   1145       POTENTIAL.
FT   DOMAIN     1146   1259       CYTOPLASMIC (POTENTIAL).
FT   DOMAIN       50    120       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      150    215       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      256    318       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      346    410       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      440    503       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      531    599       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      827    896       FIBRONECTIN TYPE-III.
FT   DOMAIN      932    994       FIBRONECTIN TYPE-III.
FT   DOMAIN     1032   1093       FIBRONECTIN TYPE-III.
FT   SITE        553    555       CELL ATTACHMENT SITE (POTENTIAL).
FT   SITE        562    564       CELL ATTACHMENT SITE (POTENTIAL).
FT   CARBOHYD    100    100       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    202    202       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    246    246       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    293    293       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    432    432       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    489    489       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    504    504       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    670    670       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    725    725       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    776    776       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    824    824       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    848    848       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    875    875       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    968    968       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    978    978       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD   1021   1021       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD   1029   1029       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD   1072   1072       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD   1106   1106       N-LINKED (GLCNAC...) (POTENTIAL).
FT   VARSPLIC   1179   1182       MISSING (IN SHORT ISOFORM).
SQ   SEQUENCE   1259 AA;  140934 MW;  19681B022D8F24AB CRC64;



Query Match 8.1%; Score 557; DB 1; Length 1259;

Best Local Similarity 22.3%; Pred. No. 3.9e-23;

Matches 244; Conservative 156; Mismatches 429; Indels 264; Gaps 42;

Qy 30 PVIIEH-PIDWVSRGSPATLNCGAK-PSTAKITWYKDG— QPVITNKEQVNSHRIVLD 84 

III I I :| I I: : I III :| II:: :|: 
Db 35 PVITEQSPRRLWFPTDDISLKCEARGRPQVEFRWTKDGIHFKP KEELG- - -VWH 87 

Qy 85 — TGSLFLLKVNSGKNGKDSDA — GAYYCVASNEHGEVKSNEGSLKLAMLREDFRV 136 

:J : I :| I I I I III I hi : :: I 
Db 88 EAPYSGSFTI. EGNNSFAQRFQGIYRCYASNNLGTAMSHE IQLVAEGAPK 136 

Qy 137 RP RTVQALGGEMAVLECSPPRGFPEPWSWRKDDKELRIQDMPRYTLHSDGNLIID 192 

I : I:.- II II 1:11 : I : I I h I :: :|:| 
Db 137 WPKETVKPVEVEEGESWLPCNPPPSAAPLRIYW-MNSKILHIKQDERVSMGQNGDLYFA 195 



Qy 193 PVDRSDS--" GTYQCVANNMVGERVSNPARLSVFEKPKF-- 

I II: :' II : : II I : II: : 
Db 196 NVLTSDNHSDYICNAHFPGTRTIIQKEPIDLRV-KPTNSMIDRKPRLLFPTNSSSHLVAL 254 

Qy 239 VGAAVLFDCRVTGDPQPQITWKRKNEPMPVTRAYIAKDNRGLRIERVQPSDEGEYVCYAR 298 

I ::: :| I I I I I ::lll I I: h: | |:||| | | 
Db 255 QGQSLILECIAEGFPTPTIKWLHPSDPMPTDRVIYQNHNKTLQLLNVGEEDDGEYTCLAE 314 

Qy 299 NPAGTLEASAHLRVQAPPSFQTKPADQSVPAGGTATFECTLVGQPSPAYFWSKEGQQDLL 358 

II: : :: |:| I : II I II :| : |:| I I I : 
Db 315 NSLGSARHAYYVTVEAAPYWLQKPQSHLYGPGETARLDCQVQGRPQPEVTWRING — H 370 

Qy 359 FPSYVSADGRTKVSPTGTLTIEEVRQVDEGAYVCAGMNSAGSSLSKAALKATFETKGRVQ 418 

h I |:| : I: I I I I I: I : : h 

Db 371 SIEKVNKDQKYRIE-QGSLILSNVQPSDTMVTQCEARNQHGLLLANAYIYW-QLPARIL 428 

Qy 419 KKKSKMGKQKQKNVQSIIKYLISAVTGNTPAKPPPTIEHGHQNQTLMV--GSSAILPCQA 476 

I III I Ihl I hi 

Db 429 TK DNQTYMAVEGSTAYLLCKA 449

Qy 477 SGKPTPGISWLRDGLPIDITDSRISQHSTGSLHIADLKKPDTGVYTCIAKNEDGESTWSA 535 

I I I : M: : : I I :: I I I 1 1 : 1 1 1 I I I I : I I 
Db 450 FGAPVPSVQWLDEEGTTVLQDERFFPYANGHLGIRDLQANDTGRYFCQAANDQNNVTILA 509 
Qy 537 SLTVEDHTSNAQFVRMP DPSNFPS 560

:| h: I II III I 

Db 510 NLQVKEATQITQGPRSTIEKKGARVTFTCQASFDPSLQASITWRGDGRDLQERGDSDKYF 569 

Qy 561 SPTQPIIVN VTDTE 574 

I I ::| : :: 

Db 570 IEDGQLVIKSLDYSDQGDYSCVASIELDEVESRAQLLWGSPGPVPHLELSDRHLLKQSQ 629 

Qy 575 VELHWNAPSTSGAGPITGYIIQYYSPDLG-QTWFN— IPDYVASTEYRIRGLRPSHSYM 630 

I I I :|: II I h: :: : ||: :| II : II 
Db 630 VHLSW-SPAEDHNSPIEKYDIEFEDREMAPEKWFSLGKVPGNQTSTTLK— LSPYVHYT 685 

Qy 631 FVIRAENEKGIGTPSVSSALVTTSKPAAQVALSDKNKMDMAIAEKRLTSEQLIKLEEVKI 690 

I : II: I I II I I I : I :|| :|: I |: :| : :: 
Db 686 FRVTAINKYGPGEPSPVSETWTPE AAPEKNPVDVR-GEGNETNNMVITWRPLRW 739 

Qy 691 INSTAVRLFWRKRKLEELIDGYYIKWR'-GPPRTNDNQYVNVTSPSIENYVVSNLMPFTN 748 

:: I I ::|l I I I I I Mil I 
Db 740 MD WNAPQIQ YRVQWRPLGKQETWKEQTV SDPFLWSNTSTFVP 782 

Qy 749 YEFFVIPYHSGVHSIHGAPSNSMDV-LTAEAPPSLPPEDVRIRMLNLTTLRISWKAPKAD 807
1 III I:: I : : : i | : || | : | :|: : |: 

Db 783 YEIKV -- QAVNNQGKGPEPQVTIGYSGEDYPQVSPELEDITIFNSSTVLVRWRPVDLA 838

Qy 808 GINGILKGFQIVI—VGQAPNNNRNITTNERAASV-TLFH1.VTGM— -IYKIRVAARS 859 

: I 1:1: : I :: |:: : I :::|: :| : I I : 
Db 839 QVRGHLRGYNVTYWWKGSQRKHSKRHVHRSHMWPANTTSAILSGLRPYSSYHVEVQAFN 898 

Qy 860 NGGVGVSH GTSEVIMNQDTLEKHLAAQQENESFLYGLINKSHVPVIVIV 908 

1:1 : II: III: h II I: 

Db 899 GRGLGPASEWTFSTPEGVPGHPEAL HLECQSDTSLLLHWQPPLSHNGVLT - - 948 

Qy 909 AILIIFWIIIAYCYWRNSRNSDGRDRSFIRINDGSVHMASNNLWDVAQNPNQNPHYNTA 968 

I : : : I:: I ::| : ::|| I I : I 
Db 949 GfLLS YHPLDGESREQLFFNLSD * - PELRTHNL TNLNPDLQYRFQ 991 

Qy 969 GRMTMNNRNGQAL 981 

: I : 1:1: 
Db 992 LQATTHQGPGEAI 1004

Job time: 1222 sec

OM protein - protein search, using sw model

Search time 559.8
(without alignments)
271.520 Million cell updates/sec

Title: US-09-540-245A-17
Perfect score: 6860
Sequence: 1 MYYLGFYHTHTHTHTYINFD TAQRFRSIPRNNGIVTQEQT 1297

Scoring table: BLOSUM62
Gapop 10.0 , Gapext 0.5

Searched: 374700 seqs, 117207915 residues

Total number of hits satisfying chosen parameters:

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
Maximum Match 100%
Listing first 45 summaries

SPTREMBL_15:*
1: sp_archea:*
2: sp_bacteria:*
3: sp_fungi:*
4: sp_human:*
5: sp_invertebrate:*
6: sp_mammal:*
7: sp_mhc:*
8: sp_organelle:*
9: sp_phage:*
10: sp_plant:*
11: sp_rodent:*
12: sp_virus:*
13: sp_vertebrate:*
14: sp_unclassified:*

Pred. No. is the number of results predicted by chance to have a
score greater than or equal to the score of the result being printed,
and is derived by analysis of the total score distribution.
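
The "Query Match" percentages in the summaries appear to be each hit's score expressed as a percentage of the perfect score given in the search header (6860 for this query). The report does not state this formula; it is an inference that reproduces the printed values. A short Python sketch under that assumption, with a hypothetical function name:

```python
PERFECT_SCORE = 6860  # "Perfect score" from the search header


def query_match(score, perfect=PERFECT_SCORE):
    """Hit score as a percentage of the query's perfect (self-match) score.

    Assumed relationship, inferred from the printed figures rather than
    taken from any documentation of the search program.
    """
    return round(100.0 * score / perfect, 1)


# Scores of the three top-ranked hits in the summary listing.
for score in (6523.5, 4628, 2232):
    print(score, query_match(score))
```

Because the score and the percentage are printed side by side, this relationship offers a quick consistency check on digits that may have been misread during scanning.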



                         SUMMARIES

              Query
No.    Score  Match  Length  DB  ID      Description
  1   6523.5   95.1    1273   5  O44928  O44928 caenorhabdi
  2     4628   67.5     874   5  O01632  O01632 caenorhabdi
  3     2232   32.5     423   5  P91572  P91572 caenorhabdi
  4     1588   23.1    1395   5  O44924  O44924 drosophila
  5     1587   23.1    1395   5  Q9W213  Q9w213 drosophila
  6   1505.5   21.9    1612  11  O89026  O89026 mus musculu
  7   1500.5   21.9    1651   4  Q9Y6N7  Q9y6n7 homo sapien
  8   1483.5   21.6    1651  11  O55005  O55005 rattus norv
  9     1385   20.2    1060  11  Q9QZI3  Q9qzi3 rattus norv
 10     1323   19.3    1344  11  Q9Z2I4  Q9z2i4 mus musculu
 11     1249   18.2     859   5  Q9VPZ6  Q9vpz6 drosophila
 12   1120.5   16.3     823   5  Q9VQ10  Q9vq10 drosophila
 13      674    9.8    1493  11  P97798  P97798 mus musculu
 14    658.5    9.6    2016   5  Q9V4J9  Q9v4j9 drosophila
 15    658.5    9.6    2016   5  Q9NBA1  Q9nba1 drosophila
 16      642    9.4    1377  11  P97603  P97603 rattus norv
 17    640.5    9.3    1461   4  O00340  O00340 homo sapien
 18    639.5    9.3    1461   4  Q92859  Q92859 homo sapien
 19      635    9.3    1028  11  Q62682  Q62682 rattus norv
 20    632.5    9.2    1443  13  Q90610  Q90610 gallus gall
 21      616    9.0    1028  11  Q07409  Q07409 mus musculu
 22      609    8.9    1099  11  P97527  P97527 rattus norv
 23    608.5    8.9    1026  11  Q62845  Q62845 rattus norv
 24    608.5    8.9    1100   4  O94779  O94779 homo sapien
 25    607.5    8.9    1040  13  Q9W675  Q9w675 brachydanio
 26      607    8.8    1375   5  Q94537  Q94537 drosophila
 27      606    8.8    1026   4  O94780  O94780 homo sapien
 28    601.5    8.8    1252  11  Q9JLI1  Q9jli1 mus musculu
 29      596    8.7    1028  11  Q9JMB8  Q9jmb8 mus musculu
 30                              Q63198  Q63198 rattus norv
 31      592    8.6    1018   6  Q28106  Q28106 bos taurus
 32      589    8.6    1272  13  Q90924  Q90924 gallus gall
 33      588    8.6    2222   5  O97394  O97394 drosophila
 34    587.5    8.6    2221   5  Q9U1M1  Q9u1m1 drosophila
 35      586                 11  P97528  P97528 rattus norv
 36      583    8.5    1028   4  Q9UQ52  Q9uq52 homo sapien
 37    580.5    8.5    1240      O14631  O14631 homo sapien
 38    580.5    8.5    1427  13  Q91562  Q91562 xenopus lae
 39      577    8.4    1248   6  Q9XT41  Q9xt41 cercopithec
 40      572    8.3    1369  13  O42414  O42414 gallus gall
 41      565    8.2    1009  13  O93250  O93250 xenopus lae
 42      564    8.2    1259  11  Q9QY38  Q9qy38 mus musculu
 43      562    8.2     920   4  Q9P232  Q9p232 homo sapien
 44      562    8.2    1250  11  O88971  O88971 mus musculu
 45      557    8.1    1445  11  Q63155  Q63155 rattus norv


RESULT 1
O44928
ID   O44928     PRELIMINARY;   PRT;  1273 AA.
AC   O44928;
DT   01-JUN-1998 (TrEMBLrel. 06, Created)
DT   01-JUN-1998 (TrEMBLrel. 06, Last sequence update)
DT   01-JUN-2000 (TrEMBLrel. 14, Last annotation update)
DE   SAX-3.
GN   SAX-3.
OS   Caenorhabditis elegans.
OC   Eukaryota; Metazoa; Nematoda; Chromadorea; Rhabditida; Rhabditoidea;
OC   Rhabditidae; Peloderinae; Caenorhabditis.
OX   NCBI_TaxID=6239;
RN   [1]
RP   SEQUENCE FROM N.A.
RX   MEDLINE=98117250; PubMed=9458046;
RA   Zallen J.A., Yi B.A., Bargmann C.I.;
RT   "The conserved immunoglobulin superfamily member SAX-3/Robo directs
RT   multiple aspects of axon guidance in C. elegans.";
RL   Cell 92:217-227(1998).
DR   EMBL; AF041053; AAC38848.1; -.
DR   HSSP; P56276; 1TLK.
DR   INTERPRO; IPR001777; -.
DR   INTERPRO; IPR003006; -.
DR   PFAM; PF00041; fn3; 3.
DR   PFAM; PF00047; ig; 5.
DR   PRINTS; PR00014; FNTYPEIII.
SQ   SEQUENCE   1273 AA;  139427 MW;  013E766B51A7BAD7 CRC64;

  Query Match             95.1%;  Score 6523.5;  DB 5;  Length 1273;
  Best Local Similarity   97.3%;  Pred. No. 0;
  Matches 1243;  Conservative 1;  Mismatches 1;  Indels 33;  Gaps

Qy         24 NASNLAPVIIEHPIDVWSRGSPATLNCGAKPSTAKITWYKDGQPVITNKEQVNSHRIVL 83
              :|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         25 DASNLAPVIIEHPIDVWSRGSPATLNCGAKPSTAKITWYKDGQPVITNKEQVNSHRIVL 84

Qy         84 DTGSLFLLKVNSGKNGKDSDAGAYYCVASNEHGEVKSNEGSLKLAMLREDFRVRPRTVQA 143
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         85 DTGSLFLLKVNSGKNGKDSDAGAYYCVASNEHGEVKSNEGSLKLAMLREDFRVRPRTVQA 144
[Alignment rows Qy 144-1219 / Db 145-1195: residues and match lines are largely illegible in the scan; the recoverable rows follow.]

Db        836 GVSHGTSEVIMNQDTLEKHLAAQQENESFLYGLINKSHVPVIVIVAILIIFWIIIAYCY 895

Db        896 WRNSRNSDGKDRSFIKINDGSVHMASNNLWDVAQNPNQNPMYNTAGRMTMNNRNGQALYS 955

Qy       1164 GGHVYD----TATRRQLNRGSTPREDTYDSVSDGAFARVDVNARPTSRNRNLGGRPLKGK 1219

Qy       1220 RDDDSQRSSLMMDDDGGSSEADGENSEGDVPRGGVRKAVPRMGISASTLAHSCYGTNGTA 1279
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1196 RDDDSQRSSLMMDDDGGSSEADGENSEGDVPRGGVRKAVPRMGISASTLAHSCYGTNGTA 1255

Qy       1280 QRFRSIPRNNGIVTQEQT 1297
              ||||||||||||||||||
Db       1256 QRFRSIPRNNGIVTQEQT 1273



RESULT 2
O01632
ID   O01632     PRELIMINARY;   PRT;   874 AA.
AC   O01632;
DT   01-JUL-1997 (TrEMBLrel. 04, Created)
DT   01-JUL-1997 (TrEMBLrel. 04, Last sequence update)
DT   01-JUN-2000 (TrEMBLrel. 14, Last annotation update)
DE   CODED FOR BY C. ELEGANS CDNA CEESC12R.
GN   ZK377.2.
OS   Caenorhabditis elegans.
OC   Eukaryota; Metazoa; Nematoda; Chromadorea; Rhabditida; Rhabditoidea;
OC   Rhabditidae; Peloderinae; Caenorhabditis.
OX   NCBI_TaxID=6239;
RN   [1]
RP   SEQUENCE FROM N.A.
RC   STRAIN=BRISTOL N2;
RX   MEDLINE=94150718; PubMed=7906398;
RA   Wilson R., Ainscough R., Anderson K., Baynes C., Berks M.,
RA   Bonfield J., Burton J., Connell M., Copsey T., Cooper J., Coulson A.,
RA   Craxton M., Dear S., Du Z., Durbin R., Favello A., Fulton L.,
RA   Gardner A., Green P., Hawkins T., Hillier L., Jier M., Johnston L.,
RA   Jones M., Kershaw J., Kirsten J., Laister N., Latreille P.,
RA   Lightning J., Lloyd C., Mcmurray A., Mortimore B., O'Callaghan M.,
RA   Parsons J., Percy C., Rifken L., Roopra A., Saunders D., Shownkeen R.,
RA   Smaldon N., Smith A., Sonnhammer E., Staden R., Sulston J.,
RA   Thierry-Mieg J., Thomas K., Vaudin M., Vaughan K., Waterston R.,
RA   Watson A., Weinstock L., Wilkinson-Sproat J., Wohldman P.;
RT   "2.2 Mb of contiguous nucleotide sequence from chromosome III of C.
RT   elegans.";
RL   Nature 368:32-38(1994).
RN   [2]
RP   SEQUENCE FROM N.A.
RC   STRAIN=BRISTOL N2;
RA   Nhan M., Hawkins J.;
RL   Submitted (FEB-1997) to the EMBL/GenBank/DDBJ databases.
RN   [3]
RP   SEQUENCE FROM N.A.
RC   STRAIN=BRISTOL N2;
RA   Waterston R.;
RL   Submitted (FEB-1997) to the EMBL/GenBank/DDBJ databases.
RN   [4]
RP   SEQUENCE FROM N.A.
RC   STRAIN=BRISTOL N2;
RA   Waterston R.;
RL   Submitted (APR-1997) to the EMBL/GenBank/DDBJ databases.
DR   EMBL; U88183; AAB52657.1; -.
DR   HSSP; P56276; 1TLK.
DR   INTERPRO; IPR001777; -.
DR   INTERPRO; IPR003006; -.
DR   PFAM; PF00041; fn3; 3.
DR   PFAM; PF00047; ig; 1.
DR   PRINTS; PR00014; FNTYPEIII.
SQ   SEQUENCE   874 AA;  95861 MW;  BC72270818D734C9 CRC64;

  Query Match             67.5%;  Score 4628;  DB 5;  Length 874;
  Best Local Similarity   100.0%;  Pred. No. 0;
  Matches 874;  Conservative 0;  Mismatches 0;  Indels 0;  Gaps

Qy        424 MGKQKQKNVQSIIKYLISAVTGNTPAKPPPTIEHGHQNQTLMVGSSAILPCQASGKPTPG 483
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db          1 MGKQKQKNVQSIIKYLISAVTGNTPAKPPPTIEHGHQNQTLMVGSSAILPCQASGKPTPG 60

Qy        484 ISWLRDGLPIDITDSRISQHSTGSLHIADLKKPDTGVYTCIAKNEDGESTWSASLTVEDH 543
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         61 ISWLRDGLPIDITDSRISQHSTGSLHIADLKKPDTGVYTCIAKNEDGESTWSASLTVEDH 120

[Alignment rows Qy 544-1297 / Db 121-874: residues and match lines are largely illegible in the scan; one fragment is recoverable.]

Db        781 ...GSSEADGENSEGDVPRGGVRKAVPRMGI 840



RESULT 3
P91572
ID   P91572     PRELIMINARY;   PRT;   423 AA.
AC   P91572;
DT   01-MAY-1997 (TrEMBLrel. 03, Created)
DT   01-MAY-1997 (TrEMBLrel. 03, Last sequence update)
DT   01-MAY-2000 (TrEMBLrel. 13, Last annotation update)
DE   SIMILAR TO THE IMMUNOGLOBULIN SUPERFAMILY.
GN   ZK377.3.
OS   Caenorhabditis elegans.
OC   Eukaryota; Metazoa; Nematoda; Chromadorea; Rhabditida; Rhabditoidea;
OC   Rhabditidae; Peloderinae; Caenorhabditis.
OX   NCBI_TaxID=6239;
RN   [1]
RP   SEQUENCE FROM N.A.
RC   STRAIN=BRISTOL N2;
RX   MEDLINE=94150718; PubMed=7906398;
RA   Wilson R., Ainscough R., Anderson K., Baynes C., Berks M.,
RA   Bonfield J., Burton J., Connell M., Copsey T., Cooper J., Coulson A.,
RA   Craxton M., Dear S., Du Z., Durbin R., Favello A., Fulton L.,
RA   Gardner A., Green P., Hawkins T., Hillier L., Jier M., Johnston L.,
RA   Jones M., Kershaw J., Kirsten J., Laister N., Latreille P.,
RA   Lightning J., Lloyd C., Mcmurray A., Mortimore B., O'Callaghan M.,
RA   Parsons J., Percy C., Rifken L., Roopra A., Saunders D., Shownkeen R.,
RA   Smaldon N., Smith A., Sonnhammer E., Staden R., Sulston J.,
RA   Thierry-Mieg J., Thomas K., Vaudin M., Vaughan K., Waterston R.,
RA   Watson A., Weinstock L., Wilkinson-Sproat J., Wohldman P.;
RT   "2.2 Mb of contiguous nucleotide sequence from chromosome III of C.
RT   elegans.";
RL   Nature 368:32-38(1994).
RN   [2]
RP   SEQUENCE FROM N.A.
RC   STRAIN=BRISTOL N2;
RA   Nhan M., Hawkins J.;
RL   Submitted (FEB-1997) to the EMBL/GenBank/DDBJ databases.
RN   [3]
RP   SEQUENCE FROM N.A.
RC   STRAIN=BRISTOL N2;
RA   Waterston R.;
RL   Submitted (FEB-1997) to the EMBL/GenBank/DDBJ databases.
RN   [4]
RP   SEQUENCE FROM N.A.
RC   STRAIN=BRISTOL N2;
RA   Waterston R.;
RL   Submitted (APR-1997) to the EMBL/GenBank/DDBJ databases.
DR   EMBL; U88183; AAB52658.1; -.
DR   HSSP; P56276; 1TLK.
DR   INTERPRO; IPR003006; -.
DR   PFAM; PF00047; ig; 4.
SQ   SEQUENCE   423 AA;  46544 MW;  EB4530DB6BD575E5 CRC64;

  Query Match             32.5%;  Score 2232;  DB 5;  Length 423;
  Best Local Similarity   100.0%;  Pred. No. 2.6e-151;
  Matches 423;  Conservative 0;  Mismatches 0;  Indels 0;  Gaps



Qy          1 MYYLGFYHTHTHTHTYINFDKIPNASNLAPVIIEHPIDVWSRGSPATLNCGAKPSIAKI 60

Db          1 MYYLGFYHTHTHTHTYINFDKIP[...] 60

Qy         61 TWYKDGQPVITNKEQVNSHRIVLDTGSLFLLKVNSGKNGKDSDAGAYYCVASNEHGEVKS 120

Db         61 TWYKDGQPVITNKEQVNSHRIVLDTGSLFLLKVNSGKNGKDSDAGAYYCVASNEHGEVKS 120

Qy        121 NEGSLKLAMLREDFRVRPRTVQALGGEMAVLECSPPRGFPEPWSWRKDDKELRIQDMPR 180
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        121 NEGSLKLAMLREDFRVRPRTVQALGGEMAVLECSPPRGFPEPWSWRKDDKELRIQDMPR 180

Qy        181 YTLHSDGNLIIDPVDRSDSGTYQCVANNMVGERVSNPARLSVFEKPKFEQEPKDMTVDVG 240
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        181 YTLHSDGNLIIDPVDRSDSGTYQCVANNMVGERVSNPARLSVFEKPKFEQEPKDMTVDVG 240

Qy        241 AAVLFDCRVTGDPQPQITWKRKNEPMPVTRAYIAKDNRGLRIERVQPSDEGEYVCYARNP 300
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        241 AAVLFDCRVTGDPQPQITWKRKNEPMPVTRAYIAKDNRGLRIERVQPSDEGEYVCYARNP 300

Qy        301 AGTLEASAHLSVQAPPSFQTKPADQSVPAGGTATFECTLVGQPSPAYFWSKEGQQDLLFP 360
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        301 AGTLEASAHLSVQAPPSFQTKPADQSVPAGGTATFECTLVGQPSPAYFWSKEGQQDLLFP 360

Qy        361 SYVSADGRTKVSPTGTLTIEEVRQVDEGAYVCAGMNSAGSSLSKAALKATFETKGRVQKK 420
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        361 SYVSADGRTKVSPTGTLTIEEVRQVDEGAYVCAGMNSAGSSLSKAALKATFETKGRVQKK 420

Qy        421 KSK 423
              |||
Db        421 KSK 423
RESULT 4
O44924
ID   O44924     PRELIMINARY;   PRT;  1395 AA.
AC   O44924;
DT   01-JUN-1998 (TrEMBLrel. 06, Created)
DT   01-JUN-1998 (TrEMBLrel. 06, Last sequence update)
DT   01-JUN-2000 (TrEMBLrel. 14, Last annotation update)
DE   ROUNDABOUT 1.
GN   ROBO1.
OS   Drosophila melanogaster (Fruit fly).
OC   Eukaryota; Metazoa; Arthropoda; Tracheata; Hexapoda; Insecta;
OC   Pterygota; Neoptera; Endopterygota; Diptera; Brachycera; Muscomorpha;
OC   Ephydroidea; Drosophilidae; Drosophila.
OX   NCBI_TaxID=7227;
RN   [1]
RP   SEQUENCE FROM N.A.
RX   MEDLINE=98117249; PubMed=9458045;
RA   Kidd T., Brose K., Mitchell K.J., Fetter R.D., Tessier-Lavigne M.,
RA   Goodman C.S., Tear G.;
RT   "Roundabout controls axon crossing of the CNS midline and defines a
RT   novel subfamily of evolutionarily conserved guidance receptors.";
RL   Cell 92:205-215(1998).
DR   EMBL; AF040989; AAC38849.1; -.
DR   HSSP; P56276; 1TLK.
DR   FLYBASE; FBgn0005631; robo.
DR   INTERPRO; IPR001777; -.
DR   INTERPRO; IPR003006; -.
DR   PFAM; PF00041; fn3; 3.
DR   PFAM; PF00047; ig; 5.
DR   PRINTS; PR00014; FNTYPEIII.
SQ   SEQUENCE   1395 AA;  151778 MW;  B820E234A5218983 CRC64;

  Query Match             23.1%;  Score 1588;  DB 5;  Length 1395;
  Best Local Similarity   31.0%;  Pred. No. 1.7e-104;
  Matches 421;  Conservative 195;  Mismatches 527;  Indels 216;  Gaps 41;

Qy 29 APVIIEHPIDVWSRGSPATLNC--GAKPSTAKITWYKDGQPVITNKEQVNSHRIVLDTG 86

:l Mill hll : HUM II I Nihil II::: III: I 
Db 55 SPRIIEHPTDLWKKNEPATLNCKVEGKPEPT-IEWFKDGEPVSTNEKK--SHRVQFKDG 111

Qy 87 SLFLLKVNSGKNGKDSDAGAYYCVASNEHGEVKSNEGSLKLAMLREDFRVRPRTVQALGG 146

:|| : II I: I I hill I h I l|::hlhllll h : I 
Db 112 ALFFYRTMQGK--KEQDGGEYWCVAKNRVGQAVSRHASLQIAVLRDDFRVEPKDTRVAKG 169 

Qy 147 EMAVLECSPPRGFPEPWSWRKDD------KELRIQDMPRYTLHSDGNLIIDPVDRSDSG 200

I hill IhUII : I II I : I : II: h I I 
Db 170 ETALLECGPPKGIPEPTLIWIKDGVPLDDLKAMSFGASSRVRIVDGGNLLISNVEPIDEG 229 

Qy 201 TYQCVANNMVGERVSNPARLSVFEKPKFEQEPKDMTVDVGAAVLFDCRVTGDPQPQITWK 260
hhl hll I h 1:1 I II I :||ll : I I I I III I:: I 
Db 230 NYKCIAQNLVGTRESSYAKLIVQVKPYFMKEPRDQVMLYGQTATFHCSVGGDPPPKVLWK 289

Qy 261 RKNEPMPVTRAYIAKDNRGLRIERVQPSDEGEYVCYARNPAGTLEASAHLRVQAPPSFQT 320

:: :lhll I I : I I : hill III I I I : I I I I MM 
Db 290 KEEGNIPVSRARILHDEKSLEISNITPTDEGTYVCEAHNNVGQISARASLIVHAPPNFTK 349

Qy 321 KPADQSVPAGGTATFECTLVGQPSPAYFWSKEGQQDLLFPSYVSADGRTKVSPTGTLTIE 380

:h:: II I I I h Ihlll hlh h II h III I 
Db 350 RPSNKKVGLNGWQLPCMASGNPPPSVFWTREGVSTLMFPN-SSHGRQYVAADGTLQIT 407

Qy 381 EVRQVDEGAYVCAGMNSAGSSLSKAALRATFETRGRVQKKKSKMGKQKQRNVQSIIRYLI 440

:IH III III: : II : h : 
Db 408 DVRQEDEGYYVCSAFSWDSSTVRVFLQVS 437

Qy 441 SAVTGNTPAKPPPTIEHGHQNQTLMVGSSAILPCQASGKPTPGISWLRDGLPIDITDSRI 500

: Hll h I Mil II I 1 1 f : I : I hi Mil: :| 
Db 438 SVDERPPPIIQIGPANQTLPRGSVATLPCRATGNPSPRIRWFHDGHAVQ-AGNRY 491

Qy 501 SQHSTGSLHIADLKKPDTGVYTCIARNEDGESTWSASLTVEDHTSNAQFVRMPDPSNFPS 560

I II : II: hi III I I IhMhllll I : I III :|: 
Db 492 SIIQGSSLRVDDLQLSDSGTYTCTASGERGETSWAATLTVERPGSTS-LHRAADPSTYPA 550

Qy 561 SPTQPIIVNVTDTEVELHW--NAPSTSGAGPITGYIIQYYSPDLGQTWFNIPDYVASTEY 618

I I Ml: I : I I : III II ::|:INI I I h 



Db 551 PPGTPKVLNVSRTSISLRWAKSQERPGAVGPIIGYTVEYFSPDLQTGWIVAAHRVGDTQV 610 

Qy 619 RIRGLRPSHSYMFVIRAENEKGIGTPSVSSALVTTSRPAAQVALSDRNRMDMAIAERRLT 678 

I II I MMIM M II I :: h: I II 

Db 611 TISGLTPGTSYVFLVRAENTQGISVPSGLSNVIKTIEADFDAASAN----DLSAARTLLT 666

Qy 679 SEQLIKLEEVKTINSTAVRLFWRKRKL-EELIDGYYIKWRGPPRTNDNQY--VNVTSPS 734 

: M : 1 1 1 1 1 1 I |: ::| I :: MM I 

Db 667 GRS-VELIDASAINASAVRLEWMLRVSADERYVEGLRIHYR-DASVPSAQYHSITVNDAS 724 

Qy 735 TENYWSNLMPFTNYEFFVIPYHSGVHSIHGAPSNSMDVLTAEAPPSLPPEDVRIRMLNL 794 

M II' :| Hlh h :l I Mil II I II IMM I I 
Db 725 AESFWGNLKKYTRYEFFLTPF---FETIEGQPSNSRTALTYEDVPSAPPDNIQIGMYNQ 781

Qy 795 TTLRISWRAPKADGINGILRGFQI-VIVGQAPNNNRNITTNERAASVTLFHLVTGMTYKI 853 

I MM III l::l II hi I II I :| || I : 
Db 782 TAGWVRWTPPPSQHHNGNLYGYKIEVSAGNTMKVIANMTLNATTTSVLLNNLTTGAVYSV 841 

Qy 854 RVAARSNGGVGVSHGTSEVIMNQDTLERHLA AQQENESFLYGLINRSHVP- 903 

I: : : I I : I: I I M : I I ::| 

Db 842 RLNSFTKAGDGPYSKPI SLFMD • PTHHVHPPRAHPSGTHDGRHEGQDLTYH ■ • NNGNI PP 898 

Qy 904 VIVIVAILIIFWIIIAYCYWRNSRNSDGKDRSFIKIND 942

hi ; :|:: : |: I" : ::| 
Db 899 GDINPTTHRKTTDYLSGPWLMVLVCIVLLVLVISAAISMVYFKRRHQMTRELGHLSWSD 958 

Qy 943 GSVHMASNN----LWDVAQNPNQNPMYNTAGRMTMNNRNGQALYSLTPNAQDFFNNCDDY 998

: MMI : : Ml : : I I M :|| I 
Db 959 NEITALNINSRESLW IDHHRGWRTADTDKDSGLSESKLLSHVNSSQSNYNNSD- - 1011 

Qy 999 SGTMHRPGSEHHYHYAQLTGGPGNAMSTFYG-NQYHDDPSPYATTTLV 1045 

II I- lh: ::||| : IMIIII! :: 

Db 1012 GGT "--DYAEV---DTRNLTTFYNCRKSPDNPTPYATTMIIGTSSSETCTRTT 1058 

Qy 1046 - LSNQQPA- -WLNDKMLRAPAMPTN PVPPE--PPARYAD 1079 

Mil: IMM I III lh 
Db 1059 SISADRDSGTHSPYSDAFAGQVPAVPWKSNYLQYPVEPINWSEFLPPPPEHPPPSSTYG 1118 

Qy 1080 HTAG- ■ RRSRSSRASDGRG TLNGGLHHRTSGSQRS DSPPHTDVSY 1122 

: I II I I I || ;| :| : I I 

Db 1119 YAQGSPESSRKSSKSAGSGISTNQSILNASIHSSSSGGFSAWGVSPQYAVACPPENVYSN 1178 

Qy 1123 VQLHSSDGTGSSRERTGERRTPPNKTLMDFIPPPPSNPPPPGGHVYDTATRRQ 1175 

: lh: MMI I I II h 11:1 
Db 1179 PLSAVAGGTQNRYQITPTNQHPPQLPAY ■ FATTGPGGAVPP - NHL - PFATQRHAASEYQA 1235 

Qy 1176 -LNRGSTPREDTYDS ■ VSDGAFARVDVNA- ■ -RPTSRNRNL- • ■ 1211 

II : :| I :| : I I: Ml I : 

Db 1236 GLNAARCAQSRACNSCDALATPSPMQPPPPVPVPEGWYQPVHPNSHPMHPTSSNHQIYQC 1295 

Qy 1212 GGRPLKGKRDDDSQRSSLMMDDDGGSSEADGENSEGDVP 1250 

I I : I ::M h: I : I 
Db 1296 SSECSDHSRSSQSHKRQLQLEEHGSSAKQRGGHHRRRAP 1334 



RESULT 5
Q9W213
ID   Q9W213     PRELIMINARY;   PRT;  1395 AA.
AC   Q9W213;
DT   01-MAY-2000 (TrEMBLrel. 13, Created)
DT   01-MAY-2000 (TrEMBLrel. 13, Last sequence update)
DT   01-OCT-2000 (TrEMBLrel. 15, Last annotation update)
DE   ROBO PROTEIN.
OS   Drosophila melanogaster (Fruit fly).
OC   Eukaryota; Metazoa; Arthropoda; Tracheata; Hexapoda; Insecta;
OC   Pterygota; Neoptera; Endopterygota; Diptera; Brachycera; Muscomorpha;
OC   Ephydroidea; Drosophilidae; Drosophila.
OX   NCBI_TaxID=7227;
RN   [1]
RP   SEQUENCE FROM N.A.
RC   STRAIN=BERKELEY;
RX   MEDLINE=20196006; PubMed=10731132;
RA   Adams M.D., Celniker S.E., Holt R.A., Evans C.A., Gocayne J.D.,
RA   Amanatides P.G., Scherer S.E., Li P.W., Hoskins R.A., Galle R.P.,
RA   George R.A., Lewis S.E., Richards S., Ashburner M., Henderson S.N.,
RA   Sutton G.G., Wortman J.R., Yandell M.D., Zhang Q., Chen L.X.,
RA   Brandon R.C., Rogers Y.-H.C., Blazej R.G., Champe M., Pfeiffer B.D.,
RA   Wan K.H., Doyle C., Baxter E.G., Helt G., Nelson C.R., Miklos G.L.G.,
RA   Abril J.F., Agbayani A., An H.-J., Andrews-Pfannkoch C., Baldwin D.,
RA   Ballew R.M., Basu A., Baxendale J., Bayraktaroglu L., Beasley E.M.,
RA   Beeson K.Y., Benos P.V., Berman B.P., Bhandari D., Bolshakov S.,
RA   Borkova D., Botchan M.R., Bouck J., Brokstein P., Brottier P.,
RA   Burtis K.C., Busam D.A., Butler H., Cadieu E., Center A., Chandra I.,
RA   Cherry J.M., Cawley S., Dahlke C., Davenport L.B., Davies P.,
RA   de Pablos B., Delcher A., Deng Z., Mays A.D., Dew I., Dietz S.M.,
RA   Dodson K., Doup L.E., Downes M., Dugan-Rocha S., Dunkov B.C., Dunn P.,
RA   Durbin K.J., Evangelista C.C., Ferraz C., Ferriera S., Fleischmann W.,
RA   Fosler C., Gabrielian A.E., Garg N.S., Gelbart W.M., Glasser R.,
RA   Glodek A., Gong F., Gorrell J.H., Gu Z., Guan P., Harris M.,
RA   Harris N.L., Harvey D., Heiman T.J., Hernandez J.R., Houck J.,
RA   Hostin D., Houston K.A., Howland T.J., Wei M.-H., Ibegwam C.,
RA   Jalali M., Kalush F., Karpen G.H., Ke Z., Kennison J.A., Ketchum K.A.,
RA   Kimmel B.E., Kodira C.D., Kraft C., Kravitz S., Kulp D., Lai Z.,
RA   Lasko P., Lei Y., Levitsky A.A., Li J., Li Z., Liang Y., Lin X.,
RA   Liu X., Mattei B., McIntosh T.C., McLeod M.P., McPherson D.,
RA   Merkulov G., Milshina N.V., Mobarry C., Morris J., Moshrefi A.,
RA   Mount S.M., Moy M., Murphy B., Murphy L., Muzny D.M., Nelson D.L.,
RA   Nelson D.R., Nelson K.A., Nixon K., Nusskern D.R., Pacleb J.M.,
RA   Palazzolo M., Pittman G.S., Pan S., Pollard J., Puri V., Reese M.G.,
RA   Reinert K., Remington K., Saunders R.D.C., Scheeler F., Shen H.,
RA   Shue B.C., Siden-Kiamos I., Simpson M., Skupski M.P., Smith T.,
RA   Spier E., Spradling A.C., Stapleton M., Strong R., Sun E.,
RA   Svirskas R., Tector C., Turner R., Venter E., Wang A.H., Wang X.,
RA   Wang Z.-Y., Wassarman D.A., Weinstock G.M., Weissenbach J.,
RA   Williams S.M., Woodage T., Worley K.C., Wu D., Yang S., Yao Q.A.,
RA   Ye J., Yeh R.-F., Zaveri J.S., Zhan M., Zhang G., Zhao Q., Zheng L.,
RA   Zheng X.H., Zhong F.N., Zhong W., Zhou X., Zhu S., Zhu X., Smith H.O.,
RA   Gibbs R.A., Myers E.W., Rubin G.M., Venter J.C.;
RT   "The genome sequence of Drosophila melanogaster.";
RL   Science 287:2185-2195(2000).
DR   EMBL; AE003458; AAF46887.1; -.
DR   HSSP; P56276; 1TLK.
DR   FLYBASE; FBgn0005631; robo.
DR   INTERPRO; IPR001777; -.
DR   INTERPRO; IPR003006; -.
DR   PFAM; PF00041; fn3; 3.
DR   PFAM; PF00047; ig; 5.
DR   PRINTS; PR00014; FNTYPEIII.
SQ   SEQUENCE   1395 AA;  151759 MW;  25CED7DEB44F13F0 CRC64;

  Query Match             23.1%;  Score 1587;  DB 5;  Length 1395;
  Best Local Similarity   31.0%;  Pred. No. 2e-104;
  Matches 421;  Conservative 194;  Mismatches 528;  Indels 216;  Gaps 41;

Qy 29 APVIIEHPIDWVSRGSPATLNC--GAKPSTAKITWYKDGQPVITNKEQVNSHRIVLDTG 86 

: lllll hll : llllll II I hllhll II::: III: I 
Db 55 SPRIIEHPTDLWKRNEPATLNCKVEGKPEPT-IEWFKDGEPVSTNEKK-SHRVQFKDG 111 

Qy 87 SLFLLKVNSGKNGKDSDAGAYYCVASNEHGEVKSNEGSLKLAMLREDFRVRPRTVQALGG 146 

:H : 111:11 hill I I: I 1 1 : : 1 : 1 1 : 1 1 1 1 |: : I 
Db 112 ALFFYRTMQGK--KEQDGGEYWCVAKNRVGQAVSRHASLQIAVLRDDFRVEPKDTRVAKG 169 

Qy 147 EMAVLECSPPRGFPEPWSWRKDD------KELRIQDMPRYTLHSDGNLIIDPVDRSDSG 200

I 1:111 Ihl Ml : I II I : I : llhl |: I I 
Db 170 ETALLECGPPKGIPEPTLIWIKDGVPLDDLKAMSFGASSRVRIVDGGNLLISNVEPIDEG 229 

Qy 201 TYQCVANNMVGERVSNPARLSVFEKPRFEQEPRDMTVDVGAAVLFDCRVTGDPQPQITWK 260 

1:1:1 hll I I: hi I II I :|lll : I llllll h: II 
Db 230 NYKCIAQNLVGTRESSYAKLIVQVKPYFMKEPKDQVMLYGQTATFHCSVGGDPPPKVLWR 289 

Qy 261 RKNEPMPVTRAYIAKDNRGLRIERVQPSDEGEYVCYARNPAGTLEASAHLRVQAPPSFQT 320 
:: :ll:ll I I : I I : l:lll lllll I : I I I I llhl 
Db 290 KEEGNIPVSRARILHDEKSLEISNITPTDEGTYVCEAHNNVGQISARASLIVHAPPNFTK 349



Qy 321 RPADQSVPAGGTATFECTLVGQPSPAYFWSKEGQQDLLFPSYVSADGRTKVSPTGTLTIE 380 

I ■!' I II]: Ihlll hlh h II h III I 
Db 350 RPSNKKVGLNGWQLPCMASGNPPPSVFWTKEGVSTLMFPN- - SSHGRQHVAADGTLQIT 407 

Qy 381 EVRQVDEGAYVCAGMNSAGSSLSKAALKATFETKGRVQRKKSKMGRQKQKNVQSIIKYLI 440 

:IM III Ml: : II : h : 
Db 408 DVRQEDEGYYVCSAFSWDSSTVRVFLQVS 437 

Qy 441 SAVTGNTPAKPPPTIEHGHQNQTLMVGSSAILPCQASGKPTPGISWLRDGLPIDITDSRI 500 

: 'AW h I MM II I llhhl hi IMI: :| 
Db 438 SLDERPPPIIQIGPANQTLPRGSVATLPCRATGNPSPRIKWFHDGHAVQ-AGNRY 491 

Qy 501 SQHSTGSLHIADLKRPDTGVYTCIAKNEDGESTWSASLTVEDHTSNAQFVRMPDPSNFPS 560 

I II HI: hi III I ! imhhll!! I : I III :h 
Db 492 SIIQGSSLRVDDLQLSDSGTYTCTASGERGETSWAATLTVEKPGSTS-LHRAADPSTYPA 550 

Qy 561 SPTQPIIVNVTDTEVELHW--NAPSTSGAGPITGYIIQYYSPDLGQTWFNIPDYVASTEY 618 

I I ::lh'l :l|: Mill ::|:|||| I I |: 

Db 551 PPGTPKVLNVSRTSISLRWAKSQEKPGAVGPIIGYTVEYFSPDLQTGWIVAAQRVGDTQV 610 

Qy 619 RIKGLKPSHSYMFVIRAENEKGIGTPSVSSALVTTSKPAAQVALSDKNRMDMAIAEKRLT 678 

I II I Ihimill :|| II I :| : I :: h: I II 

Db 611 TISGLTPGTSYVFLVRAENTQGISVPSGLSNTIKTIEADFDAASAN — DLSAARTLLT 666 

Qy 679 SEQLIKLEEVKTINSTAVRLFWKKRKL- -EELIDGYYIKWRGPPRTNDNQY- -VNVTSPS 734 

: ::l : - ll-llll I h ::| I :: II : I I 

Db 667 GKS-VELIDASAINASAVRLEWMLHVSADEKYVEGLRIHYR-DASVPSAQYHSITVMDAS 724 

Qy 735 TENYWSNLMPFTNYEFFVIPYHSGVHSIHGAPSNSMDVLTAEAPPSLPPEDVRIRMLNL 794 

h:M II :| Hlh h :| I llll II I II lh:::| I I 
Db 725 AESFWGNLKKYTKYEFFLTPF- ■ -FETIEGQPSNSKTALTYEDVPSAPPDNIQIGMYNQ 781 

Qy 795 TTLRISWKAPKADGINGILKGFQI-VIVGQAPNNNRNITTNERAASVTLFHLVTGMTYKI 853 

I :l|: III h:| II hll II I :| II I ; 
Db 782 TAGWVRWTPPPSQHHNGNLYGYKIEVSAGNTMRVLANMTLNATTTSVLLNNLTTGAVYSV 841 

Qy 854 RVAARSNGGVGVSHGTSEVIMNQDTLEKHLA AQQENESFLYGLINKSHVP- 903 

h : : I : h I I ■: I : I I ::| 

Db 842 RLNSFTRAGDGPYSKPISLFMD-PTHHVHPPRAHPSGTHDGRHEGQDLTYH--NNGNIPP 898

Qy 904 VIVIVAILIIFWIIIAYCYWRNSRNSDGKDRSFIKIND 942 

; hi : :h: : h h: : ::| 

Db 899 GDINPTTHKKTTDYLSGPWLMVLVCIVLLVLVISAAISMVYFKRKHQMTKELGHLSWSD 958 

Qy 943 GSVHMASNN----LWDVAQNPNQNPMYNTAGRMTMNNRNGQALYSLTPNAQDFFNNCDDY 998

: :| II : : : II : : II ::| :|| I 
Db 959 NEITALNINSRESLW IDHHRGWRTADTDRDSGLSESRLLSHVNSSQSNYNNSD" 1011 

Qy 999 SGTMHRPGSEHHYHYAQLTGGPGNAMSTFYG-NQYHDDPSPYATTTLV 1045 

II 1 lh: ::IM : hhlllll :: 

Db 1012 GGT — DYAEV--DTRNLTTFYNCRKSPDNPTPYATTMIIGTSSSETCTKTT 1058 

Qy 1046 .-- LSNQQPA- -WLNDKMLRAPAMPTN PVPPE--PPARYAD 1079

mil: hill llll lh 
Db 1059 SISADKDSGTHSPYSDAFAGQVPAVPWKSNYLQYPVEPINWSEFLPPPPEHPPPSSTYG 1118

Qy 1080 HTAG-RRSRSSRASDGRG TLNGGLHHRTSGSQRS DSPPHTDVSY 1122 

: I Ihl III II :| :|l : II I 

Db 1119 YAQGSPESSRKSSRSAGSGISTNQSILNASIHSSSSGGFSAWGVSPQYAVACPPENVYSN 1178 

Qy 1123 VQLHSSDGTGSSKERTGERRTPPNKTLMDFIPPPPSNPPPPGGHVYDTATRRQ 1175 

mi : m mi i i n h ihi 

Db 1179 PLSAVAGGTQNRYQITPTNQHPPQLPAY-FATTGPGGAVPP-NHL-PFATQRHAASEYQA 1235 

Qy 1176 -LNRGSTPREDTYDS VSDGAFARVDVNA— RPTSRNRNL— 1211 

II : :| I :l : I h III I : 

Db 1236 GLNAARCAQSRACNSCDALATPSPMQPPPPVPVPEGWYQPVHPNSHPMHPTSSNHQIYQC 1295

Qy 1212 GGRPLKGKRDDDSQRSSLMMDDDGGSSEADGENSEGDVP 1250 

I I : I ::: I |:: I : I 
Db 1296 SSECSDHSRSSQSHKRQLQLEEHGSSAKQRGGHHRRRAP 1334 
RESULT 6
O89026
ID   O89026     PRELIMINARY;   PRT;  1612 AA.
AC   O89026;
DT   01-NOV-1998 (TrEMBLrel. 08, Created)
DT   01-NOV-1998 (TrEMBLrel. 08, Last sequence update)
DT   01-OCT-2000 (TrEMBLrel. 15, Last annotation update)
DE   DUTT1 PROTEIN.
GN   ROBO1 OR DUTT1.
OS   Mus musculus (Mouse).
OC   Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC   Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
OX   NCBI_TaxID=10090;
RN   [1]
RP   SEQUENCE FROM N.A.
RC   TISSUE=BRAIN;
RA   Wu M.C., Lowe H., Fordham R., Rabbitts P.;
RT   "The mouse homologue of human DUTT1/H-robo1 gene: protein sequence and
RT   chromosomal location.";
RL   Submitted (JUL-1998) to the EMBL/GenBank/DDBJ databases.
DR   EMBL; Y17793; CAA76850.1; -.
DR   HSSP; P56276; 1TLK.
DR   MGD; MGI:1274781; Robo1.
DR   INTERPRO; IPR001777; -.
DR   INTERPRO; IPR003006; -.
DR   PFAM; PF00041; fn3; 3.
DR   PFAM; PF00047; ig; 5.
SQ   SEQUENCE   1612 AA;  176406 MW;  5F2988C544796B4B CRC64;

  Query Match             21.9%;  Score 1505.5;  DB 11;  Length 1612;
  Best Local Similarity   32.1%;  Pred. No. 1.7e-98;
  Matches 402;  Conservative 176;  Mismatches 498;  Indels 175;  Gaps 34;

Qy 30 PVIIEHPIDVWSRGSPATLNCGAR-PSTARITWYRDGQPVITNREQWSHRIVLDTGSL 88
I hill l::IM Hill |: I I III |; I |:|: l|:: :||| 
Db 29 PRIVEHPSDLIVSRGEPATLNCKAEGRPTPTIEWYKGGERVETDRDDPRSHRMLLPSGSL 88 

Qy 89 FLLKVNSGKNGKDSDAGAYYCVASNEHGEVKSNEGSLKLAMLREDFRVRPRTVQALGGEM 148 

I I:: I: : I I I III I II h lh:hlhlll I I II 
Db 89 FFLRIVHGRKSR-PDEGVTICVARNYLGEAVSHNASLEVAILRDDFRQNPSDVMVAVGEP 147

Qy 149 AVLECSPPRGFPEPWSWRKDDRELRIQDMPRYTLHSDGNLIIDPVDRSDSGTYQCVANN 208 

Ihll Nil III ilhll I I M :||:| I II I 

Db 148 AVMECQPPRGHPEPTISWKKDGSPLDDKD-ERITIRG-GKLMITYTRKSDAGKYVCVGTN 205 

Qy 209 MVGERVSNPARLSVFEKPKFEQEPRDMTVDVGAAVLFDCRVTGDPQPQITWKRKNEPMPV 268 

Mill I -I hi 1:1 I : I :: I I : I I III I : I:: : :| 
Db 206 MVGERESEVAELTVXERPSFVKRPSNLAVTVDDSAEFKCEARGDPVPTVRWRKDDGELPK 265 

Qy 269 TRAYIAKDNRGLRIERVQPSDEGEYVCYARNPAGTLEASAHLRVQAPPSFQTKPADQSVP 328
:| I :h hi :| I I I I I I I MM I II II I II II I 
Db 266 SR-YEIRDDHTLKIRKVTAGDMGSYTCVAENMVGRAEASATLTVQEPPHFWXPRDQWA 324 

Qy 329 AGGTATFECTLVGQPSPAYFWSKEGQQDLLFPSY--VSADGRTKVSPTGTLTIEEVRQVD 386

I I hi I I II II :ll hill II : I II II III h: ! 
Db 325 LGRTVTFQCEATGNPQPAIFWRREGSQNLLF-SYQPPQSSSRFSVSQTGDLTITNVQRSD 383



Qy 387 EGAYVCAGMNSAGSSLSKAALKATFETKGRVQKRKSKMGKQKQKNVQSIIKYLISAVTGN 446

I hi :| III ::H h II 

Db 384 VGYYICQTLNVAGSIITKAYLE VTDV 409

Qy 447 TPAKPPPTIEHGHQNQTLMVGSSAILPCQASGKPTPGISWLRDGLPIDITDSRISQHSTG 506

Mil I I III: I : II I hi I I I I :lh : Mil I :| 
Db 410 IADRPPPVIRQGPVNQTVAVDGTLILSCVATGSPAPTILWRKDGVLVSTQDSRIKQLESG 469

Qy 507 SLHIADLKKPDTGVYTCIAKNEDGESTWSASLTVEDHTSNAQFVRMPDPSNFPSSPTQPI 566

II Mil III I 1:1111 : I:: II lh 1 1 : 1 : : I 

Db 470 VLQIRYAKLGDTGRYTCTASTPSGEATWSAYIEVQEFGVPVQPPRPTDPNLIPSAPSRPE 529

Qy 567 IVNVTDTEVELHWNAPSTSGAGPITGYIIQYYSPDLGQTWFNIPDYVASTEYRIKGLKPS 626

::|: III III I I III: :| hi : I : : llllll: 
Db 530 VTDVSKNTVTLSWQPNLNSGATP-TSYIIEAFSHASGSSWQTAAENVKTETFAIKGLKPN 588



Qy 627 HSYMFVIRAENERGIGTPS-VSSALVTTSRPAAQVALSDRNRMDMAIAERRLTSEQLIRL 685 

hh:|| I II II :| : I I : I :| :: I 

Db 589 AIYLFLV1AANAYGISDPSQISDPVRTQDVPPTSQGVDHKQ VQRELGNWLHL 641 

Qy 686 EEVRTINSTAVRLFWKKRRLEELIDGYYIKWRGPPRTN-DNQYV--NVTSPSTENYWSN 742 

::|:;| : | : : | || | :| ;: ::::: | :|: : |: : 
Db 642 HNPTILSSSSVEVHWTVDQQSQYIQGYRILYRPSGASHGESEWLVFEVRTPTRNSWIPD 701 

Qy 743 LMPFTNYEFFVIPYHSGVHSIHGAPSNSMDVLTAEAPPSLPPEDVRIRML--NLTTLRIS 800 

I HI I: : || I M || || | : | | : :: 

Db 702 LRRGVNYEIRARPF- - -FNEFQGADSEIRFAKTLEEAPSAPPRSVTVSRNDGNGTAILVT 758 

Qy 801 WKAPKADGINGILRGFQIVIVGQAPNNNRNITTNERAASVTLFHLVTGMTYRIRVAARSN 860 

I: I I lh:: ::: :l : I I : II : II h I : III : 
Db 759 WQPPPEDTQNGMVQEYKVWCLGNETRYHINRTVDGSTFSWIPSLVPGIRYSVEVAASTG 818 

Qy 861 GGVGV * — SHGTSEVIMNQDTLERHLAAQQENESFLYGLINRSHVPVIVIVAI 910 

I II ! III :| :| : :: :|: h : I 
Db 819 AGPGVKSEPQFIQLDSHGNPVSPEDQVSLAQQISDWRQPAFIAGIGAACWI I 871 

Qy 911 LIIFWIIIAYCYWRNSRNSD GKDRSF IRINDGSVHMASN— NLWDVAQN 958 

!::! : : i II I III : I ::| I :::: 
Db 872 LMVFSIWLYRHRRKRNGLTSTYAGIRKVPSFTFTPTVTYQRGGEAVSSGGRPGLLNISE- 930 

Qy 959 PNQNPMYNTAGMMNNRNGQALYSLTPNAQDFFNNCDDYSGTMHRPGSEHHYHYAQLTG 1018 

II I II I :: I : :| II II : II 

Db 931 PATQPWLADTWPNTGNNHNDCSINCCTAGNGNSDSNLTTYS — RPADCIANYNNQLDN 986 

Qy 1019 GPGNAM-.--STFYG NQYHD DPSPYATTTLV— LSH 1048 

I I I! II h :: • MUM h III 

Db 987 RQTNLMLPESTVYGDVDLSNRINEMRTFNSPNLRDGRFVNPSGQPTPYATTQLIQANLSN 1046 

Qy 1049 QQPAKLNDKMLRAPAMPTNPVPPEPPARYADHTAGRRSRSSRASDGRGTL NGGL 1102 

I ': I I I :| : :: 1 1 : J h I 
Db 1047 NMNNGAGDSSERHWRPPGQQRPEVAPIQYNIMEQNRLNRDYRAND- • -TIPPTIPYNQSY 1103 

Qy 1103 HHRTSGSQRSDSPPHTDVSYVQLHSSDGTGSSRERTGERRTP--PNKTLM— DFIPPPP 1157 

I II I' I :|| : I III I : I I HIM 

Db 1104 DQNTGGSYNSSD RGSSTSGSQGHRKG-ARTPRAPRQGGMNWADLLPPPP 1151 

Qy 1158 SNPPP PGGHVY— DTATRRQLNRGSTP 1182 

-III I :| I INI 

Db 1152 AHPPPHSNSEEYNMSVDESYDQEMPCPVPPAPMYLQQDELQEEEDERGPTP 1202 



RESULT 7 
Q9Y6N7 

ID Q9Y6N7 PRELIMINARY; PRT; 1651 AA. 
Q9Y6N7; ■: 

01-KJV-1999 (TrEMBLrel. 12, Created) 
01-NOV-1999 (TrEMBLrel. 12, Last sequence update) 
01-OCT-200Q (TrEMBLrel. 15, Last annotation update) 
ROUNDABOUT 1. •: 



Homo sapiens (Human) . 

Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 
Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 
NCBI TaxID-9606; 

[1] 

SEQUENCE FROM N.'A. 
MEDLINE=9811724'9; PubMed=9458045; 

Kidd T., Brose K., Mitchell R.J., Fetter R.D., Tessier-Lavigne M,, 
Goodman C.S., Tear G.; 

"Roundabout controls axon crossing of the CNS midline and defines a 

novel subfamily of evolutionarily conserved guidance receptors . " ; 

Cell 92:205-215(1998). 

EMBL; AF040990; AAC39575.1; -, 

HSSP; P56276; 1TLR. 

INTERPRO; IPR001777; -. 

INTERPRO; IPR003006; ■. 

PFAM; PF00041; fn3; 3. 

PFAM; PF00047; ig; 5. 
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SQ SEQUENCE 1651 AA; 180928 MW; 9D98CD7CAB73074D CRC64; 



Query Match 21.9%; Score 1500.5; DB 4; Length 1651;

Best Local Similarity 32.1%; Pred. No. 3.9e-98; 

Matches 408; Conservative 166; Mismatches 480; Indels 219; Gaps 36; 

Qy 30 PVIIEHPIDVWSRGSPATLNCGAK-PSTAKITWYKDGQPVITNKEQVNSHRIVLDTGSL 88 

I hill I::lhl llllll h I I III |: I hh lll::| =111 
Db 68 PRIVEHPSDLIVSRGEPATLNCKAEGRPTPTIEWYKGGERVETDKDDPRSHRMLLPSGSL 127 

Qy 89 FLLKVNSGKNGKDSDAGAYYCVASNEHGEVKSHEGSLKLAMLREDFRVRPRTVQALGGEM 148 

II:: I: : III III I II I: ll::|:ll:lll I I II 
Db 128 FFLRIVHGRKSR-PDEGVYVCVARNYLGEAVSHNASLEVAILRDDFRQNPSDVMVAVGEP 186 

Qy 149 AMCSPPRGFPEPWSWRRDDRELRIQDMPRYTLHSDGNLIIDPVDRSDSGTYQCVANN 208 
11:11 llll III :||:ll I :| II: I hi :lhl I II I 

Db 187 AVMECQPPRGHPEPTISWKKDGSPLDDKD-ERITIRG-GKLMITYTRKSDAGKYVCVGTN 244

Qy 209 MVGERVSNPARLSVFEKPKFEQEPKDMTVDVGAAVLFDCRVTGDPQPQITWKRKNEPMPV 268
III I I hi hi I : I :: I I : I I III I : h: : :| 
Db 245 MVGERESEVAELTVLERPSFVKRPSNLAVTVDDSAEFRCEARGDPVPTVRWRXDDGELPK 304

Qy 269 TRAYIAKDNRGLRIERVQPSDEGEYVCYARNPAGTLEASAHLRVQAPPSFQTKPADQSVP 328 

:| I :|: hlhl llllll I llll I II II I II II I 
Db 305 SR-YEIRDDHTLKITOAGDMGSYTCVAENMVGKAEASATLTVQEPPHFVVKPRDQVVA 363 


Qy 329 AGGTATFECTLVGQPSPAYFWSREGQQDLLFPSY--VSADGRTRVSPTGTLTIEEVRQVD 386 

I I M llllll :H hill II : I II II III |:: I 
Db 364 LGRTVTFQCEATGNPQPAIFWRREGSQNLLF-SYQPPQSSSRFSVSQTGDLTITNVQRSD 422 

Qy 387 EGAYVCAGMNSAGSSLSKAALKATFETKGRVQKKRSKMGKQKQKNVQSIIKYLISAVTGN 446 

I hi :| III ::M h II 

Db 423 VGYYICQTLNVAGSIITRAYLE VTDV 448 

Qy 447 TPAKPPPTIEHGHQNQTLMVGSSAILPCQASGKPTPGISWLRDGLPIDITDSRISQHSTG 506 

:||| I I llh I : Ml hi I I I I :lh : MM I I 
Db 449 IADRPPPVIRQGPVNQTVAVDGTFVLSCVATGSPVPTILWRKDGYLVSIQDSRIKQLENG 508 

Qy 507 SLHIADLKKPDTGVYTCIAKNEDGESTWSASLTVEDHTSNAQFVRMPDPSNFPSSPTQPI 566 

II I III IMM Ihllll : I:: II lh lhh:| 

Db 509 VLQIRYAKLGDTGRYTCIASTPSGEATWSAYIEVQEFGVPVQPPRPTDPNLIPSAPSKPE 568 






Qy 567 IVNVTDTEVELHWNAPSTSGAGPITGYIIQYYSPDLGQTWFNIPDYVASTEYRIRGLKPS 626

: :|: III III II llh :| I :| : : I : lllllh
Db 569 VTDVSRNTVTLSWQPNLNSGATP-TSYIIEAFSHASGSSWQTVAENVKTETSAIRGLKPN 627

Qy 627 HSYMFVIRAENEKGIGTPSVSSALVTTSKPAAQVALSDKNKMDMAIAERRLTSEQLIRLE 686

|:|::|| I II II I I I II :| :: I :: I

Db 628 AIYLFLVRAANAYGISDPSQISDPVRT QDVLPTSQGVDHRQVQREL-GNAVLHLH 681

Qy 687 EVRTINSTAVRLFWKKRKLEELIDGYYIRWRGPPRTNDNQ — YVNVTSPSTENYWSN 742

::|::: : I : : I II I :| I I : I :|: : h :
Db 682 NPTVLSSSSIEVHWTVDQQSQYIQGYRILYR-PSGANHGESDWLVFEVRTPAKNSWIPD 740

Qy 743 LMPFTNYEFFVIPYHSGVHSIHGAPSNSMDVLTAEAPPSLPPEDVRIRML-NLTTLRIS 800

I III X- '• II I I I M lh I : I I : :l

Db 741 LRRGVNYEIKARPF— FNEFQGADSEIRFAKTLEEAPSAPPQGVTVSRNDGNGTAILVS 797

Qy 801 WRAPRADGINGILRGFQIVIVGQAPNKNRNITTNERAASVTLFHLVTGMTYRIRVAARSN 860

I: I I lh:: ::: :| : II : II : II |: I : III :
Db 798 WQPPPEDTQNGMVQEYRVWCLGNETRYHINKTVDGSTFSWIPFLVPGIRYSVEVAASTG 857

Qy 861 GGVGV SHGTSEVIMNQDTLEKHLAAQQENESFLYGLINRSHVPVIVIVAI 910

I II :|| :l :| : :: : :|: h : I
Db 858 AGSGVKSEPQFIQLDAHGNPVSPEDQVSLAQQISDWRQPAFIAGIGAACWI I 910

Qy 911 LIIFWIIIAYCYWRNSRNSD GKDRSF IRINDGSVHMASN— NLWDVAQN 958

I::| : : : II I I II : I ::| I ::::
Db 911 LMVFSIWLYRHRRRRNGLTSIYAGIRKVPSFTFTPTVTYQRGGEAVSSGGRPGLLNISE- 969

Qy 959 PNQNPMYNTAGRMTMNNRNGQALYSLTPNAQDFFNNCDDYSGTMHRPGSEHHYHYAQLTG 1018

II I II I :: I : :| II II Ml 



Db 970 PAAQPWLADTWPNTGNNHNDCSISCCTAGNGNSDSNLTTYS RPADCIANYNNQIDN 1025 

Qy 1019 GPGNAM---STFYG NQYHD DPSPYATTTLVLSN— 1048 

II II II h :: hlllll h II 

Db 1026 KQTNLMLPESTVYGDVDLSNRINEMRTFNSPNLRDGRFVNPSGQPTPYATTQLIQSNLSN 1085 

Qy 1049 QQ PAWLN— DKMLRAPAMPTNPVPPEPP-ARYADH 1080

llll I : III I I : 
Db 1086 NMNNGSGDSGEKHWKPLGQQKQEVAPVQYNIVEQNRLNKDYRANDTVPPT I PYNQSYDQK 1145 

Qy 1081 TAGRRSRSSRASDGRGTLNGGLHHRTSGSQRSDSPPHTDVSYVQLHSSDGTGSSKERTGE 1140 

I I : I hi Mill I I 

Db 1146 TGGSYNSSDRGSS TSGSQ GHKR — G 1168 

Qy 1141 RRTP--PNRTLM—DFIPPPPSNPPP PGGHVY — D 1169 

III I : '"I I :Mlh:lll I :| I 

Db 1169 ARTPKVPRQGGMNWADLLPPPPAHPPPHSNSEEYNISVDESYDQEMPCPVPPARMYLQQD 1228 

Qy 1170 TATRRQLNRGSTP 1182 

: I I' I I 
Db 1229 ELEEEEDERGPIP 1241 



RESULT 8
O55005

ID O55005 PRELIMINARY; PRT; 1651 AA.

AC O55005;

DT 01-JUN-1998 (TrEMBLrel. 06, Created) 

DT 01-JUN-1998 (TrEMBLrel. 06, Last sequence update) 

DT 01-OCT-2000 (TrEMBLrel. 15, Last annotation update) 

DE TRANSMEMBRANE RECEPTOR ROBO1.

OS Rattus norvegicus (Rat).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Rattus.

OX NCBI_TaxID=10116;

RN [1]

RP SEQUENCE FROM N.A.

RC TISSUE=SPINAL CORD;

RX MEDLINE=98117249; PubMed=9458045;

RA Kidd T., Brose K., Mitchell K.J., Fetter R.D., Tessier-Lavigne M.,

RA Goodman C.S., Tear G.; 

RT "Roundabout controls axon crossing of the CNS midline and defines a 

RT novel subfamily of evolutionarily conserved guidance receptors."; 

RL Cell 92:205-215(1998). 

DR EMBL; AF041082; AAC39960.1; -. 

DR HSSP; P56276; 1TLK. 

DR INTERPRO; IPR001777; -.

DR INTERPRO; IPR003006; -.

DR PFAM; PF00041; fn3; 3.

DR PFAM; PF00047; ig; 5.

KW Transmembrane.

SQ SEQUENCE 1651 AA; 180746 MW; FA2452DD46E186B7 CRC64;



Query Match 21.6%; Score 1483.5; DB 11; Length 1651;

Best Local Similarity 31.4%; Pred. No. 6.5e-97; 

Matches 395; Conservative 176; Mismatches 497; Indels 191; Gaps 33; 

Qy 30 PVIIEHPIDVVVSRGSPATLNCGAR-PSTARITWYKDGQPVITNREQVNSHRIVLDTGSL 88 

I hill h:lhl llllll h I I III h I hh Ml:: :||| 
Db 68 PRIVEHPSDLIVSKGEPATLNCKAEGRPTPTIEWYRGGERVETDRDDPRSHRMLLPSGSL 127 

Qy 89 FLLKVNSGRNGKDSDAGAYYCVASNEHGEVKSNEGSLKLAMLREDFRVRPRTVQALGGEM 148

I :: h ': III III I II h f I : : 1 : 1 1 : 1 1 1 I I II 
Db 128 FFLRIVHGRKSR-PDEGVYICVARNYLGEAVSHNASLEVAILRDDFRQNPSDVMVAVGEP 186 

Qy 149 AVLECSPPRGFPEPWSWRKDDKELRIQDMPRYTLHSDGNLIIDPVDRSDSGTYQCVANN 208 

IMM Hll',111 Mhll IMIM I hi MM I II I 
Db 187 AVMECQPPRGHPEPTISWRKDGSPLDDRD-ERITIRG-GKLMITYTRKSDAGKYVCVGTN 244 

Qy 209 MVGERVSNPARLSVFERPRFEQEPKDMTVDVGAAVLFDCRVTGDPQPQITWRRRNEPMPV 268

MM' M:M hi I : I :: I I : I I llll |:: : M 
Db 245 MVGERESRVADVTVLERPSFVRRPSNLAVTVDDSAEFRCEARGDPVPTFGWRKDDGELPK 304 
Qy 269 TRAYIAKDNRGLRIERVQPSDEGEYVCYARNPAGTLEASAHLRVQAPPSFQTKPADQSVP 328

:M :h hi :l I I I I I I I MM 1 II II I II II I 
Db 305 SR-YEIRDDHTLKIRKVTAGDMGSYTCVAENMVGKAEASATLTVOEPPHFWKPRDOWA 363 

Qy 329 AGGTATFECTLVGQPSPAYFWSKEGQQDLLFPSr-VSADGRTKVSPTGTLTIEEVRQVD 386 

I I I: I I || II :|| |:||| II : I II II II: I:: I 
Db 364 LGRTVTFQCEATGNPQPAIFWRREGSQNLLF-SYQPPQSSSRFSVSQTGDLTVTNVQRSD 422 

Qy 387 EGAYVCAGMNSAGSSLSKAALKATFETKGRVQKKKSKMGKQKQKNVQSIIKYLISAVTGN 446 

I hi :| III ::M h II 

Db 423 VGYYICQTLNVAGSIITKAYLE VTDV 448 

Qy 447 IPAKPPPTIEHGHQNQTLMVGSSAILPCQASGKPTPGISWLRDGLPIDITDSRISQHSTG 506 

:||| I I III: I : I I |:| I I I I :lh : Mil I =1 
Db 449 IADRPPPVIRQGPVNQTVAVDGTLTLSCVATGSPVPTILWRKDGVLVSTQDSRIKQLESG 508 

Qy 507 SLHIADLKKPDTGVYTCIAKNEDGESTWSASLTVEDHTSNAQFVRMPDPSNFPSSPTQPI 566 

II I III III I Ihllll : I:: II II: IhhH 

Db 509 VLQIRYAKLGDTGRYTCTASTPSGEATWSAYIEVQEFGVPVQPPRPTDPNLIPSAPSKPE 568 

Qy 567 IVNVTDTEVELHWNAPSTSGAGPITGYIIQYYSPDLGQTWFNIPDYVASTEYRIKGLRPS 626
M : :|: III III I I III: :| I :| : : I : : llllll:

Db 569 VTDVSKNTVTLLWQPNLNSGATP-TSYIIEAFSHASGSSWQTVAENVKTETFAIKGLKPN 627

Qy 627 HSYMFVIRAENEKGIGTPS-VSSALVTTSRPAAQVALSDKNKMDMAIAEKRLTSEQLIKL 685 

! |:|::M I II II :| : I I : I =1 :: I 

Db 628 AIYLFLVRAANAYGISDPSQISDPVKTQDVPPTTQGVDHRQ VQRELGNWLHL 680 



Qy 686 EEVKTINSTAVRLFWKRRRLEELIDGYYIKWRGPPRTN-DNQYV--NVTSPSTENYWSN 742
::|::| : I : : I II I :| :: ::::: l,-:|: : |: :
Db 681 HNPTILSSSSVEVHWTVDQQSQYIQGYRILYRPSGASHGESEWLVFEVRTPTKNSWIPD 740



Qy 743 LMPFTNYEFFVIPYHSGVHSIHGAPSNSMDVLTAEAPPSLPPEDVRIRML--NLTTLRIS 800 

I III h : II I I I II II I : II::: 
Db 741 LRKGVNYEIRARPF- - -FNEFQGADSEIKFAKTLEERPSAPPRSVTVSKNDGNGTAILVT 797 

Qy 801 WKAPRADGINGILKGFQIVIVGQAPNMRNITTNERAASVTLFHLVTGMTYKIRVAARSN 860

1:1 I II::: ::: :l : I I : II MM: III :
Db 798 WQPPPEDTQNGMVQEYKVWCLGNETRYHINRTVDGSTFSVVIPFLVPGIRYSVEVAASTG 857

Qy 861 GGVGV SHGTSEVIMNQDTLERHLAAQQENESFLYGLINKSHVPVIVIVAI 910 

I II III :| :| : :: : :|: I : I 

Db 858 AGPGVRSEPQFIQLDSHGNPVSPEDQVSLAQQISDWKQPAFIAG IGAAC 907 

Qy 911 LIIFWIIIAYCYWRHSRN SDGKDRSFIKI 940 

11:1 I I 1 1 11:1 
Db 908 WIILMVFSIWLYRHRKRRNGLSSTYAGIRKVPSFTFTPTVTYQRGGEAVSSGGRPGLLNI 967 

Qy 941 NDGSVHMASNNLWDVAQNPNQNPMYNTAGRMTMNNRNGQALYSLTPNAQ DFFNN 994
;: : : I II :| II : :|| :: :: I
Db 968 SEPATQPWLADTW PNTGNSHNDCSINCCTASNGNSDSNLTTYSRPADCIANYNNQ 1022

Qy 995 CDDYSGTMHRPGSEHHYHYAQLTGGPGNAMSTFYGNQYHD DPSPYATTTLVL 1046 

I: : I I I I: I I II I Mill! h 

Db 1023 LDNKQTNLMLPEST-VYGDVDLS-NKINEMKTFNSPNLKDGRFVNPSGQPTPYATTQLIQ 1080 

Qy 1047 SNQQPAWLN DRMLRAPAMPTNPVPPEPPARYADHTAGRRSRSSRASDGRGTL-* 1098 

:| I :| : I I I :| : :: MM |: 
Db 1081 ANLINNMNNGGGDSSEKHWKPPGQQRQEV- - -APIQYNIMEQNRLNKDYRAND- - -HLP 1134 

Qy 1099 — NGGLHHRTSGSQRSDSPPHTDVSYVQLHSSDGTGSSKERTGERRTP- - PNKTLM- - 1150

I I II I I :|| : I III I : I 

Db 1135 TIPYNHSYDQNTGGSYNSSD RGSSTSGSQGHKKG - ARTPKAPKQGGMNW 1182 

Qy 1151 -DFIPPPPSNPPP PGGHVY — DTATRRQLNRGST P 1182 

I :||||::lll I :| I : II II 

Db 1183 ADLLPPPPAHPPPHSNSEEYSMSVDESYDQEMPCPVPPARMYLQQDELEEEEAERGPTP 1241 



RESULT 9
Q9QZI3

ID Q9QZI3 PRELIMINARY; PRT; 1060 AA.

AC Q9QZI3;

DT 01-MAY-2000 (TrEMBLrel. 13, Created)
DT 01-MAY-2000 (TrEMBLrel. 13, Last sequence update)
DT 01-OCT-2000 (TrEMBLrel. 15, Last annotation update)
DE REPULSIVE GUIDANCE RECEPTOR (FRAGMENT).

OS Rattus norvegicus (Rat).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Rattus.
OX NCBI_TaxID=10116;
RN [1]

RP SEQUENCE FROM N.A.
RX MEDLINE=99200391; PubMed=10102268;

RA Brose K., Bland K.S., Wang K.H., Arnott D., Henzel W., Goodman C.S.,
RA Tessier-Lavigne M., Kidd T.;

RT "Slit proteins bind Robo receptors and have an evolutionary

RT conserved role in repulsive axon guidance.";

RL Cell 96:795-806(1999).

DR EMBL; AF182037; AAF04558.1; -.

DR HSSP; P56276; 1TLK.

DR INTERPRO; IPR001547; -.

DR INTERPRO; IPR001777; -.

DR INTERPRO; IPR003006; -.

DR PFAM; PF00041; fn3; 3.

DR PFAM; PF00047; ig; 5.

DR PRINTS; PR00014; FNTYPEIII.

DR PROSITE; PS00659; GLYCOSYL_HYDROL_F5; UNKNOWN_1.

KW Receptor.

FT NON_TER 1060 1060

SQ SEQUENCE 1060 AA; 116790 MW; C4BC8C11E8542DA4 CRC64;



Query Match 20.2%; Score 1385; DB 11; Length 1060;

Best Local Similarity 32.5%; Pred. No. 3.5e-90;

Matches 355; Conservative 167; Mismatches 425; Indels 144; Gaps 30; 

Qy 30 PVIIEHPIDVWSRGSPATLNCGAK — PSTARIT WYRDGQPVITNKEQVNSH 79 

I :| I :|:||:| II I : : 

Db 31 PKXVEQPSEVIVSRGKPNTPNWRQRGRPFPTIGKVQRMVRPGWDR TRDDSRVTQ 84 

Qy 80 RIVLDTGSLFLLRVNSGRNGRDSDAGAYYCVASNEHGEVRSNEGSLRLAMLREDFRVRPR 139 

:| :|||| |:: I: I I I I III I II I M:M:M:IM I 
Db 85 GCLLPSGSLFFLRIVHGRRSR-PDEGTYVCVARNYLGEAVSRNASLEVALLRDDFRQNPT 143 

Qy 140 TVQALGGEMAVLECSPPRGFPEPWSWRRDDRELRIQDMPRYTLHSDGNLIIDPVDRSDS 199 

I II IMM III! Ill : |:|| ::|| : I hi Ml: 

Db 144 DVWAAGEPAILECQPPRGHPEPTIYWRKD--KVRIDEREERISIRGGRLMISNTRKSDA 201 

Qy 200 GTYQCVANNMVGERVSNPARLSVFERPKFEQEPKDMTVDVGAAVLFDCRVTGDPQPQITW 259 

I I I! Il'/lll Ml |:|||:| I : I : I I hi Mill : I 
Db 202 GMYTCVGTNMVGERDSDPAELTVFERPTFLRRPINQWLEDEPAEFRCQVQGDPQPTVRW 261 

Qy 260 KRRNEPMPVTRAYIAXDNRGLRIERVQPSDEGEYVCYARNPAGTLEASAHLRVQAPPSFQ 319 

I: : M M ||: III:: Mil II! I I I MM I hill I 
Db 262 RRDDADLPRGRrYDIKDDYTLRIRRAISADEGTYVCIAENRVGKVEASATLTVRAPPQFV 320 

Qy 320 TKPADQSVPAGGTATFECTLVGQPSPAYFWSKEGQQDLLFPSY-VSADGRTRVSPTGTLT 378 

Mil [MM I I II II II! Mill: : | Hill II 
Db 321 VRPRDQIVAQGRTVTFPCETRGNPQPAVFWQKEGSQNLLFPNQPQQPNSRCSVSPTGDLT 380 

Qy 379 IEEVRQVDEGAYVCAGMNSAGSSLSKAALKATFETRGRVQRKKSKMGRQKQRNVQSIIRY 438 

I ;:; I I hi : III hll h 

Db 381 ITNIQRSDAGYYICQALTVAGSILAKAQLE 410 

Qy 439 LISAVTGNTPMPPPTIEHGHQNQTLMVGSSAILPCQASGRPTPGISWLRDGLPIDITDS 498 

II Ml | | llll I MM MM I I III:: I 
Db 411 — VTDVLTDRPPPIILQGPINQTLAVDGTALLKCRATG-PLPVISWLKEGFTFLGRDP 465 

Qy 499 RISQHSTGSLHIADLRRPDTGVYTCIARNEDGESTWSASLTVEDHTSNAQFVRMPDPSNF 558 

I : hi I :|: III II!:! : IMMII I I : I I : I :: 
Db 466 RATIQDQGTLQIKNLRISDTGTYTCVATSSSGETSWSAVLDVTE--SGATISRNYDTNDL 523 

Qy 559 PSSPTQPIIVNVTDTEVELHWNAPSTSGAGPITGYIIQYYSPDLGQTWFNIPDYVASTEY 618 
I |::| : :|| III III: llh :| : :| : ::! M I 
Db 524 PGPPSRPQVTDVTRNSWLSWQT-GTPGVLPASAYIIEAFSQSVSNSWQTVANHVRTTLY 582

Qy 619 RIKGLKPSHSYMFVIRAENEKGIGIPSVSSALVTTS-KPAAQVALSDKNKMDMAIAEKR 676 

::M:|: hlvll I :|: II I I I III :| :| 

Db 583 TVRGLRPNTIYLFMVRAINPQGLSDPSPMSDPVRTQDISPPAQ GVDHRQVQKE 635 

Qy 677 LTSEQLIKLEMTINSTAVRLFWRRRKLEELIDGYYIKWR- ■ -GPPRTNDNQYVNVTSP 733 

I : ::| ' I I : : I : : I 1 1 : : I I : I : : I 
Db 636 L-GDVTVRLHNPVMLTPTTVQVTWTVDRQPQFIQGYRVMYRQTSGLQASTVWQNLDAKVP 694 

Qy 734 STENYWSNLMPFTNYEFFVIPYHSGVHSIHGAPSNSMDVLTAEAPPSLPPEDVRIRML- 792

: : I!: II HI I II : I I I : I I II II:
Db 695 TERSAVLVNLKKGf YEIKVRPY--FNEFQGMDSESKTIRTTEEAPSAPPQSVTVLTVG 751

Qy 793 --NLTTLRISWKAPRADGINGILKGFQIVIVGQAPNNNRNITTNERAASVTLFHLVTGMT 850

I IS: :|| I*' =11 1 II III:: ::l :l : I I : II : I I:

Db 752 SHNSTSISVSWDP*ADHQNGIIQEYKIWCLGNETRFHINKTVDATIRSWIGGLFPGIQ 811

Qy 851 YKIRVAARSNGGVGVSH-GTSEVIM--NQDTLEKHLAAQQENESFLYGLINKS 900

W I:: Nil II I :||:: I :: |: |:

Db 812 YRVEVAASTSAGVG rKSEPQPIIIGGRNEWITENNNSITEQITDWKQPAFIAGIGGAC 871

Qy 901 HVPVIVIVAILIIFWIIIAYCYWRNSRNSDGKDRSFIKINDGSVHMASN NLW 953

I II: It: : III : I : I : II I
Db 872 WV ILKSfiSl WLYWRRKRRR-GLSNYAVTFQRGDGGLMSNGSRPGLLNTG 919

Qy 954 DVAQNPNQNPMYNTrAGRMTMNNRNGQALYSLTPNAQDFFNNCD DYSGTM 1002 

I H : : • I : :|| I II I I I : II 

Db 920 D — PNYPWLADSWPATSLPVNNSNS GPNEIGNFGRGDVLPPVPGQGDKTATM 969 

Qy 1003 HRPGSEHHYHYAQLTGGPGNAMSTFYGNQYHDDPSPYATTTLVLSNQ QPAW 1053 

I ll :|: : :||||| :: II II 
Db 970 LSDGA-IYSSIDFjT TRTTYNSSSQITQATPYATTQILHSNSIHELAVDLPDPQW 1022 

Qy 1054 LNDRMLRAPAM 1064

I

Db 1023 KSSVQQKSDLM 1033



RESULT 10 
Q9Z2I4

ID Q9Z2I4 PRELIMINARY; PRT; 1344 AA.
AC Q9Z2I4;

DT 01-MAY-1999 (TrEMBLrel. 10, Created)
DT 01-MAY-1999 (TrEMBLrel. 10, Last sequence update)
DT 01-OCT-2000 (TrEMBLrel. 15, Last annotation update)
DE RIG 4 PROTEIN.
GN RBIG1.
OS Mus musculus (Mouse).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
OX NCBI_TaxID=10090;
RN [1]
RP SEQUENCE FROM N.A.

RA Yuan S.-S.F., Cox L.A., Dasika G.K., Lee E.Y.-H.P.;
RL Submitted (APR-1998) to the EMBL/GenBank/DDBJ databases.
DR EMBL; AF060570; AAD11628.1; -.
DR HSSP; P56276; 1TLK.
DR MGD; MGI:1343102; Rbig1.
DR INTERPRO; IPR001777; -.
DR INTERPRO; IPR003006; -.
DR PFAM; PF00041; fn3; 3.
DR PFAM; PF00047; ig; 5.

SQ SEQUENCE 1344 AA; 143439 MW; 8B0060341C49CFEA CRC64;



Query Match 19.3%; Score 1323; DB 11; Length 1344; 

Best Local Similarity 29.2%; Pred. No. 1.4e-85; 

Matches 391; Conservative 189; Mismatches 539; Indels 220; Gaps 42; 

Qy 16 YINFDKIPNA — SNLAPVIIEHPIDVWSRGSPATLNCGA — KPSTAKITWYKDGQ 67 

I: I : : I 1:1 I MINI Nil II :|: I llhl 
Db 24 YLELPSSPGSRVGPEDAMPRIVEQPPDLWSRGEPATLPCRAEGRPRPN- - - IEWYKNGA 80 



Qy 68 PVITNKEQVNSHRIVLDTGSLFLLKVNSGKNGKDSDAGAYYCVASNEHGEVKSNEGSLKL 127 

I I =1 ; :||::| :|:|| : : I : : I I I 1 1 1 I I I II:: 

Db 81 RVATAREDPRAHRLLLPSGALFFPRIVHGRRSR • PDEGVYTCVARNYLGAAASRNASLEV 139 

Qy 128 AMLREDFRVRPRTVQALGGEMAVLECSPPRGFPEPWSWRKDDKELRIQDMPRYTLHSDG 187 

1:11:111 .1 I II 11:11 l|:| lll:|:|:| :|: I I: I 

Db 140 AVLRDDFRQSPGNVWAVGEPAVMECVPPKGHPEPLVTWKKGKIKLK-EEEGRITIRG-G 197 

Qy 188 NLIIDPVDRSDSGTYQCVANNMVGERVSNPARLSVFEKPKFEQEPKDMTVDYGAAVLFDC 247 

:||:| I IIIMI 1 1 1 I II I I : I I : I : I I I I I 

Db 198 KLMMSHTFKSDAGMYMCVASNMAGERESGAAELWLERPSFLRRPINQWLADAPVNFLC 257 



Qy 248 RVTGDPQPQITWKRKNEPMPVTRAYIAKDNRGLRIERVQPSDEGEYVCYARNPAGTLEAS 307

I Mill : ]:: : :l II : : I |::| III I I I I I III
Db 258 EVQGDPQPNLHWRKDDGELPAGR-YEIRSDHSLWIDQVSSEDEGTYTCVAENSVGRAEAS 316

Qy 308 AHLRVQAPPSFQTKPADQSVPAGGTATFECTLVGQPSPAYFWSKEGQQDLLFPSY-VSAD 366

I I II J III I MM I I II II III I I'M :

Db 317 GSLSVHVPPQFVTKPQNQTVAPGANVSFQCETRGNPPPAIFWQKEGSQVLLFPSQSLQPM 376

Qy 367 GRTKVSPTGTLTIEEVRQVDEGAYVCAGMNSAGSSLSKAALKATFETKGRVQKKKSKMGK 426

II III I I. I II: I I III :: III Ml I I II
Db 377 GRLLVSPRGQLNITEVKIGDGGYYVCQAVSVAGSILAKALL EIKG 421

Qy 427 QKQKNVQSIIKYLISAVTGNTPAKPPPTIEHGHQNQTLMVGSSAILPCQASGKPTPGISW 486

I II I I llll::IM III: Mill
Db 422 ASIDG LPPIILQGPANQTLVLGSSVWLPCRVIGNPQPNIQW 462

Qy 487 LRDGLPIDITDSRISQHSTGSLHIADLKRPDTGVYTCIAKNEDGESTWSASLTVEDHTSN 546

= 1 : "ll: : hllll ::: I I |:|:||: ||:||:: I ::
Db 463 KKDERWLQGDDSQFNLMDNGTLHIASIQEMDMGFYSCVAKSSIGEATWNSWLRKQEDW-G 521

Qy 547 AQFVRMPDPSNFPSSPTQPIIVNVTDTEVELHWNAPSTSGAGPITGYIIQYYSPDLGQTW 606

I I I 1:111: II : | | ill |:|: :| | tl

Db 522 ASPGPATGPSNPPGPPSQPIVTEVTANSITLTWKPNPQSGA-TATSYVIEAFSQAAGNTW 580

Qy 607 FNIPDYVASTEYRIKGLKPSHSYMFVIRAENEKGIGTPSVSSALVTTSRPAAQVALSDKN 666

: 1. 1. II Ihl: hl-ll |: II I II : I
Db 581 RTVADGVQLETYTISGLQPNTIYLFLVRAVGAWGLSEPSPVSEPVQTQDSSLSRPAEDPW 640

Qy 667 RMDMAIAERRLTSEQLIRLEEVRTINSTAVRLFWKRRRLEELIDGYYIKWRGPPRTNDN- 725

I :| • I ::::] : ::: I :|: |: : II :
Db 641 RGQRGLA- -> - - -EVAVRMQEPTVLGPRTLQVSWTVDGPVQLVQGFRVSWRIAGLDQGSW 694



Qy 726 QYVNVTSPSTENYWSNLMPFTNYEFFV-IPYHSGVHSIHGAPSNSMDVLTAEAPPSLPP 784

::: II |: II : I : |: III: I II II
Db 695 TMLDLQSPHRQSTVLRGLPPGAQIQIRVQVQGQEGL- • ■ -GAESPFVTRSIPEEAPSGPP 750

Qy 785 EDVRIRM--LNLTTLRISWRAPRADGINGILKGFQIVIVGQAPNNNRNITTNERAASVTL 842

: I : : 5 ::: =11: I II" :|| :| : I : I III
Db 751 QGVAVALGGDRNSSVTVSWEPPLPSQRNGVITEYQIWCLGNESRFHLNRSAAGWARSVTF 810

Qy 843 FHLVTGMTYRIRVAARSNGGVGVSHGTSEVIMN QDTLERHLAAQQEN 889

I: I h III :: lllh :: I" : I : II

Db 811 SGLLPGQIYRALVAAATSAGVGVA—SAPVLVQLPFPPAAEPGPEVSEGLAERLAKVLRR 868

Qy 890 ESFLYGLINRSHVPVIVIVAILIIFWIIIAYCYWRNSRNSDGKDRSFIKINDGSVHMAS 949

:M I ::: :| I I: I : II

Db 869 PAFLAGSSAACG ALLLGFCAALYRRQKQRKELSH YTAS 906

Qy 950 NNLWDVAQNPSQNPMYNTAGRMTMNNRNGQALY SLTPNAQDFFNNCDDYS 999

!l: : :: I I III :|:||: :|

Db 907 FAYTPAVSFPHSEGLSGSSSRPPMG- -LGPAAYPWLADSWPHPPRSPSAQEPRGSC ■ • - - 960

Qy 1000 GTMHRPGSEHHYHY AQLTGGP GNAMSTFYG — NQYH 1033

I M : I :| II I : Ihl

Db 961 -CPSNPDPDDRYYNEAGISLYLAQTARGANASGEGPVYSTIDPVGEELQTFHGGFPQHSS 1019



Qy 1034 DDPSPYATTTLVLSNQQPAWL NDRMLRAPA-MPT NPVPPEPPARYAD 1079 

III II 1:1 I II: :M II: 

Db 1020 GDPSTWS- — -QYAPPEWSEGDSGARGGQGRLLGKPVQMPSLSWPEALPPPPPSCELS 1073 
Qy 1080
Db 1074
Qy 1140
Db 1116
Qy 1187
Db 1176
Qy 1232
Db 1235



I! : I: I I :h 
■ -LEEWCPPVPEKSH- -LVGSSSSGACMVAPA 1115 



■ -NKTLMDFIPPPPSNPPPPGG- -HVYDTATRRQLNRG" 

:: Ml I II I:: I I 



■-STPREDT 1186 

I I : 



:| I II 



■-SRNRNLGGR-- 

I I : I 



■PLKGKRDDDSQRSSLMM 1231 

II I I :: : 



1:1 11:1 
■-PYRREHSPGDLP 1246 



RESULT 11 
Q9VPZ6

ID Q9VPZ6 PRELIMINARY; PRT; 859 AA.

AC Q9VPZ6;

DT 01-MAY-2000 (TrEMBLrel. 13, Created)

DT 01-MAY-2000 (TrEMBLrel. 13, Last sequence update)

DT 01-OCT-2000 (TrEMBLrel. 15, Last annotation update)

DE CG5423 PROTEIN (FRAGMENT).

GN CG5423.

OS Drosophila melanogaster (Fruit fly).

OC Eukaryota; Metazoa; Arthropoda; Tracheata; Hexapoda; Insecta;

OC Pterygota; Neoptera; Endopterygota; Diptera; Brachycera; Muscomorpha;

OC Ephydroidea; Drosophilidae; Drosophila.

OX NCBI_TaxID=7227;

RN [1]

RP SEQUENCE FROM N.A.

RC STRAIN=BERKELEY;

RX MEDLINE=20196006; PubMed=10731132;

RA Adams M.D., Celniker S.E., Holt R.A., Evans C.A., Gocayne J.D.,

RA Amanatides P.G., Scherer S.E., Li P.W., Hoskins R.A., Galle R.F.,

RA George R.A., Lewis S.E., Richards S., Ashburner M., Henderson S.N.,

RA Sutton G.G., Wortman J.R., Yandell M.D., Zhang Q., Chen L.X.,

RA Brandon R.C., Rogers Y.-H.C., Blazej R.G., Champe M., Pfeiffer B.D.,

RA Wan K.H., Doyle C., Baxter E.G., Helt G., Nelson C.R., Miklos G.L.G.,

RA Abril J.F., Agbayani A., An H.-J., Andrews-Pfannkoch C., Baldwin D.,

RA Ballew R.M., Basu A., Baxendale J., Bayraktaroglu L., Beasley E.M.,

RA Beeson K.Y., Benos P.V., Berman B.P., Bhandari D., Bolshakov S.,

RA Borkova D., Botchan M.R., Bouck J., Brokstein P., Brottier P.,

RA Burtis K.C., Busam D.A., Butler H., Cadieu E., Center A., Chandra I.,

RA Cherry J.M., Cawley S., Dahlke C., Davenport L.B., Davies P.,

RA de Pablos B., Delcher A., Deng Z., Mays A.D., Dew I., Dietz S.M.,

RA Dodson K., Doup L.E., Downes M., Dugan-Rocha S., Dunkov B.C., Dunn P.,

RA Durbin K.J., Evangelista C.C., Ferraz C., Ferriera S., Fleischmann W.,

RA Fosler C., Gabrielian A.E., Garg N.S., Gelbart W.M., Glasser K.,

RA Glodek A., Gong F., Gorrell J.H., Gu Z., Guan P., Harris M.,

RA Harris N.L., Harvey D., Heiman T.J., Hernandez J.R., Houck J.,

RA Hostin D., Houston K.A., Howland T.J., Wei M.-H., Ibegwam C.,

RA Jalali M., Kalush F., Karpen G.H., Ke Z., Kennison J.A., Ketchum K.A.,

RA Kimmel B.E., Kodira C.D., Kraft C., Kravitz S., Kulp D., Lai Z.,

RA Lasko P., Lei Y., Levitsky A.A., Li J., Li Z., Liang Y., Lin X.,

RA Liu X., Mattei B., McIntosh T.C., McLeod M.P., McPherson D.,

RA Merkulov G., Milshina N.V., Mobarry C., Morris J., Moshrefi A.,

RA Mount S.M., Moy M., Murphy B., Murphy L., Muzny D.M., Nelson D.L.,

RA Nelson D.R., Nelson K.A., Nixon K., Nusskern D.R., Pacleb J.M.,

RA Palazzolo M., Pittman G.S., Pan S., Pollard J., Puri V., Reese M.G.,

RA Reinert K., Remington K., Saunders R.D.C., Scheeler F., Shen H.,

RA Shue B.C., Siden-Kiamos I., Simpson M., Skupski M.P., Smith T.,

RA Spier E., Spradling A.C., Stapleton M., Strong R., Sun E.,

RA Svirskas R., Tector C., Turner R., Venter E., Wang A.H., Wang X.,

RA Wang Z.-Y., Wassarman D.A., Weinstock G.M., Weissenbach J.,

RA Williams S.M., Woodage T., Worley K.C., Wu D., Yang S., Yao Q.A.,

RA Ye J., Yeh R.-F., Zaveri J.S., Zhan M., Zhang G., Zhao Q., Zheng L.,

RA Zheng X.H., Zhong F.N., Zhong W., Zhou X., Zhu S., Zhu X., Smith H.O.,

RA Gibbs R.A., Myers E.W., Rubin G.M., Venter J.C.;

RT "The genome sequence of Drosophila melanogaster.";

RL Science 287:2185-2195(2000). 



DR EMBL; AE003586; AAF51388.1; -.
DR HSSP; P56276; 1TLK.
DR FLYBASE; FBgn0031328; CG5423.
DR INTERPRO; IPR001777; -.
DR INTERPRO; IPR003006; -.
DR PFAM; PF00041; fn3; 3.
DR PFAM; PF00047; ig; 5.
FT NON_TER 1 1

SQ SEQUENCE 859 AA; 93916 MW; 5CFD69D984101BF8 CRC64;



Query Match 18.2%; Score 1249; DB 5; Length 859;

Best Local Similarity 32.1%; Pred. No. 1.3e-80;

Matches

Qy 30
Db 1
Qy 89
Db 58
Qy 149
Db 90
Qy 209
Db 150
Qy 267
Db 210
Qy 327
Db 269
Qy 384
Db 329
Qy 443
Db 361
Qy 503
Db 413
Qy 561
Db 473
Qy 621
Db 533
Qy 670
Db 588
Qy 730
Db 641
Qy 786
Db 698
Qy 841



1:1 III I ! IIIMI I: I I Mllll I: 



i mi i i i 

-KILPGSHRITLPAGGL 57 



I III: I: 



: 1 : 1 1 : : 1 1 : I: : I: 
--RCAVLRDEFRLEPQNTRIAQGDT 89 



Mil: Mill M:| ::| : 



II I I: ,M III Mill IM I III III I M I 



I: I : :M Mil MM II I I M I Mil 



I I Ml :| I I II: 



I II I I: I 



■-GTLTIEEVR 383 

II: : 



Ml I I Ml M I I M I M III MM: |::: 
•-RPPPIIISGPVNQTLPIRSLATLQCKAIGLPSPTISWYRDGIPVQ-PSSKLNI 412 



:| I Ml :: I MIM : Mill I : 



:|J|:MII I II I :| I 



--PAAQVALSDKNKMD 669

:|: III 
3TGTSQLLLSD 587 



I I :i :::| I M II I : IMh I : M 
- -VETLLQANDWELLEANASDSTTARLSWDIDS-GQYIEGFYLYAR- • -ELHSSEYKM 640 



II I Ml Ml I II 



I :! IIM I III 



--VSHGTSEVIMNQDTLEKHLA 884 

: I : :::| : I 



Db 758 LLANLTTGVTYYIAVAAATRVGVGPFSKPAVLRIDARTQSLDTGYTRYPISRDIADDFL- 816

Qy 885 AQQENESFLYGLINKSHVPVIVIVAILI IFWI I IAYCYWRNSRNSDGKDRSFIKIND-G 943 

::: 1= Ml" : :: : I III I 

Db 817 — TQTWFIVLLGS IIAIIVFLLGALVLF RRYQFIKQISLG 854 

Qy 944 SVH 946 

1:1 

Db 855 SLH 857 



RESULT 12 
Q9VQ10 

ID Q9VQ10 PRELIMINARY; PRT; 823 AA. 

AC Q9VQ10; 

DT 01-MAY-2000 (TrEMBLrel. 13, Created)

DT 01-MAY-2000 (TrEMBLrel. 13, Last sequence update)

DT 01-OCT-2000 (TrEMBLrel. 15, Last annotation update)
DE CG5481 PROTEIN.
GN CG5481.

OS Drosophila melanogaster (Fruit fly).

OC Eukaryota; Metazoa; Arthropoda; Tracheata; Hexapoda; Insecta;

OC Pterygota; Neoptera; Endopterygota; Diptera; Brachycera; Muscomorpha;

OC Ephydroidea; Drosophilidae; Drosophila.

OX NCBI_TaxID=7227;

RN [1]

RP SEQUENCE FROM N.A.

RC STRAIN=BERKELEY;

RX MEDLINE=20196006; PubMed=10731132;

RA Adams M.D., Celniker S.E., Holt R.A., Evans C.A., Gocayne J.D.,

RA Amanatides P.G., Scherer S.E., Li P.W., Hoskins R.A., Galle R.F.,

RA George R.A., Lewis S.E., Richards S., Ashburner M., Henderson S.N.,

RA Sutton G.G., Wortman J.R., Yandell M.D., Zhang Q., Chen L.X.,

RA Brandon R.C., Rogers Y.-H.C., Blazej R.G., Champe M., Pfeiffer B.D.,

RA Wan K.H., Doyle C., Baxter E.G., Helt G., Nelson C.R., Miklos G.L.G.,

RA Abril J.F., Agbayani A., An H.-J., Andrews-Pfannkoch C., Baldwin D.,

RA Ballew R.M., Basu A., Baxendale J., Bayraktaroglu L., Beasley E.M.,

RA Beeson K.Y., Benos P.V., Berman B.P., Bhandari D., Bolshakov S.,

RA Borkova D., Botchan M.R., Bouck J., Brokstein P., Brottier P.,

RA Burtis K.C., Busam D.A., Butler H., Cadieu E., Center A., Chandra I.,

RA Cherry J.M., Cawley S., Dahlke C., Davenport L.B., Davies P.,

RA de Pablos B., Delcher A., Deng Z., Mays A.D., Dew I., Dietz S.M.,

RA Dodson K., Doup L.E., Downes M., Dugan-Rocha S., Dunkov B.C., Dunn P.,

RA Durbin K.J., Evangelista C.C., Ferraz C., Ferriera S., Fleischmann W.,

RA Fosler C., Gabrielian A.E., Garg N.S., Gelbart W.M., Glasser K.,

RA Glodek A., Gong F., Gorrell J.H., Gu Z., Guan P., Harris M.,

RA Harris N.L., Harvey D., Heiman T.J., Hernandez J.R., Houck J.,

RA Hostin D., Houston K.A., Howland T.J., Wei M.-H., Ibegwam C.,

RA Jalali M., Kalush F., Karpen G.H., Ke Z., Kennison J.A., Ketchum K.A.,

RA Kimmel B.E., Kodira C.D., Kraft C., Kravitz S., Kulp D., Lai Z.,

RA Lasko P., Lei Y., Levitsky A.A., Li J., Li Z., Liang Y., Lin X.,

RA Liu X., Mattei B., McIntosh T.C., McLeod M.P., McPherson D.,

RA Merkulov G., Milshina N.V., Mobarry C., Morris J., Moshrefi A.,

RA Mount S.M., Moy M., Murphy B., Murphy L., Muzny D.M., Nelson D.L.,

RA Nelson D.R., Nelson K.A., Nixon K., Nusskern D.R., Pacleb J.M.,

RA Palazzolo M., Pittman G.S., Pan S., Pollard J., Puri V., Reese M.G.,

RA Reinert K., Remington K., Saunders R.D.C., Scheeler F., Shen H.,

RA Shue B.C., Siden-Kiamos I., Simpson M., Skupski M.P., Smith T.,

RA Spier E., Spradling A.C., Stapleton M., Strong R., Sun E.,

RA Svirskas R., Tector C., Turner R., Venter E., Wang A.H., Wang X.,

RA Wang Z.-Y., Wassarman D.A., Weinstock G.M., Weissenbach J.,

RA Williams S.M., Woodage T., Worley K.C., Wu D., Yang S., Yao Q.A.,

RA Ye J., Yeh R.-F., Zaveri J.S., Zhan M., Zhang G., Zhao Q., Zheng L.,
RA Zheng X.H., Zhong F.N., Zhong W., Zhou X., Zhu S., Zhu X., Smith H.O.,
RA Gibbs R.A., Myers E.W., Rubin G.M., Venter J.C.;
RT "The genome sequence of Drosophila melanogaster.";
RL Science 287:2185-2195(2000).



DR EMBL; AE003586; AAF51373.1; -.
DR HSSP; P56276; 1TLK.
DR FLYBASE; FBgn0031341; CG5481.
DR INTERPRO; IPR001412; -.
DR INTERPRO; IPR001777; -.
DR INTERPRO; IPR003006; -.

DR PFAM; PF00041; fn3; 1.

DR PFAM; PF00047; ig; 5.

DR PRINTS; PR00014; FNTYPEIII.

DR PROSITE; PS00178; AA_TRNA_LIGASE_I; UNKNOWN_1.

SQ SEQUENCE 823 AA; 89715 MW; 36FC0B91F36F2F19 CRC64;



Query Match 16.3%; Score 1120.5; DB 5; Length 823;

Best Local Similarity 30.3%; Pred. No. 1.9e-71; 

Conservative 129; Mismatches 296; Indels 181; 



:| I : I I III: : I I hill: : I lllhl I II III 



Matches

Qy 37
Db 1
Qy 96
Db 58
Qy 156
Db 116
Qy 216
Db 176
Qy 276
Db 236
Qy 336
Db 296
Qy 392
Db 354
Qy 449
Db 379
Qy 508
Db 437
Qy 566
Db 497
Qy 626
Db 557
Qy 668
Db 611
Qy 703
Db 671
Qy 735
Db 731
Qy 749
Db 791



:|'ll I: I II I :| :|::| ||::||: I 



ll:|::|l 



III III :llll: : I 



III I 



:|| I INI 1:11 I I 



I I I :l : I:: I l|::|:l II: III I : |:l 



Mill I I I I: I I III I :| :| I 



II I I I. ':|| II II I I III :|: I 1:1:111 



I :|: II I: 



III II I 'MM I I :lll: I I Hi Ihlll: : I : I: 
-PPPIIEQGPVNQTLPVKSIWLPCRTLGTPVPQVSWYLDGIPIDVQEHERRNLSDAGA 436 



I 1:11:: ! 1 : 1 1 1 : 1 I :|:|:ll I :: I: I :| I |: I :| I :l 



I I I 



ll:|: : 



:! 1:11111 I: II I :| 



■ -MDMAIAEKRLTSEQLIKLEEVKTINSTAVRLFWK- ■ 
:|:| I I I :::l ::tl:::| I: 



■-GPPRTNDNQYVNVTSPS" 

I :| I : II I 



--TENYWSNLMPFTN 748 
: :: I: :| 



1:1::!:: . |: I llll I I 
RESULT 13
P97798

ID P97798 PRELIMINARY; PRT; 1493 AA.
AC P97798;

DT 01-MAY-1997 (TrEMBLrel. 03, Created)

DT 01-MAY-1997 (TrEMBLrel. 03, Last sequence update)

DT 01-OCT-2000 (TrEMBLrel. 15, Last annotation update)

DE NEOGENIN (NEOGENIN PROTEIN).

GN NEO1.

OS Mus musculus (Mouse).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
OX NCBI_TaxID=10090;
RN [1]

RP SEQUENCE FROM N.A.

RX MEDLINE=97407661; PubMed=9264410;

RA Keeling S.L., Gad J.M., Cooper H.M.;

RT "Mouse Neogenin, a DCC-like molecule, has four splice variants and is
RT expressed widely in the adult mouse and during embryogenesis.";
RL Oncogene 15:691-700(1997).
DR EMBL; Y09535; CAA70727.1; -.
DR HSSP; P02751; 1TTF.
DR MGD; MGI:1097159; Neo1.
DR INTERPRO; IPR000531; -.
DR INTERPRO; IPR001777; -.
DR INTERPRO; IPR003006; -.
DR PFAM; PF00041; fn3; 6.
DR PFAM; PF00047; ig; 4.
DR PRINTS; PR00014; FNTYPEIII.
DR PROSITE; PS00430; TONB_DEPENDENT_REC_1; UNKNOWN_1.
SQ SEQUENCE 1493 AA; 163159 MW; 441DE919D5E17C0E CRC64;



Query Match 9.8%; Score 674; DB 11; Length 1493; 

Best Local Similarity 21.8%; Pred. No. 3.8e-39;

Matches 334; Conservative 193; Mismatches 551; Indels 452; Gaps 64; 

Qy 36 PIDVWSRGSPATLNCGA-KPSTAKITWYRDGQPVITNKEQVNSHRIVLDTGSLFLLKV 93 

hi : III III I :|| II III I I : I :| 1 1 1 1 : I 
Db 70 PVDTLSVRGSSVILNCSAYSEPS-PNIEWKKDG-TFLNLES-DDRRQLLPDGSLFISNV 125 

Qy 94 NSGKNGRDSDAGAYYCVASNEH-GEVKSNEGSLKLAMLREDFRVRPRTVQALGGEMAVLE 152 

I: I III III: I : I I hi 

Db 126 VHSKHNK-PDEGFYQCVATVDNLGTIVSRTAKLTVAGLPR-FTSQPEPSSVYVGNSAILN 183 

Qy 153 CSPPRGFPEPWSWRKDDKELRIQDMPRYTLHSDGNLIIDPVDRSDSGTYQCVANNMVGE 212 

I I I I : I : I I I hi II hh : 

Db 184 CEVNADL-VPFVRWEQNRQPLLLDD--RIVKLPSGTLVISNATEGDGGLYRCIVESGGPP 240 

Qy 213 RVSNPARLSVFEKPK FEQEPKDMTVDVGAAVLFDCRVTGDPQPQ IT WKRKNEPM 266 

I: I II : I: | i | : : | |: | | | : | : | ; 
Db 241 KFSDEAELKVLQDPEEIVDLVFLMRPSSMMKVTGQSAVLPCVVSGLPAPWRWMRNEEVL 300

Qy 267 PVTRA-YIAKDNRGLRIERVQPSDEGEYVCYARNPAGTLEASAHLRVQAPPSFQTKPAD 324 

: : III I I I I I I hi! I I II II I :||: 
Db 301 DTESSGRLVLLAGGCLEISDVTEDDAGTYFCIADNGNKTVEAQAELTVQVPPGFLKQPAN 360 

Qy 325 QSVPAGGTATFECTLVGQPSPAYFWSKEGQQDLLFPSYVSADGRTKVSPTGTLTIEEVRQ 384 

III : 1:1:1 111::! |: I : : : 

Db 361 IYAHESMDIVFECEVTGKPTPTVKWVKNG- -DWIPS DNFKIVKEHNLQVLGLVK 413 

Qy 385 VDEGAYVCAGMNSAGSSLSKAAL KATFETKGRVQKKKSKMGK 426 

III I I I h: : I I III : 

Db 414 SDEGFYQCIAENDVGNAQAGAQLIILEHDVAIPTLPPTSLTSAT-TDHLAPATTGPLPS 471 

Qy 427 QKQKNVQSIIKYLISAVTGNTPAKPP PTIEHGHQN — 461 

: I :: Mill | : | 

Db 472 APRDWASLVSTRFIKLTWRTPASDPHGDNLTYSVFYTKEGVDRERVENTSQPGEMQVTI 531 

Qy 462 QTLMVGSSAILPCQASGK PTPGI-SWLRDGLPIDIT- 496 

111:111 III:: I :| 

Db 532 QNLMPATVYIFKVMAQNKHGSGESSAPLRVETQPEVQLPGPAPNIRAYATSPTSITVTWE 591 



Qy 497 - -DSRISQHSTGSLHIADLKKPDTGVYTCIAKNEDGE 531

I :| I I I III : :| h I
Db 592 TPLSGNGEIQNYKLYYMERGTDKEQDIDVSSH- - -SYTINGLKKYTEYSFRWAYNKHGP 648

Qy 532 STWSASLTVEDHTSNAQFVRMPDPSNFPSSPTQPIIVNVTDTE-VELHWNAP-STSGAGP 589

'■■ill: h II: I : : I ::: : :|| I lh I
Db 649 GV •; r - -STQDVAVRTLSDVPSAAPQNLSLEVRNSKSIVIHWQPPSSTTQNGQ 696



Qy 590 ITGYIIQY— : 597 

MM hi • 

Db 697 ITGYKIRYRKASRKSDVTETLVTGTQLSQLIEGLDRGTEYNFRVAALTVNGTGPATDWLS 756 

Qy 598 — YSPDLGQTWFNIPDYVASTE 617 

: 'II :h :| 

Db 757 AETFESDLDET - -RVPEVPSSLHVRPLVTSIWSWTPPENQNI WRGYAIGYGIGSPHAQ 814 

Qy 618 YRIRGLKPSHSYMFVIRAENERGIGTPSVSSALVTTSKPAAQVALSDKNK 667 

Th III h ::l I III lh ::| :| :: 
Db 815 TIKVDYKQRYYTIENLDPSSHYVITLKAFNNVGEGIPLYESAV--TRPH TDTSE 866 

Qy 668 MDMAIAERRLT SEQLI KL - EEVKT IKST AVRLFWKKRKL — EEL IDG — Y Y I KW 716 

:h : h :::::: :|: I I ::: I I ::| 
Db 867 VDLFVINAPYTPVPDPTPMMPPVGVQASILSHDTIRITWADNSLPRHQKITDSRYYTVRW 926 

Qy 717 RGPPRTN- - -DNQYVNVTSPSTENYWSNLMPFTNYEFFVI • - • PYHSGVHSI • ■ HGAPS 768 

:li ■■'■■\ I : M :hh I I I III h I h III 
Db ' 927 "-KTNIPANTKYRNANA-TTLSYLVTGLKPNTLYEFSVMVTKGRRSSTWSMTAHGA-- 979 

Qy 769 NSMDVLTAEAPPSLPPEDVRI--RMLNLTTLRISWKAPRADGINGILKGFQIVIVGQAPN 826 

! I h Ihll : : h ::h I II : h h I 
Db 980 TFELVPTSPPRDVTWSREGKPRTIIVNWQPPSE--ANGKITGY-IIYYSTDVN 1030 

Qy 827 NNRNITTNERAASVTLFHLVTGMT — YKIRVAARSNGGVGVSHGTSEVIMNQDTLEKH 882 

: I . I I : :| I :: lh: hi II : : I : 
Db 1031 AEIHDWVIEPWGNRLTHQIQELTLDTPYYFKIQARNSRGMG- - -PMSEAVQFR-TPKAD 1086 

Qy 883 LAAQQENESFLYGLINKSHVP VIVIVAIL 911 

: : I: I I :| MM :: 

Db 1087 SSDRMPNDQAIiGSAGRGSRLPDLGSDYKPPMSGSNSPHGSPTSPLDSNMLLVIIVSVGVI 1146 

Qy 912 IIFWIIIAYCYWRNSRNSDGKDRSFIRINDGSVHMASN NLW DV 955 

I IE : : 1 1 ., I : : I I: I Ml I Ml : 
Db 1147 TIVWWIAVFCTRRTTSHQKRKRAACRSVNGSHKYRGNCKDVKPPDLWIHHERLELKPI 1206 

Qy 956 AQNPNQNPMYNTAGRMTMNNRNGQALYSLTP-NAQDFFNNCDDYSGTMHRPGSEHHYHY 1013 

:M: lh I II I Ml hi :|: : : I 

Db 1207 DKSPDPNPVMTD TPIPRNSQ- - -DITPVDNSMD SNIHQRRNSYRGHE 1250 

Qy 1014 AQLTGGPGNAMSTFYGNQYHDDPSPYATTTLVLSNQQPAWLNDKMLRAPAMPTNPVPPEP 1073 

:: ::|;ll I : : lh II : MM 

Db 1251 SE DSMSTLAGRR GMRPKMM — MPFDSQPPQP 1279 



Qy 1074 PARYADHTAGRRSRSSRASDGRGTL NGGLHHRTSGSQRSDSPPH 1117 

I h II II I I ill::: :l 

Db 1280 VISAHPIHSLDNPHHHFHSSSLASPARSHLYHPSSPWPIGTSMSLSDRANSTESVRNTPS 1339 

Qy 1118 TDVSYVQ— - LHSSDGTGSSKERTGERRTP PNKTLMDFIPPPPS 1158 

II : II MM : I hill 

Db 1340 TDTMPASSSQTCCTDHQDPEGATSSSYLASSQEEDSGQSLPTAHVRPSHPLKSFAVPA" 1397 

Qy 1159 NPPPPGGHVYDTATRRQLNRGSTP REDTYDSVSDGAFARVDVNARPTSRNRNL 1211 

MM Ml I III I: II : : Ml 

Db 1398 -IPPPGPPLYDPAL PSTPLLSQQALEPSTFHSVKTASIGTLG-RSRPP 1443 

Qy 1212 GGRPLKGKRDDDSQRSSLMMDDDGGSSEAD 1241 

I: : I :: hM I I I 
Db 1444 "MPWVPSAPEVQETTRMLEDSESSYEPD 1471 



RESULT 14 
Q9V4J9 

ID Q9V4J9 PRELIMINARY; PRT; 2016 AA.
AC Q9V4J9; 

DT 01-MAY-2000 (TrEMBLrel. 13, Created) 






DT 01-MAY-2000 (TrEMBLrel. 13, Last sequence update)

DT 01-OCT-2000 (TrEMBLrel. 15, Last annotation update) 

DE CG17800 PROTEIN . 

GN CG17800. 

OS Drosophila melanogaster (Fruit fly) . 

OC Eukaryota; Metazoa; Arthropoda; Tracheata; Hexapoda; Insecta;

OC Pterygota; Neoptera; Endopterygota; Diptera; Brachycera; Muscomorpha; 

OC Ephydroidea; Drosophilidae; Drosophila. 

OX NCBI_TaxID=7227;

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=BERKELEY; 

RX MEDLINE=20196006; PubMed=10731132;

RA Adams M.D., Celniker S.E., Holt R.A., Evans C.A., Gocayne J.D.,
RA Amanatides P.G., Scherer S.E., Li P.W., Hoskins R.A., Galle R.F.,
RA George R.A., Lewis S.E., Richards S., Ashburner M., Henderson S.N.,
RA Sutton G.G., Wortman J.R., Yandell M.D., Zhang Q., Chen L.X.,
RA Brandon R.C., Rogers Y.-H.C., Blazej R.G., Champe M., Pfeiffer B.D.,
RA Wan K.H., Doyle C., Baxter E.G., Helt G., Nelson C.R., Miklos G.L.G.,
RA Abril J.F., Agbayani A., An H.-J., Andrews-Pfannkoch C., Baldwin D.,
RA Ballew R.M., Basu A., Baxendale J., Bayraktaroglu L., Beasley E.M.,
RA Beeson K.Y., Benos P.V., Berman B.P., Bhandari D., Bolshakov S.,
RA Borkova D., Botchan M.R., Bouck J., Brokstein P., Brottier P.,
RA Burtis K.C., Busam D.A., Butler H., Cadieu E., Center A., Chandra I.,
RA Cherry J.M., Cawley S., Dahlke C., Davenport L.B., Davies P.,
RA de Pablos B., Delcher A., Deng Z., Mays A.D., Dew I., Dietz S.M.,
RA Dodson K., Doup L.E., Downes M., Dugan-Rocha S., Dunkov B.C., Dunn P.,
RA Durbin K.J., Evangelista C.C., Ferraz C., Ferriera S., Fleischmann W.,
RA Fosler C., Gabrielian A.E., Garg N.S., Gelbart W.M., Glasser K.,
RA Glodek A., Gong F., Gorrell J.H., Gu Z., Guan P., Harris M.,
RA Harris N.L., Harvey D., Heiman T.J., Hernandez J.R., Houck J.,
RA Hostin D., Houston K.A., Howland T.J., Wei M.-H., Ibegwam C.,
RA Jalali M., Kalush F., Karpen G.H., Ke I., Kennison J.A., Ketchum K.A.,
RA Kimmel B.E., Kodira C.D., Kraft C., Kravitz S., Kulp D., Lai Z.,
RA Lasko P., Lei Y., Levitsky A.A., Li J., Li Z., Liang Y., Lin X.,
RA Liu X., Mattei B., McIntosh T.C., McLeod M.P., McPherson D.,
RA Merkulov G., Milshina N.V., Mobarry C., Morris J., Moshrefi A.,
RA Mount S.M., Moy M., Murphy B., Murphy L., Muzny D.M., Nelson D.L.,
RA Nelson D.R., Nelson K.A., Nixon K., Nusskern D.R., Pacleb J.M.,
RA Palazzolo M., Pittman G.S., Pan S., Pollard J., Puri V., Reese M.G.,
RA Reinert K., Remington K., Saunders R.D.C., Scheeler F., Shen H.,
RA Shue B.C., Siden-Kiamos I., Simpson M., Skupski M.P., Smith T.,
RA Spier E., Spradling A.C., Stapleton M., Strong R., Sun E.,
RA Svirskas R., Tector C., Turner R., Venter E., Wang A.H., Wang X.,
RA Wang Z.-Y., Wassarman D.A., Weinstock G.M., Weissenbach J.,
RA Williams S.M., Woodage T., Worley K.C., Wu D., Yang S., Yao Q.A.,
RA Ye J., Yeh R.-F., Zaveri J.S., Zhan M., Zhang G., Zhao Q., Zheng L.,
RA Zheng X.H., Zhong F.N., Zhong W., Zhou X., Zhu S., Zhu X., Smith H.O.,
RA Gibbs R.A., Myers E.W., Rubin G.M., Venter J.C.;

RT "The genome sequence of Drosophila melanogaster."; 

RL Science 287:2185-2195(2000). 

DR EMBL; AE003841; AAF59271.1; -. 

DR HSSP; P40189; 1BQU. 

DR FLYBASE; FBgn0033159; CG17800. 

DR INTERPRO; IPR000267; -. 

DR INTERPRO; IPR001777; -. 

DR INTERPRO; IPR003006; -. 

DR PFAM; PF00041; fn3; 6. 

DR PFAM; PF00047; ig; 10. 

DR PROSITE; PS00144; ASN_GLN_ASE_1; UNKNOWN_1.

SQ SEQUENCE 2016 AA; 222109 MW; 64A8DE3BB7BD0AB0 CRC64;



Query Match 9.6%; Score 658.5; DB 5; Length 2016; 

Best Local Similarity 26.0%; Pred. No. 7.7e-38;

Matches 229; Conservative 130; Mismatches 369; Indels 153; Gaps 

Qy 30 PVIIEHPIDVWSRGSPATLNC - -GAKPSTAKITWYKDGQPVITNKEQVNSHRIVLDTGS 87

III : : : I Mill :|:| II: : I : :: 

Db 429 PVIRQAFQEETMEPGPSVFLKCVAGGNP-TPEISWELDGKKIANNDRYQVGQYVTVNGDV 487 

Qy 88 LFLLKVNSGKNGKDSDAGAYYCVASNEHGEVKSNEGSLKLAMLREDFRVRPRTVQALGGE 147
: I : I :| I I 1:1 :; : | | || : : : : I! 



Db 488 VSYLNITS---VHANDGGLYKCIAKSKVGVA---EHSAKLNVYGLPYIRQMEKKAIVAGE 541

Qy 148 [sequence not legible]
Db 542 [sequence not legible]

Qy 207 [start not legible]-FEQEPKDMTVDVGAAVLFDCRV-TGDPQPQITWK 260
Db 599 [sequence not legible]

Qy 261 [start not legible]PVTRAYIAKDNRGLRIERVQPSDEGEYVCYARNPAGTLEASAHLRVQAP 315
Db 655 [sequence not legible]

Qy 316 [start not legible]EGQ-QDLLFPSYVSADGRT 369
Db 715 [start not legible]TPGEYKDLKKSDNIRVE--- 771

Qy 370 [sequence not legible]
Db 772 [start not legible]EGTLHVDNIQKTNEGYYLCEAINGIGSGLS-- 801

Qy 430 [sequence not legible]
Db 802 [start not legible]PPEFTEKLRNQTARRGEPAVLQCEAKGEKPIGILWNMN 848

Qy 490 [start not legible]ITDSRISQHSTGSLHIADLKKPDTGVYTCIAKNEDGESTWSASLTVE 541
Db 849 [sequence not legible]

Qy 542 [sequence not legible]
Db 909 [start not legible]VPEMPYALKVLDKSGRSVQLSWAQP-YDGNSPLDRYIIEFKRS- 951

Qy 602 LGQTWFNIPDYVA---STEYRIKGLKPSHSYMFVIRAENEKGIGTPSVSSALVTTSKPAA 658
Db 952 [sequence not legible]

Qy 659 [start not legible]SEQLIKLEEVKTINSTAVRLFWKKRKLEEL---IDGYYIK 715
Db 1006 [start not legible]PQNIKVEPV---NQTTMRVTWKPPPRTEWNGEILGYYVG 1048

Qy 716 [sequence not legible]
Db 1049 [sequence not legible]

Qy 770 [sequence not legible]
Db 1104 [start not legible]YAPSDEW 1160

Qy 830 [start not legible]--NERMSVTLFHLVTGMT-YKIRVAARSNGGVGV 865
Db 1161 [sequence not legible]



Q9NBA1
ID Q9NBA1 PRELIMINARY; PRT; 2016 AA.
AC Q9NBA1;
DT 01-OCT-2000 (TrEMBLrel. 15, Created)
DT 01-OCT-2000 (TrEMBLrel. 15, Last sequence update)
DT 01-OCT-2000 (TrEMBLrel. 15, Last annotation update)
DE DSCAM PRECURSOR.
OS Drosophila melanogaster (Fruit fly).
OC Eukaryota; Metazoa; Arthropoda; Tracheata; Hexapoda; Insecta;
OC Pterygota; Neoptera; Endopterygota; Diptera; Brachycera; Muscomorpha;
OC Ephydroidea; Drosophilidae; Drosophila.
OX NCBI_TaxID=7227;
RN [1]
RP SEQUENCE FROM N.A.
RA Schmucker D., Clemens J.C., Shu H., Worby C.A., Xiao J., Muda M.,
RA Dixon J.E., Zipursky S.L.;
RT "Drosophila Dscam is an Axon Guidance Receptor Exhibiting
RT Extraordinary Molecular Diversity.";

RL Cell 101:671-684(2000). 

DR EMBL; AF260530; AAF71926.1; -. 

KW Signal.

FT SIGNAL 1 28 POTENTIAL. 

SQ SEQUENCE 2016 AA; 222124 MW; 95CF95488F2AD36C CRC64;



Query Match 9.6%; Score 658.5; DB 5; Length 2016; 

Best Local Similarity 26.0%; Pred. No. 7.7e-38;

Matches 229; Conservative 130; Mismatches 369; Indels 153; Gaps 33; 

Qy 30 PVIIEHPIDVWSRGSPATLNC-GAKPSTAKITWYRDGQPVITNKEQVNSHRIVLDTGS 87 

III : : : I I I I I I :|:| lh : I : :: 

Db 429 PVIRQAFQEETMEPGPSVFLKCVAGGNP-TPEISWELDGKKIAHNDRYQVGQYVTVNGDV 487 

Qy 88 LFLLKVNSGRNGKDSDAGAYYCVASNEHGEVKSNEGSLKLAMLREDFRVRPRTVQALGGE 147 

: I : I :l II |:| :: I INI: : : : II 
Db 488 VSYLNITS — VHANDGGLYKCIAKSKVGVA — EHSAKLNVYGLPYIRQMEKKAIVAGE 541 

Qy 148 MAVLECSPPRGFPEPWSWRKDDKELRIQDMPRYTLHSDGNLIIDPVDR-SDSGTYQCVA 206 
k :: | | |:| : | :|:: || : : :| III: 1:1 II II III 

Db 542 TLIVTC-PVAGYPIDSIVWERDNRALPIN--RKQKVFPNGTLIIENVERNSDQATYTCVA 598

Qy 207 NNMVGERVSNPARLSVFEKPK FEQEPKDMTVDVGAAVLFDCRV-TGDPQPQITWK 260 

II. : | |: | : III =11 II :| I 

Db 599 KNQEGYSARGSLEVQVMVPPQVLPFSFGES— -AADVGDIASANCWPRGDLPLEIRWS 654 

Qy 261 RKNEPM PVTRAYIAKDNRGLRIERVQPSDEGEYVCYARNPAGTLEASAHLRVQAP 315 

: I: : I I I: : III Mil | | |:| | 

Db 655 LNSAPIVNGENGFTLVRLNKRTSLLNIDSLNAFHRGVYKCIATNPAGTSEYVAELQVNVP 714 

Qy 316 PSFQTRPADQSVPAGGTATFECTLVGQPSPAYFWSK EGQ-QDLLFPSYVSADGRT 369 

: I I !l IN I I: ill : : 

Db 715 PRWILEPTDRAFAQGSDARVECKADGFPKPQVTWRKAVGDTPGEYRDLRKSDNIRVE- • ■ 771 

Qy 370 RVSPTGTLTIEEVRQVDEGAYVCAGMNSAGSSLSKAALKATFETKGRVQRKRSKMGKQRQ 429 

III :: ::: :|| hi :| II II 
Db 772 — EGTLHVDS5QRTNEGYYLCEAINGIGSGLS 801 

Qy 430 RNVQSIIKYLISAVTGNTPARPPPTIEHGHQNQTLMVGSSAILPCQASGRPTPGISWLRD 489 

::| : I II :lll I hi hi h III: 

Db 802 — -AVIMISVQA PPEFTEKLRNQTARRGEPAVLQCEAKGEKP IGI LWNMN 848 

Qy 490 GLPID ITDSRISQHSTGSLHIADLKRPDTGVYTCIAKNEDGESTWSASLTVE 541 

: :| |i: :| II I :: h ::||:| II I :: h 

Db 849 NMRLDPRNDNRYTIREEILSTGVMSSLSIRRTERSDSALFTCVATNAFGSDDASINMIVQ 908 



Qy 542 DHTSNAQFVRMPDPSNFPSSPTQPIIVNVTDTEVELHWNAPSTSGAGPITGYIIQYYSPD 601

II ::: : hi I I I h III::
Db 909 E---------------VPEMPYALKVLDKSGRSVQLSWAQP-YDGNSPLDRYIIEFKRS- 951



Qy 602 LGQTWFNIPDYVA— STEYRIRGLRPSHSYMFVIRAENERGIGTPSVSSALVTTSKPAA 658 

:| I : :ll ::: I h :| I III III I h : I 
Db 952 -RASWSEIDRVIVPGHTTEAQVQRLSPATTYNIRIVAEN- -AIGTSQSSEAVTIIT- • -A 1005 

Qy 659 QVALSDRNKMDMAIAERRLTSEQLIRLEEVRTINSTAVRLFWRRRRLEEL- ■ -IDGYYIK 715 

: I I I I Ihl I I I :h II I I III: 

Db 1006 EEAPSGR PQNIKVEPV- - - NQTTMRVTWRPPPRTEWNGEILGYYVG 1048 

Qy 716 WRGPPRTNDNQYVNVTSPSTE NYWSNLMPFTNYEFFVIPYHSGVHSIHGAP-SN 769 

:: II : : II I : II =1 I : : I I I 
Db 1049 YR-LSNTNSSYVFETINFITEEGKEHNLELQNLRVYTQYSWI QAFNRIGAGPLSE 1103 

Qy 770 SMDVLTAEAPPSLPPEDVRIRMLNLTTLRISWKAPRADGINGILKGFQIVIVGQAPNNNR 829 

III II II I I hh I :| : lh:| :::| lh: 
Db 1104 EERQFTAEGTPSQPPSDTACTTLTSQTIRVGWVSPPLESANGVIKTYKW- ■ - YAPSDEW 1160 

Qy 830 NITT — NERAASVTLFHLVTGMT ■ YRIRVAARSNGGVGV 865 

I : 1:1 I: I : I I ::| I : || || 
Db 1161 YDETKRHYKRTASSDTVLHGLKRYTNYTMQVLATTAGGDGV 1201 



Search completed: January 22, 2001, 12:53:03 
Job time: 1984 sec

Mon Jan 22 13:04:37 2001 us-09-540-245a-18.rag Page 1 



GenCore version 4.5 
Copyright (c) 1993 - 2000 Compugen Ltd. 



OM protein - protein search, using sw model 



January 22, 2001, 12:18:45 ; Search time 233.01 Seconds 
(without alignments) 
242.281 Million cell updates/sec

Title: US-09-540-245A-18
Perfect score: 8724
1 MKWKHVPFLVMISLLSLSPN VLGGIERGEDI

Scoring table: BLOSUM62
Gapop 10.0 , Gapext 0.5

Searched: 268485 seqs, 34193795 residues

Total number of hits satisfying chosen parameters: 268485 

Minimum DB seq length: 0 
Maximum DB seq length: 2000000000 

Post-processing: Minimum Match 0% 
Maximum Match 100%
Listing first 45 summaries 

Database : A_Geneseq_36:*

1: /SIDS1/gcgdata/geneseq/geneseqp/AA1980.DAT:*
2: /SIDS1/gcgdata/geneseq/geneseqp/AA1981.DAT:*
3: /SIDS1/gcgdata/geneseq/geneseqp/AA1982.DAT:*
4: /SIDS1/gcgdata/geneseq/geneseqp/AA1983.DAT:*
5: /SIDS1/gcgdata/geneseq/geneseqp/AA1984.DAT:*
6: /SIDS1/gcgdata/geneseq/geneseqp/AA1985.DAT:*
7: /SIDS1/gcgdata/geneseq/geneseqp/AA1986.DAT:*
8: /SIDS1/gcgdata/geneseq/geneseqp/AA1987.DAT:*
9: /SIDS1/gcgdata/geneseq/geneseqp/AA1988.DAT:*
10: /SIDS1/gcgdata/geneseq/geneseqp/AA1989.DAT:*
11: /SIDS1/gcgdata/geneseq/geneseqp/AA1990.DAT:*
12: /SIDS1/gcgdata/geneseq/geneseqp/AA1991.DAT:*
13: /SIDS1/gcgdata/geneseq/geneseqp/AA1992.DAT:*
14: /SIDS1/gcgdata/geneseq/geneseqp/AA1993.DAT:*
15: /SIDS1/gcgdata/geneseq/geneseqp/AA1994.DAT:*
16: /SIDS1/gcgdata/geneseq/geneseqp/AA1995.DAT:*
17: /SIDS1/gcgdata/geneseq/geneseqp/AA1996.DAT:*
18: /SIDS1/gcgdata/geneseq/geneseqp/AA1997.DAT:*
19: /SIDS1/gcgdata/geneseq/geneseqp/AA1998.DAT:*
20: /SIDS1/gcgdata/geneseq/geneseqp/AA1999.DAT:*
21: /SIDS1/gcgdata/geneseq/geneseqp/AA2000.DAT:*



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
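
The Match percentages in the summary listing appear to be each hit's score expressed as a percentage of the query's perfect score (8724 for this query). The report does not state this relationship; it is an assumption inferred from the listed values, and the function name below is hypothetical. A minimal sketch that checks the assumption against a few rows of the listing:

```python
def query_match_percent(score: float, perfect_score: float) -> str:
    """Format a hit score as a percentage of the perfect score, to one decimal.

    Assumption: Match % = 100 * score / perfect score. This is inferred from
    the report's numbers, not documented in the report itself.
    """
    if perfect_score <= 0:
        raise ValueError("perfect_score must be positive")
    return f"{100.0 * score / perfect_score:.1f}"


PERFECT_SCORE = 8724  # perfect score reported for query US-09-540-245A-18

# (score, Match % as printed) pairs copied from the summary listing
reported = [
    (8724, "100.0"),
    (8704, "99.8"),
    (3736.5, "42.8"),
    (1592, "18.2"),
    (913, "10.5"),
]

for score, printed in reported:
    computed = query_match_percent(score, PERFECT_SCORE)
    print(f"score={score}: computed {computed}, printed {printed}")
```

A row whose computed and printed values disagree is a candidate for a digit misread during scanning rather than a real difference in the search output.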



No.  Score   Match  Length  DB  ID      Description

  1  8724    100.0    1651  20  Y13566  Human Robo 1 polyp
  2  8704     99.8    1649  20  Y08404  Human ROBO1 protei
  3  3736.5   42.8     753  20  W83927  Human T85 protein.
  4  1592     18.2    1395  20  Y13563  Drosophila Robo 1
  5  1592     18.2    1395  20  Y08401  Drosophila sp. ROB
  6  1500.5   17.2    1297  20  Y13565  C. elegans Robo po
  7  1500.5   17.2    1297  20  Y08403  C. elegans ROBO pr
  8  1498.5   17.2    1380  20  Y08402  Drosophila sp. ROB
  9  1498     17.2    1381  20  Y13564  Drosophila Robo 2
 10  913      10.5     434  20  Y13567  Human Robo 2 polyp
 11  913      10.5     434  20  Y08405  Human partial ROBO
 12  856       9.8     985  20  Y41716  Human PRO860 prote
 13  761       8.7    1257  20  W74152  Human L1 cell adhe
 14  708.5     8.1    1571  19  W42087  Human Down syndrom
 15  707.5     8.1    1910  19  W42086  Human Down syndrom
 16  702.5     8.1    1192  19  W57900  Protein of clone "
 17  700.5     8.0    1299  21  Y40439  Human Nr-CAM prote
 18  691       7.9     148  20  Y13568  Mouse Robo 1 polyp
 19  691       7.9     148  20  Y08406  Mouse partial ROBO
 20  682       7.8    1728  12  R13144  Deleted in Colorec
 21  681       7.8    1304  19  W59994  Human neural cell
 22  667.5     7.7    1447  16  R68553  Deleted in colorec
 23  667.5     7.7    1447  20  Y33498  Human DCC protein
 24  664.5     7.5    1028  19  W29667  Homo sapiens DL185
 25  644.5     7.4    3117  21  Y53667  Sequence gi/332818
 26  596.5     6.8    1018  17  R87028  Human contactin.
 27  595.5     6.8    1018  15  R63759  Human contactin (E
 28  581.5     6.7    1911  16  R71726  Human PTP-OB. Horn
 29  581.5     6.7    1911  18  W27225  Human protein tyro
 30  581.5     6.7    1911  20  W94027  Human protein tyro
 31  570       6.5    1018  18  W06485  Rat contactin liga
 32  569.5     6.5    4412  21  Y53666  Sequence gi/101742
 33  568       6.5    1242  19  W52287  Rattus norvegicus
 34  556       6.4    1496  20  W81030  Melanoma associate
 35  556       6.4    1496  21  Y70469  Human p53 target m
 36  551.5     6.3    1225  19  W52289  Homo sapiens cdo t
 37  549       6.3    1897  21  Y81785  Human protein tyro
 38  549       6.3    1897  21  Y56100  LAR tyrosine phosp
 39  521.5     6.0    1139  19  W37779  Rattus norvegicus
 40  515.5     5.9    1125  19  W52288  Rattus norvegicus
 41  507.5     5.8    1251  19  W37778  Rattus norvegicus
 42  506       5.8    1501  16  R72858  Rat receptor type-
 43  505       5.8    1070  18  W08747  Human colon carcin
 44  478.5     5.5     848  21  Y88565  Human NCAM 140kD i
 45  474       5.4    2387  21  Y53665  Mechanical stress


RESULT 1
Y13566

ID Y13566 standard; Protein; 1651 AA.
XX
AC Y13566;
XX
DT 30-JUL-1999 (first entry)
XX
DE Human Robo 1 polypeptde.
XX
KW Comm polypeptide; Robo polypeptide; commissureless; roundabout;
KW modulation; nerve cell function.
XX
OS Homo sapiens.
XX
PN WO9925833-A1.
XX
PD 27-MAY-1999.
XX
PF 13-NOV-1998; 98WO-US24327.
XX
PR 14-NOV-1997; 97US-0065543.
XX
PA (REGC ) UNIV CALIFORNIA.
XX
PI Goodman C, Kid T, Mitchell KJ, Russell C, Tear G;
XX
DR WPI; 1999-338008/28.
DR N-PSDB; X55770.
XX
PT Modulation of Robo-Comm polypeptide interactions
XX
PS Disclosure; Page 44-48; 56pp; English.
XX
CC The invention relates to a method for modulating the amount of Comm
CC (commissureless) polypeptide in contact with a cell expressing active
CC Robo (roundabout) on its surface. The method comprises modulating the
CC effective amount of Comm polypeptide in contact with the cell, where the
CC amount of expressed active Robo is specifically modulated inversely with
CC the modulation of the effective amount of Comm in contact with the cell.
CC The method is used to modulate the amount of active Robo expressed on a
CC cell. The method can be used to screen for agents that modulate Robo:Comm
CC interactions. This is particularly useful for modulating nerve cell
CC function.
XX 

SQ Sequence 1651 AA; 



Query Match 100.0%; Score 8724; DB 20; Length 1651;

Best Local Similarity 100.0%; Pred. No. 0;

Conservative 0; Mismatches 0; Indels 0; ( 



Matches 


[Rows Qy 1/Db 1 through Qy 781/Db 781: sequence text not legible]

Qy 841 FLVPGIRYSVEVAASTGAGSGVKSEPQFIQLDAHGNPVSPEDQVSLAQQISDWKOPAFI 900
Db 841 flvpgirysvevaastgagsgvksepqfiqldahgnpvspedqvslaqqisdvvkqpafi 900

Qy 901 AGIGMCWIILMVFSIWLYRHRKKRNGLTSTYAGIRKVPSFTFTPTVTYQR6GEAVSSGG 960
Db 901 agigaacwiilmvfsiwlyrhrkkrngltstyagirkvpsftftptvtyqrggeavssgg 960

Qy 961 RPGLLNISEPAAQPWLADTWPNTGNNHNDCSISCCTAGNGNSDSKLTTYSRPADCIANYN 1020
Db 961 rpgllnisepaaqpwladtwpntgnnhndcsiscctagngnsdsnlttysrpadcianyn 1020

Qy 1021 NQLDNKQTNLULPESTVYGDVDLSNKINEMKTFNSPNLKDGRFVNPSGQPTPYATTQLIQ 1080
Db 1021 nqldnkqtnlriilpestvygdvdlsnkinemktfnspnlkdgrfvnpsgqptpyattqliq 1080

Qy 1081 SNLSNNMNNGSGDSGEKHWKPLGQQKQEVAPVQYNIVEQNKLNKDYRANDTVPPTIPYNQ 1140
Db 1081 snlsnnmnngsgdsgekhwkplgqqkqevapvqyniveqnklnkdyrandtvpptipynq 1140

Qy 1141 SYDQNTGGSYNSSDRGSSTSGSQGHKKGARTPKVPKQGGMNWADLLPPPPAHPPPHSNSE 1200
Db 1141 sydqntggsynssdrgsstsgsqghkkgartpkvpkqggmnwadllppppahppphsnse 1200

Qy 1201 EYNISVDESYDQEMPCPVPPARMYLQQDELEEEEDERGPTPPVRGAASSPAAVSYSHQST 1260
Db 1201 eynisvdesydqempcpvpparmylqqdeleeeedergptppvrgaasspaavsyshqst 1260

Qy 1261 ATLTPSPQEELQPMLQDCPEETGHMQHQPDRRRQPVSPPPPPRPISPPHTYGYISGPLVS 1320
Db 1261 atltpspqeeiqpmlqdcpeetghmqhqpdrrrqpvsppppprpispphtygyisgplvs 1320

Qy 1321 DMDTDAPEEEEDEADMEVAKMQTRRLLLRGLEQTPASSVGDLESSVTGSMINGWGSASEE 1380
Db 1321 dmdtdapeeecdeadmevakmqtrrlllrgleqtpassvgdlessvtgsniingwgsasee 1380

Qy 1381 DNISSGRSSVSSSDGSFFTDADFAQAVAAAAEYAGLKVARRQMQDAAGRRHFHASQCPRP 1440
Db 1381 dnissgrssvsssdgsfftdadfaqavaaaaeyaglkvarrqmqdaagrrhfhasqcprp 1440

Qy 1441 TSPVSTDSNMSAAVMQKTRPAKKLRHQPGHLRRETYTDDLPPPPVPPPAIKSPTAQSKTQ 1500
Db 1441 tspvstdsnmsaavmqktrpakklkhqpghlrretytddlppppvpppaiksptaqsktq 1500

Qy 1501 LEVRPVWPKLPSMDARTDRSSDRKGSSYKGREVLDGRQWDMRTNPGDPREAQEQONDG 1560
Db 1501 levrpvwpkipsmdartdrssdrkgssykgrevldgrqwdmrtnpgdpreaqeqqndg 1560

Qy 1561 RGRGNRAAKRDLPPAKTHLIQEDILPYCRPTFPTSNNPRDPSSSSSMSSRGSGSRQREQA 1620
Db 1561 kgrgnkaakrdlppakthliqedilpycrptfptsnnprdpsssssmssrgsgsrqreqa 1620

Qy 1621 NVGRRNIAEMQVLGGYERGEDNNEELEETES 1651
Db 1621 nvgrrniaemqvlggyergednneeleetes 1651





RESULT 2 
Y08404 

ID Y08404 standard; Protein; 1649 AA. 
XX
AC Y08404;
XX
DT 24-JUL-1999 (first entry)
XX
DE Human ROBO1 protein.
XX
KW ROBO1; ROBO2; roundabout; nerve guidance; human; murine; cell function;
KW cell morphology; screening assay.
XX
OS Homo sapiens.
XX
PN WO9920764-A1.
XX
PF IO-OCT-1998; 98WO-US22164.
XX
PR 14-NOV-1997; 97US-0971172.
PR 20-OCT-1997; 97US-0062921.
XX
PA (REGC ) UNIV CALIFORNIA.
XX
PI Goodman CS, Kidd T, Mitchell KJ, Tear G;
XX
DR WPI; 1999-312615/26.
DR N-PSDB; X08404.
XX
PT Robo polypeptides, a new immunoglobulin superfamily member
XX
PS Claim 1; Page 65-71; 80pp; English.
XX
CC This invention describes novel Robo (roundabout) polypeptides, involved
CC in nerve guidance which have been isolated from Drosophila sp.,
CC C. elegans, human and murine samples. The products of the invention can
CC be used to raise anti-Robo antibodies, which can be used to modulate cell
CC function or morphology. The Robo polynucleotides and fragments are useful
CC as probes and primers and for production of the Robo polypeptides. The
CC probes and primers are also useful in screening assays.
XX
SQ Sequence 1649 AA;

Query Match 99.8%; Score 8704; DB 20; Length 1649;
Best Local Similarity 99.9%; Pred. No. 0;
Matches 1649; Conservative 0; Mismatches [not legible]; Indels 2; Gaps [not legible];

Qy 1 MKWKHVPFLVMISLLSLSPNHLFLAQLIPDPEDVERGNDHGTPIPTSDNDDNSLGYTGSR 60 

Db 1 mkwkhvpflvmisllslspnhlflaqlipdpedvergndhgtpiptsdnddnslgytgsr 60 

Qy 61 LRQEDFPPRIVEHPSDLIVSKGEPATLNCRAEGRPTPTIEWYKGGERVETDKDDPRSHRM 120 

Db 61 lrqedfpprivehpsdlivskgepatlnckaegrptptiewykggervetdkddprshrm 120 

Qy 121 LLPSGSIFFLRIVHGRKSRPDEGVYVCVMNYLGEAVSHNASLEVAILRDDFRQNPSDVM 180 

Db 121 llpsgslfflrivhgrksrpdegvyvcvarnylgeavshnaslevailrddfrqnpsdvm 180 

Qy 181 VAVGEPAVMECQPPRGHPEPTISWKKDGSPLDDKDERITIRGGKLMITYTRKSDAGKYVC 240

Db 181 vavgepavmecqpprghpeptiswkkdgsplddkderitirggklmitytrksdagkyvc 240 

Qy 241 VGTNMVGERESEVAELTVLERPSFVKRPSNLAVTVDDSAEFKCEARGDPVPTVRWRKDDG 300 

Db 241 vgtnmvgeresevaeltvlerpsfvkrpsnlavtvddsaefkceargdpvptvrwrkddg 300 

Qy 301 ELPKSRyEIRDDHTLKIRKVTAGDMGSYTCVAENMVGKAEASATLTVQEPPHFWKPRDQ 360 

Db 301 elpksryeirddhtlkirkvtagdmgsytcvaenmvgkaeasatltvqepphfwkprdq 360 

Qy 361 WALGRTVTFQCEATGNPQPAIFWRREGSQNLLFSYQPPQSSSRFSVSQTGDLTITNVQR 420 

Db 361 vvalgrtvtfqceatgnpqpaifwrregsqnllfsyqppqsssrfsvsqtgdltitnvqr 420 

Qy 421 SDVGYYICQTLNVAGSIITKAYLEVTDVIADRPPPVIRQGPVNQTVAVDGTFVLSCVATG 480 

Db 421 sdvgyyicqtlnvagsiitkaylevtdviadrpppvirqgpvnqtvavdgtfvlscvatg 480 

Qy 481 SPVPTILWRKDGVLVSTQDSRIKQLENGVLQIRYAKLGDTGRYTCIASTPSGEATWSAYI 540 

Db 481 spvptilwrkdgvlvstqdsrikqlengvlqiryaklgdtgrytciastpsgeatwsayi 540 

Qy 541 EVQEFGVPVQPPRPTDPNLIPSAPSKPEVTDVSRNTVTLSWQPNLNSGATPTSYIIEAFS 600 

Db 541 evqefgvpvqpprptdpnlipsapskpevtdvsrntvtlswqpnlnsgatptsyiieafs 600 



Qy 601 HASGSSWQTVAENVKTETSAIKGLKPNAIYLFLVRAANAYGISDPSQISDPVKTQDVLPT 660

Db 601 hasgsswqtvaenvktetsaikglkpnaiylflvraanaygisdpsqisdpvktqdvlpt 660

Qy 661 SQGVDHKQVQRELGNAVLHLHNPTVLSSSSIEVHWTVDQQSQYIQGYKILYRPSGANHGE 720

Db 661 sqgvdhkqvqrelgnavlhlhnptvlssssievhwtvdqqsqyiqgykilyrpsganhge 720

Qy 721 SDWLVFEVRTPAKNSWIPDLRRGVNYEIKARPFFNEFQGADSEIKFAKTLEEAPSAPPQ 780

Db 721 sdwlvfevrtpaknswipdlrkgvnyeikarpffnefqgadseikfaktleeapsappq 780

Qy 781 GVTVSKNDGNGTAILVSWQPPPEDTQNGMVQEYKVWCLGNETRYHINKTVDGSTFSWIP 840

Db 781 gvtvskndgngtailvswqpppedtqngmvqeykvwclgnetryhinktvdgstfswip 840

Qy 841 FLVPGIRYSVEVAASTGAGSGVKSEPQFIQLDAHGNPVSPEDQVSLAQQISDWKQPAFI 900

Db 841 flvpgirysvevaastgagsgvksepqfiqldahgnpvspedqvslaqqisdvvkqpafi 900

Qy 901 AGIGMCWIILMVFSIWLYRHRKKRNGLTSTYAGIRKVPSFTFTPTVTYQRGGEAVSSGG 960 

Db 901 agigaacwiilmvfsiwlyrhrkkrngltstyagirkvpsftftptvtyqrggeavssgg 960 

Qy 961 RPGLLNISEPAAQPWLADTWPNTGNNHNDCSISCCTAGNGNSDSNLTTYSRPADCIANYN 1020 

Db 961 rpgllnisepaaqpwladtwpntgnnhndcsiscctagngnsdsnlttysrpadcianyn 1020 

Qy 1021 NQLDNKQTNLMLPESTVYGDVDLSNKINEMKTFNSPNLKDGRFVNPSGQPTPYATTQLIQ 1080 

Db 1021 nqldnkqtnlmlpestvygdvdlsnkinemktfnspnlkdgrfvnpsgqptpyatt--iq 1078 

Qy 1081 SNLSNNMNNG3GDSGEKHWKPLGQQKQEVAPVQYNIVEQNKLNKDYRANDTVPPTIPYNQ 1140 

Db 1079 snlsnnmnngsgdsgekhwkplgqqkqevapvqyniveqnklnkdyrandtvpptipynq 1138 

Qy 1141 SYDQNTGGSYNSSDRGSSTSGSQGHKKGARTPKVPKQGGMNWADLLPPPPAHPPPHSNSE 1200 

Db 1139 sydqntggsynssdrgsstsgsqghkkgartpkvpkqggmnwadllppppahppphsnse 1198 

Qy 1201 EYNISVDESYDQEMPCPVPPARMYLQQDELEEEEDERGPTPPVRGAASSPAAVSYSHQST 1260 

Db 1199 eynisvdesydqempcpvpparmylqqdeleeeedergptppvrgaasspaavsyshqst 1258 

Qy 1261 ATLTPSPQEE'LQPMLQDCPEETGHMQHQPDRRRQPVSPPPPPRPISPPHTYGYISGPLVS 1320 

Db 1259 atltpspqeelqpmlqdcpeetghmqhqpdrrrqpvsppppprpispphtygyisgplvs 1318 

Qy 1321 DMDTDAPEEEEDEADMEVAKMQTRRLLLRGLEQTPASSVGDLESSVTGSMINGKGSASEE 1380 

Db 1319 dmdtdapeeeedeadmevakmqtrrlllrgleqtpassvgdlessvtgsmingwgsasee 1378 

Qy 1381 DNISSGRSSVSSSDGSFFTDADFAQAVAAAAEYAGLKVARRQMQDAAGRRHFHASQCPRP 1440 

Db 1379 dnissgrssvsssdgsfftdadfaqavaaaaeyaglkvarrqmqdaagrrhfhasqcprp 1438 

Qy 1441 TSPVSTDSNMSAAVMQKTRPAKKLRHQPGHLRRETYTDDLPPPPVPPPAIKSPTAQSKTQ 1500 

Db 1439 tspvstdsnmsaavmqktrpakklkhqpghlrretytddlppppvpppaiksptaqsktq 1498 

Qy 1501 LEVRPVWPKLPSMDARTDRSSDRKGSSYKGREVLDGRQWDMRTNPGDPREAQEQQNDG 1560 

Db 1499 levrpwvpkl'psitidartdrssdrkgssykgrevldgrqvvdmrtnpgdpreaqeqqndg 1558 

Qy 1561 KGRGNKAAKRDLPPAKTHLIQEDILPYCRPTFPTSNNPRDPSSSSSMSSRGSGSRQREQA 1620 

Db 1559 kgrgnkaakrdlppakthliqedilpycrptfptsnnprdpsssssmssrgsgsrqreqa 1618 

Qy 1621 NVGRRNIAEMQVLGGYERGEDNNEELEETBS 1651 

Db 1619 nvgrrniaemqvlggyergednneeleetes 1649 



RESULT 3
W83927 

ID W83927 standard; Protein; 753 AA.
XX
AC W83927;
XX
DT 01-MAR-1999 (first entry)
XX
DE Human T85 protein.
XX
KW T85; FMHB-6D4; FMHB-SD4; human; neurological disorder; therapy;
KW diagnosis.
XX
OS Homo sapiens.
XX
FH Key Location/Qualifiers
FT Peptide 1..20
FT /label= Sig_peptide
FT Protein 21..753
FT /label= Mat_protein
FT Region 525..610
FT /note= "has homology to a fibronectin type III
FT domain"
FT Region 638..727
FT /note= "has homology to a fibronectin type III
FT domain"
FT Region 43..101
FT /note= "has homology to a Ig superfamily domain"
FT Region 145..203
FT /note= "has homology to a Ig superfamily domain"
FT Region 237..298
FT /note= "has homology to a Ig superfamily domain"
FT Region 329..394
FT /note= "has homology to a Ig superfamily domain"
FT Region 433..491
FT /note= "has homology to a Ig superfamily domain"
FT Peptide 247..249
FT /note= "RGD motif"
FT Domain 516..600
FT /note= "cytokine receptor homology N-terminal
FT domain"
XX
PN WO9848051-A2.
XX
PD 29-OCT-1998.
XX
PF 17-APR-1998; 98WO-US07714.
XX
PR 10-OCT-1997; 97US-0062017.
PR 18-APR-1997; 97US-0044746.
XX
PA (MILL-) MILLENNIUM BIOTHERAPEUTICS INC.
XX
PI Holtzman D, McCarthy SA;
XX
DR WPI; 1999-024021/02.
DR N-PSDB; V69278.
XX
PT New isolated human FTHMA-070 and T85 proteins - used to develop
PT products for the diagnosis and therapy of disorders involving
PT cellular processes, e.g. neuronal development.
XX
PS Claim 31; Fig 3; 127pp; English.
XX
CC This is the amino acid sequence of a novel human protein designated
CC T85, and also referred to as FMHB-6D4 and FMHB-SD4. T85 cDNA (see
CC V69278) was identified in a human foetal brain cDNA library using a
CC screen designed to identify genes encoding proteins having a
CC functional signal sequence. T85 nucleic acids and polypeptides of
CC the invention are useful as modulating agents in regulating a
CC variety of cellular processes. They can be used for identifying
CC compounds which bind to or modulate the activity of the polypeptides
CC (claimed). They can also be used in screening assays, detection
CC assays (e.g. chromosomal mapping, tissue typing, forensic biology),
CC predictive medicine (e.g. diagnostic assays, prognostic assays,
CC monitoring clinical trials, and pharmacogenomics), and methods of
CC treatment (e.g.; therapeutic and prophylactic) e.g. for neurological
XX
SQ Sequence 753 AA;

Query Match 42.8%; Score 3736.5; DB 20; Length 753;
Best Local Similarity 99.4%; Pred. No. 1.6e-188;
Matches 716; Conservative 1; Mismatches 0; Indels 3; Gaps [not legible];

Qy 57 TGSRLRQEDFPPRIVEHPSDLIVSKGEPATLNCKAEGRPTPTIEWYKGGERVETDKDDPR 116
Db 18 sgsrlrqedfpprivehpsdlivskgepatlnckaegrptptiewykggervetdkddpr 77

Qy 117 SHRMLLPSGSLFFLRIVHGRKSRPDEGVYVCVARNYLGEAVSHNASLEVAILRDDFRQNP 176
[Rows Db 78 through Qy 714/Db 678: sequence text not legible]



RESULT 4
Y13563

ID Y13563 standard; Protein; 1395 AA.
XX

AC Y13563;
XX

DT 30-JUL-1999 (first entry)
XX

DE Drosophila Robo 1 polypeptde.

XX

KW Comm polypeptide; Robo polypeptide; commissureless; roundabout;
KW modulation; nerve cell function.
XX
OS Drosophila sp.
XX
PN WO9925833-A1.
XX
PD 27-MAY-1999.
XX
PF 13-NOV-1998; 98WO-US24327.
XX
PR 14-NOV-1997; 97US-0065543.
XX
PA (REGC ) UNIV CALIFORNIA.
XX
PI Goodman C, Kid T, Mitchell KJ, Russell C, Tear G;
XX
DR WPI; 1999-338008/28.
DR N-PSDB; X55767.
XX
PT Modulation of Robo-Comm polypeptide interactions
XX
PS Disclosure; Page 30-33; 56pp; English.
XX
CC The invention relates to a method for modulating the amount of Comm
CC (commissureless) polypeptide in contact with a cell expressing active
CC Robo (roundabout) on its surface. The method comprises modulating the
CC effective amount of Comm polypeptide in contact with the cell, where the
CC amount of expressed active Robo is specifically modulated inversely with
CC the modulation of the effective amount of Comm in contact with the cell.
CC The method is used to modulate the amount of active Robo expressed on a
CC cell. The method can be used to screen for agents that modulate Robo:Comm
CC interactions. This is particularly useful for modulating nerve cell
CC function.
XX
SQ Sequence 1395 AA;

Query Match 18.2%; Score 1592; DB 20; Length 1395;
Best Local Similarity 30.0%; Pred. No. 1.2e-75;
Matches 420; Conservative 187; Mismatches 483; Indels 308; Gaps 40;

Qy 68 PRIVEHPSDLIVSKGEPATLNCKAEGRPTPTIEWYKGGERVETDKDDPRSHRMLLPSGSL 127 

I llllllll 11 : 1 Mill: I I I :: :|||: |:| 
Db 56 priiehptdlvvkknepatlnckvegkpeptiewfkdgepvst--nekkshrvqfkdgal 113

Qy 128 FFLRIVHGRKSRPDEGVYVCVARNYLGEAVSHNASLEVAILRDDFRQNPSDVMVAVGEPA 187
II I : 1:1 f| | | llhl :|:MI : 1 1 1 : : 1 : 1 1 1 1 1 1 II II II I 
Db 114 ffyrtmqgkkeq-dggeywcvaknrvgqavsrhaslqiavlrddfrvepkdtrvakgeta 172 

Qy 188 VMECQPPRGHPEPTISWKKDGSPLDD KDERITI-RGGKLMITYTRKSDAGKYV 239

::H 11:1 Nil: I III Mil I: I II hh I I I 

Db 173 llecgppkgipeptliwikdgvplddlkamsfgassrvrivdggnllisnvepidegnyk 232

Qy 240 CVGTNMVGERESEVAELTVLERPSFVKRPSNLAVTVDDSAEFKCEARGDPVPTVRWRKDD 299 

I: hi III |:| :| ];| I : : :| I III I I |:|:: 
Db 233 ciaqnlvgtressyaklivqvkpyfmkepkdqvmlygqtatfhcsvggdpppkvlwkkee 292 

Qy 300 GELPKSRYEI-RDDHTLKIRKVTAGDMGSYTCVAENMVGKAEASATLTVQEPPHFVVKPR 358

Ml! I h :|:| :| I hll I I II: I |:| I Ihl :| 
Db 293 gnipvsrarilhdeksleisnitptdegtyvceahnnvgqisaraslivhappnftkrps 352 

Qy 359 DQVVALGRTVTFQCEATGNPQPAIFWRREGSQNLLFSYQPPQSSSRFSVSQTGDLTITNV 418

I I I I MM MM! Ml hi 1,1 I I: I I MM 
Db 353 nkkvglngvvqlpcmasgnpppsvfwtkegvstlmf---pnsshgrqyvaadgtlqitdv 409

Qy 419 QRSDVGYYICQTLNVAGSIITKAYLEVTDVIADRPPPVIRQGPVNQTVAVDGTFVLSCVA 478 

" I 111:1 M I : MM: | :||||;|: || |||: | | 
Db 410 rqedegyyvcsafsvvdsstvrvflqvssv-derpppiiqigpanqtlpkgsvatlpcra 468

Qy 479 TGSPVPTILWRKDGVLVSTQDSRIKQLENGVLQIRYAKLGDTGRYTCIASTPSGEATWSA 538 

Ihl I I I II I :| :: I:: M |:| III I I :|:| 
Db 469 tgnpsprikwfhdghavqa-gnrysiiqgsslrvddlqlsdsgtytctasgergetswaa 527 



Qy 539 YIEVQEFGVPVQPPRPTDPNLIPSAPSKPEVTDVSRNTVTLSW---QPNLNSGATPTSYI 595

: I:: I ." I lh |: I |:| Mil :::| II: I 
Db 528 tltvekpg-stslhraadpstypappgtpkvlnvsrtsislrwaksqekpgavgpiigyt 586 

Qy 596 IEAFSHASGSSWQTVAENVKTETSAIKGLKPNAIYLFLVRAANAYGISDPSQISDPVKTQ 655 

M II : I I I || I 1 : 1 1 1 1 1 I III II M: Ml 

Db 587 veyfspdlqtgwivaahrvgdtqvtisgltpgtsyvflvraentqgisvpsglsnvikti 646 

Qy 656 DV-LPTSQGVDHKQVQRELGNAVLHLHNPTVLSSSSIEVHWT--VDQQSQYIQGYKILYR 712

: : I : I : I : : :::|:: : I I :|::| :| |: 
Db 647 eadfdaasandlsaartlltgksvelidasainasavrlewmlhvsadekyveglrihyk 706

Qy 713 ----PSGANHGESDWLVFEVRTPAKNSVVIPDLRKGVNYEIKARPFFNEFQGADSEIKFA 768

II I' 1:1 I: MM || III ;l I I 

Db 707 dasvpsaqyhs- itvmdasaesfvvgnlkkytkyeffltpffetiegqpsnskta 760 

Qy 769 KTLEEAPSAPPQGVTVSKNDGNGTAILVSWQPPPEDTQNGMVQEYKV-WCLGNETRYHIN 827 

I I: lllll' : : llllllll II : lh II : I 
Db 761 ltyedvpsappdniqigmynqtagwvrwtpppsqhhngnlygykievsagntmkvlan 818

Qy 828 KTVDGSTFSWIPFLVPGIRYSVEVAASTGAGSGVKSEPQFIQLD AH — 874 

M: M lh: I I III :: I || | |:| : M I 
Db 819 mtlnatttsvllnnlttgavysvrlnsftkagdgpyskpislfmdpthhvhpprahpsgt 878 

Qy 875 • GNPVSPED-QVSLAQQISDWKQPAFIAGIGAACWIILMVFSIWL 918 

II : M : :: M : I |::::| : I 

Db 879 hdgrhegqdltyhnngn-ippgdinptthkkttdylsgp wlmvlvcivll 927 

Qy 919 -YRHRRKRNGLTSTYAGIRKVPSFTFTPTVTYQRGGEAVSSGGRPGLLNI 967 

III: :| : I :| I Ml 

Db 928 vlvisaaismvyfkrkhq mtkelghlsvvsdneitalni 966 

Qy 968 SEPAAQPWLADTWPNTGNNHNDCSISCCTAGNGNSDSNLTTYSRPADCIANYNNQLDNRQ 1027 

: : h ::| : :| hi I :: : :|||| 
Db 967 nskesl-wi-*- dhhrgwrtadtdkdsglseskllshvnssq--snynns 1010 

Qy 1028 TNLMLPESTVYGDVDLSNKINEMKTFNSPNLKDGRFVNPSGQPTPYATTQLIQSNLSNNM 1087 

I I Ml I It: lllllll :| :: I 

Db 1011 dggtdyaevdtrnlttfyncrkspd nptpyattmiigtsssetc 1054 

Qy 1088 NNGSGDSGERHWRPLGQQKQEVAPVQYNIVEQRKLNKDYRANDTVPPTIPYNQSYDQNTG 1147 

: I :P II 
Db 1055 tkttsisadk dsgth 1069 

Qy 1148 GSYNSSDRGSSTSGSQGHKKGARTPKVPKQGG MNWADLLPPPPAHPPPHS- 1197 

h : I ' : I II MM: Mill Mil I 

Db 1070 spysdafag qvpavpvvksnylqypvepinwseflppppehpppsst 1116

Qy 1198 : NSEEYNISVDES YDQEMPCPVPPARMY 1224 

llhl : II :| 

Db 1117 ygyaqgspessrkssksagsgistnqsilnasihssssggfsawgvspqyavacppenvy 1176 

Qy 1225 LQQDELEEEEDERGPTPPVRGAASSP-AAVSYSHQSTATLTPSPQEELQPMLQDCPEETG 1283 

hi Ml: I: MM III I 
Db 1177 snplsavaggtqnryqitptnqh-ppqlpayfattg 1211 

Qy 1284 HMQHQPDRRR QPVSPPPP-PRP-- 1304 

h I h MM I I 

Db 1212 pggavppnhlpfatqrhaaseyqaglnaarcaqsracnscdalatpspmqppppvpvpeg 1271 

Qy 1305 -ISPPHTYGYISGPLVSD 1321 

.11 :■.!!: 
Db 1272 wyqpvhpnshpmhptssn 1289 



RESULT 5 

Y08401

ID Y08401 standard; Protein; 1395 AA. 

XX 

AC Y08401; 

XX

DT 24-JUL-1999 (first entry) 
XX
DE Drosophila sp. ROBO1 protein.
XX
KW ROBO1; ROBO2; roundabout; nerve guidance; human; murine; cell function;
KW cell morphology; screening assay.
XX
OS Drosophila sp.
XX
PN WO9920764-A1.
XX
PD 29-APR-1999.
XX
PF 20-OCT-1998; 98WO-US22164.
XX
PR 14-NOV-1997; 97US-0971172.
PR 20-OCT-1997; 97US-0062921.
XX
PA (REGC ) UNIV CALIFORNIA.
XX
PI Goodman CS, Kidd T, Mitchell KJ, Tear G;
XX
DR WPI; 1999-312615/26.
DR N-PSDB; X57250.
XX
PT Robo polypeptides, a new immunoglobulin superfamily member
XX
PS Claim 1; Page 45-49; 80pp; English.
XX

CC This invention describes novel Robo (roundabout) polypeptides involved
CC in nerve guidance, which have been isolated from Drosophila sp.,
CC C. elegans, human and murine samples. The products of the invention can
CC be used to raise anti-Robo antibodies, which can be used to modulate cell
CC function or morphology. The Robo polynucleotides and fragments are useful
CC as probes and primers and for production of the Robo polypeptides. The
CC probes and primers are also useful in screening assays.

SQ Sequence 1395 AA;



Query Match 18.2%; Score 1592; DB 20; Length 1395; 

Best Local Similarity 30.0%; Pred. No. 1.2e-75; 

Matches 420; Conservative 187; Mismatches 483; Indels 308; Gaps 

Qy 68 PRIVEHPSDLIVSKGEPATLNCKAEGRPTPTIEWYKGGERVETDKDDPRSHRMLLPSGSL 127 

1 1 1 : M 1 : 1 1 : 1 I i 1 1 1 1 1 II Ihl llllhl II I I :: :|||: hi 
Db 56 priiehptdlvvkknepatlnckvegkpeptiewfkdgepvst--nekkshrvqfkdgal 113

Qy 128 FFLRIVHGRKSRPDEGVYVCVARNYLGEAVSHNASLEVAILRDDFRQNPSDVMVAVGEPA 187
II I : |:| : I I I III: :|:||l : 1 1 1 : : 1 : 1 1 1 ! 1 1 II II II I 
Db 114 ffyrtmqgkkeq-dggeywcvaknrvgqavsrhaslqiavlrddfrvepkdtrvakgeta 172

Qy 188 VMECQPPRGHPEPTISWKKDGSPLDD KDERITI-RGGKLMITYTRKSDAGKYV 239

. ::ll 11:1 llll: I III llll I: I II l:|: I I I 

Db 173 llecgppkgipeptliwikdgvplddlkamsfgassrvrivdggnllisnvepidegnyk 232

Qy 240 CVGTNMVGERESEVAELTVLERPSFVKRPSNLAVTVDDSAEFKCEARGDPVPTVRWRKDD 299

I: 1:11 III hi I :| |:| I : : :| II III I I hh: 
Db 233 ciaqnlvgtressyakllvqvkpyfmkepkdqvmlygqtatfhcsvggdpppkvlwkkee 292

Qy 300 GELPKSRYEI-RDDHTLKIRKVTAGDMGSYTCVAENMVGKAEASATLTVQEPPHFVVKPR 358

Mil I I: :|:| :| I hi I I I ||: I |:| I l|:| :| 
Db 293 gnipvsrarilhdeksleisnitptdegtyvceahnnvgqisaraslivhappnftkrps 352

Qy 359 DQVVALGRTVTFQCEATGNPQPAIFWRREGSQNLLFSYQPPQSSSRFSVSQTGDLTITNV 418

:: I I I I hill I::ll :|l M I I I I: I I 11 = 1 
Db 353 nkkvglngvvqlpcmasgnpppsvfwtkegvstlmf---pnsshgrqyvaadgtlqitdv 409

Qy 419 QRSDVGYYICQTLNVAGSIITKAYLEVTDVIADRPPPVIRQGPVNQTVAVDGTFVLSCVA 478

:: I III: :| I : :|:|: I :||||:|: II III; I I I 

Db 410 rqedegyyvcsafsvvdsstvrvflqvssv-derpppiiqigpanqtlpkgsvatlpcra 468

Qy 479 TGSPVPTILWRKDGVLVSTQDSRIKQLENGVLQIRYAKLGDTGRYTCIASTPSGEATWSA 538
11:1 I I I II I :| :: |:: :| M III II II :hl 



Db 469 tgnpsprikwfhdghavqa-gnrysiiqgsslrvddlglsdsgtytctasgergetswaa 527 

Qy 539 YIEVQEFGVPVQPPRPTDPNLIPSAPSKPEVTDVSRNTVTLSW---QPNLNSGATPTSYI 595

: I:: I . I II: I: I hi =111 :"l I I : I 
Db 528 tltvekpg-stslhraadpstypappgtpkvlnvsrtsislrwaksqekpgavgpiigyt 586 

Qy 596 IEAFSHASGSSWQTVAENVKTETSAIKGLKPNAIYLFLVRAANAYGISDPSQISDPVKTQ 655 

:| II :.■ II llll l:IMI I III II :|: :|| 
Db 587 veyfspdlqtgwivaahrvgdtqvtisgltpgtsyvflvraentqgisvpsglsnvikti 646 

Qy 656 DV-LPTSQGVDHKQVQRELGNAVLHLHNPTVLSSSSIEVHWT--VDQQSQYIQGYKILYR 712 

: : f : | : | : : :::|:: : I I :|::| :| h 
Db 647 eadfdaasandlsaartlltgksvelidasainasavrlewmlhvsadekyveglrihyk 706 

Qy 713 ----PSGANHGESDWLVFEVRTPAKNSVVIPDLRKGVNYEIKARPFFNEFQGADSEIKFA 768

II h I : I I: :hl II III :| I II 

Db 707 dasvpsaqyhs itvmdasaesfvvgnlkkytkyeffltpffetiegqpsnskta 760 


Qy 769 KTLEEAPSAPPQGVTVSKNDGNGTAILVSWQPPPEDTQNGMVQEYKV-WCLGNETRYHIN 827 

I I: Mil) : : I II I I III II : lh II : I 
Db 761 ltyedvpsappdniqigmynqtagwvrwtpppsqhhngnlygykievsagntmkvlan 818 

Qy 828 KTVDGSTFSWIPFLVPGIRYSVEVAASTGAGSGVKSEPQFIQLD AH 874 

I:: : lh: I I III : : I II I hi : :l II 
Db 819 mtlnatttsvilnnlttgavysvrlnsftkagdgpyskpislfmdpthhvhpprahpsgt 878 

Qy 875 - — GNPVSPED-QVSLAQQISDWKQPAFIAGIGAACWIILMVFSIWL 918 

- || : I I : :: :| : I h = ::l : I 

Db 879 hdgrhegqdltyhnngn-ippgdinptthkkttdylsgp wlmvlvcivll 927 

Qy 919 TYRHRKKRNGLTSTYAGIRKVPSFTFTPTVTYQRGGEAVSSGGRPGLLNI 967 

-III : :| : I :l I III 

Db 928 vlvisaaismvyfkrkhq mtkelghlswsdneitalni 966 

Qy 968 SEPAAQPWLADTWPNTGNNHNDCSISCCTAGNGNSDSNLTTYSRPADCIANYNNQLDNKQ 1027 

: : |: ■■ ::| : =1 hi I :: : :llll 
Db 967 nskesl-wi- dhhrgwrtadtdkdsglseskllshvnssq--snynns 1010 

Qy 1028 TNLMLPESTVYGDVDLSNKINEMKTFNSPNLKDGRFVNPSGQPTPYATTQLIQSNLSNNM 1087 

I I :M I lh ' lllllll :l :: I 

Db 1011 dggtdyaevdtrnlttfyncrkspd nptpyattmiigtsssetc 1054 

Qy 1088 NNGSGDSGEKHWKPLGQQKQEVAPVQYNIVEQNKLNKDYRANDTVPPTIPYNQSYDQNTG 1147

: I :|; I I 

Db 1055 tkttsisadk: dsgth 1069 

Qy 1148 GSYNSSDRGSSTSGSQGHKKGARTPKVPKQGG MNWADLLPPPPAHPPPHS- 1197 

I: : I ; : I II :|h: Mill llll I 

Db 1070 spysdafag qvpavpvvksnylqypvepinwseflppppehpppsst 1116

Qy 1198 * NSEEYNISVDES YDQEMPCPVPPARMY 1224 

III: : II : 

Db 1117 ygyaqgspessrkssksagsgistnqsilnasihssssggfsawgvspqyavacppenvy 1176 

Qy 1225 LQQDELEEEEDERGPTPPVRGAASSP-AAVSYSHQSTATLTPSPQEELQPMLQDCPEETG 1283 
: hi :lh h :lh III II 

Db 1177 snplsavaggtqnryqitptnqh-ppqlpayfattg 1211

Qy 1284 QPVSPPPP-PRP 1304

: | h llll I I

Db 1212 pggavppnhlpfatqrhaaseyqaglnaarcaqsracnscdalatpspmqppppvpvpeg 1271



Qy 1305 -ISPPHTYGYISGPLVSD 1321 

II I h 
Db 1272 wyqpvhpnshpmhptssn 1289 



RESULT 6 
Y13565 

ID Y13565 standard; Protein; 1297 AA.
XX
AC Y13565;
XX
DT 30-JUL-1999 (first entry)
XX
DE C. elegans Robo polypeptide.
XX
KW Comm polypeptide; Robo polypeptide; commissureless; roundabout;
KW modulation; nerve cell function.
XX
OS Caenorhabditis elegans.
XX
PN WO9925833-A1.
XX
PD 27-MAY-1999.
XX
PF 13-NOV-1998; 98WO-US24327.
XX
PR 14-NOV-1997; 97US-0065543.
XX
PA (REGC ) UNIV CALIFORNIA.
XX
PI Goodman C, Kid T, Mitchell KJ, Russell C, Tear G;
XX
DR WPI; 1999-338008/28.
DR N-PSDB; X55769.
XX
PT Modulation of Robo-Comm polypeptide interactions
XX
PS Disclosure; Page 38-39; 56pp; English.
XX
CC The invention relates to a method for modulating the amount of Comm
CC (commissureless) polypeptide in contact with a cell expressing active
CC Robo (roundabout) on its surface. The method comprises modulating the
CC effective amount of Comm polypeptide in contact with the cell, where the
CC amount of expressed active Robo is specifically modulated inversely with
CC the modulation of the effective amount of Comm in contact with the cell.
CC The method is used to modulate the amount of active Robo expressed on a
CC cell. The method can be used to screen for agents that modulate Robo:Comm
CC interactions. This is particularly useful for modulating nerve cell
CC function.
XX
SQ Sequence 1297 AA;



Query Match 17.2%; Score 1500.5; DB 20; Length 1297;

Best Local Similarity 32.1%; Pred. No. 7e-71;

Matches 408; Conservative 166; Mismatches 480; Indels 219; Gaps 36; 

Qy 68 PRIVEHPSDLIVSKGEPATLNCKAEGRPTPTIEWYKGGERVETDKDDPRSHRMLLPSGSL 127
! I hill h:!l:l HUM 1= I I III |: I |:|: M I : : I :||| 
Db 30 pviiehpidvvvsrgspatlncgak-pstakitwykdgqpvitnkeqvnshrivldtgsl 88

Qy 128 FFLRIVHGRKSR-PDEGVYVCVARNYLGEAVSHNASLEVAILRDDFRQNPSDVMVAVGEP 186

I.I:: I: : I I I III I II I: ll::|:||:||| I I II 
Db 89 fllkvnsgkngkdsdagayycvasnehgevksnegslklamlredfrvrprtvqalggem 148 

Qy 187 AVMECQPPRGHPEPTISWKKDGSPLDDKD-ERITIRG-GKLMITYTRKSDAGKYVCVGTN 244

Ihll llll III :H:II I :| I |: I |:| :||:| I II | 
Db 149 avlecspprgfpepvvswrkddkelriqdmprytlhsdgnliidpvdrsdsgtyqcvann 208

Qy 245 MVGERESEVAELTVLERPSFVKRPSNLAVTVDDSAEFKCEARGDPVPTVRWRKDDGELPK 304 

Mill I I hi hi I : I :: I I : I I III I : |:: : :| 
Db 209 mvgervsnparlsvfekpkfeqepkdmtvdvgaavlfdcrvtgdpqpqitwkrknepmpv 268 

Qy 305 SR-YEIRDDHTLKIRKVTAGDMGSYTCVAENMVGKAEASATLTVQEPPHFVVKPRDQVVA 363

:| I :|: hi :| I I I I I I I llll I II II I II II I 
Db 269 trayiakdnrglriervqpsdegeyvcyarnpagtleasahlrvqappsfqtkpadqsvp 328 

Qy 364 LGRTVTFQCEATGNPQPAIFWRREGSQNLLF-SYQPPQSSSRFSVSQTGDLTITNVQRSD 422 

I I Ihl II II II :N 1:111 II : I II II III |:: I 
Db 329 aggtatfectlvgqpspayfwskegqqdllfpsy-vsadgrtkvsptgtltieevrqvd 386 

Qy 423 VGYYICQTLNVAGSIITKAYLE VTDV 448 

I hi :l III ::H I: II 
Db 387 egayvcagmnsagsslskaalkatfetkgrvqkkkskmgkqkqknvqsiikylisavtgn 446 



Qy 449 IADRPPPVIRQGPVNQTVAVDGTFVLSCVATGSPVPTILWRKDGVLVSTQDSRIKQLENG 508 

:|ll I • I III: I : : I I I : I I I I t :lh : llll I I 
Db 447 tpakppptiehghqnqtlmvgssailpcqasgkptpgiswlrdglpiditdsrisqhstg 506 

Qy 509 VLQIRYAKLGDTGRYTCIASTPSGEATWSAYIEVQEFGVPVQPPRPTDPNLIPSAPSRPE 568 

II llll Mill 11:1111 : I:: II II: l|:|::| 
Db 507 slhiadlkkpdtgvytciaknedgestwsasltvedhtsnaqfvrmpdpsnfpssptqpi 566 

Qy 569 VTDVSRNTVTLSWQPNLNSGATP-TSYIIEAFSHASGSSWQTVAENVKTETSAIKGLKPN 627 

: :h I 1 I III II III: :l I :| : : I : llllll: 
Db 567 ivnvtdtevelhwnapstsgagpitgyiiqyyspdlgqtwfnipdyvasteyrikglkps 626 

Qy 628 AIYLFLVRAANAYGISDPSQISDPVRT QDVLPT SQGVDHKQVQREL - GNAVLHLH 681 

hl::|| 'i 1 1 1 1 I I I II :| :: I :: I 
Db 627 hsymfviraenekgigtpsvssalvttskpaaqvalsdknkmdmaiaekrltseqlikle 686 



Qy 682 NPTVLSSSSIEVHWTVDQQSQYIQGYKILYR-PSGANHGESDWLVFEVRTPAKNSVVIPD 740

::!:::«: I : : I II I :| I I : I :|: : |: : 
Db 687 evktinstavrlfwkkrkleelidgyyikwrgpprtndnq----yvnvtspstenyvvsn 742






Qy 741 LRKGVNYEIKARPF---FNEFQGADSEIKFAKTLEEAPSAPPQGVTVSKNDGNGTAILVS 797

I 111:1: : II I I I II II: I : I I : :| 

Db 743 lmpftnyeffvipyhsgvhsihgapsnsmdvltaeappslppedvrirml--nlttlris 800 

Qy 798 WQPPPEDTQNGMVQEYKVWCLGNETRYHINKTVDGSTFSVVIPFLVPGIRYSVEVAASTG 857

I: I I II::: ::: :| : I I : II : II |: I : III : 
Db 801 wkapkadgingilkgfqivivgqapnnnrnittneraasvtlfhlvtgmtykirvaarsn 860 

Qy 858 AGSGVKSEPQFIQLDAHGNPVSPEDQVSLAQQISDWKQPAFIAGIGAACWI I 910 

I II' , :|| :| :| : :: : :|: |: : I 
Db 861 ggvgv -----shgtsevimnqdtlekhlaaqqenesflyglinkshvpvivivai 910 

Qy 911 LMVFSIWLYRHRKRRNGLTSTYAGIRRVPSFTFTPTVTYQRGGEAVSSGGRPGLLNISE- 969 

I::| : : : II I III : | ::| | :;;; 
Db 911 liifwiiiaycyvrnsrnsd — gkdrsf ikindgsvhmasn---nlwdvaqn 958 

Qy 970 PAAQPWLADTWPNTGNNHNDCSISCCTAGNGNSDSNLTTYS — RPADCIANYNNQLDN 1025 

II Ml ' I : :| || || : I 

Db 959 pnqnpmyntagrmtmnnrngqalysltpnaqdffnncddysgtmhrpgsehhyhyaqltg 1018 

Qy 1026 RQTNLMLPESTVYGDVDLSNKINEMKTFNSPNLKDGRFVNPSGQPTPYATTQLIQSNLSN 1085 

I I H II I: :: hlllll I: II 

Db 1019 gpgnam— stfyg nqyhd dpspyatttlvlsn-- 1048 

Qy 1086 NMNNGSGDSGEKHWKPLGQQKQEVAPVQYNIVEQNRLNKDYRANDTVPPTIPYNQSYDQN 1145 

llll I : III I I : 
Db 1049 qq pawln---dkmlrapamptnpvppepp--aryadh 1080 

Qy 1146 TGGSYNSSDRGSS TSGSQ GHKK — G 1168 

II : I I" I MM I I 

Db 1081 tagrrsrssrasdgrgtlngglhhrtsgsqrsdspphtdvsyvqlhssdgtgsskertge 1140 

Qy 1169 ARTPKVPKQGGMNWADLLPPPPAHPPPHSNSEEYNISVDESYDQEMPCPVPPARMYLQQD 1228 

III I : -I I :MM::IM | M I 

Db 1141 rrtp- -pnktim- - -df ippppsnppp pgghvy---d 1169 

Qy 1229 ELEEEEDERGPTP 1241 

: II If 
Db 1170 tatrrqlnrgstp 1182 



RESULT 7
Y08403 

ID Y08403 standard; Protein; 1297 AA.
XX 

AC Y08403; 

XX

DT 24-JUL-1999 (first entry) 

XX 

DE C. elegans ROBO protein.
XX

KW ROBO1; ROBO2; roundabout; nerve guidance; human; murine; cell function;
KW cell morphology; screening assay.
XX 

OS Caenorhabditis elegans. 
XX

PN WO9920764-A1.
XX 

PD 29-APR-1999. 
XX 

PF 20-OCT-1998; 98WO-US22164. 



XX
PR 14-NOV-1997; 97US-0971172.
PR 20-OCT-1997; 97US-0062921.
XX
PA (REGC ) UNIV CALIFORNIA.
XX
PI Goodman CS, Kidd T, Mitchell KJ, Tear G;
XX
DR WPI; 1999-312615/26.
DR N-PSDB; X57252.
XX
PT Robo polypeptides, a new immunoglobulin superfamily member
XX
PS Claim 1; Page 59-63; 80pp; English.
XX

CC This invention describes novel Robo (roundabout) polypeptides involved
CC in nerve guidance, which have been isolated from Drosophila sp.,
CC C. elegans, human and murine samples. The products of the invention can
CC be used to raise anti-Robo antibodies, which can be used to modulate cell
CC function or morphology. The Robo polynucleotides and fragments are useful
CC as probes and primers and for production of the Robo polypeptides. The
CC probes and primers are also useful in screening assays.

SQ Sequence 1297 AA;



Query Match 17.2%; Score 1500.5; DB 20; Length 1297; 

Best Local Similarity 32.1%; Pred. No. 7e-71;

Matches 408; Conservative 166; Mismatches 480; Indels 219; Gaps 36; 

Qy 68 PRIVEHPSDLIVSKGEPATLNCKAEGRPTPTIEWYKGGERVETDKDDPRSHRMLLPSGSL 127

I hill l::ll:l Mill |: I I III |: I |:|: ll|::| :||| 
Db 30 pviiehpidvvvsrgspatlncgak-pstakitwykdgqpvitnkeqvnshrivldtgsl 88

Qy 128 FFLRIVHGRKSR-PDEGVYVCVARNYLGEAVSHNASLEVAILRDDFRQNPSDVMVAVGEP 186

I h: I: : I I I III I II I: ll::|:||:|ll I I II 
Db 89 fllkvnsgkngkdsdagayycvasnehgevksnegslklamlredfrvrprtvqalggem 148 

Qy 187 AVMECQPPRGHPEPTISWKKDGSPLDDKD-ERITIRG-GKLMITYTRKSDAGKYVCVGTN 244
11:11 MM Ml :ll:ll I :| I I: I Ml :ll:l I II I 
Db 149 avlecspprgfpepvvswrkddkelriqdmprytlhsdgnliidpvdrsdsgtyqcvann 208



Qy 245 MVGERESEVAELTVLERPSFVKRPSNLAVTVDDSAEFKCEARGDPVPTVRWRKDDGELPK 304
Mill I I Ml I'-M : I II : II III M h: : = 1 
Db 209 mvgervsnparlsvfekpkfeqepkdmtvdvgaavlfdcrvtgdpqpqitwkrknepmpv 268 

Qy 305 SR-YEIRDDHTLKIRKVTAGDMGSYTCVAENMVGKAEASATLTVQEPPHFVVKPRDQVVA 363

:| I :|: 1:1 :| I I I I I I I IN I II II I II II I 
Db 269 trayiakdnrglriervqpsdegeyvcyarnpagtleasahlrvqappsfqtkpadqsvp 328 

Qy 364 LGRTVTFQCEATGNPQPAIFWRREGSQNLLF-SYQPPQSSSRFSVSQTGDLTITNVQRSD 422 

I I 11:1 I I II II :ll hill II : I II II III h = I 
Db 329 aggtatfectlvgqpspayfwskegqqdllfpsyvsadgrtkvsptgtltieevrqvd 386 

Qy 423 VGYYICQTLNVAGSIITKAYLE VTDV 448 

I hi :| Ml "II |: II 
Db 387 egayvcagmnsagsslskaalkatfetkgrvqkkkskmgkqkqknvqsiikylisavtgn 446 

Qy 449 IADRPPPVIRQGPVNQTVAVDGTFVLSCVATGSPVPTILWRKDGVLVSTQDSRIKQLENG 508 

Mil I I III: I : :l I hi I I I I :lh : MM I I 
Db 447 tpakppptiehghqnqtlmvgssailpcqasgkptpgiswlrdglpiditdsrisqhstg 506 



Qy 509 VLQIRYAKLGDTGRYTCIASTPSGEATWSAYIEVQEFGVPVQPPRPTDPNLIPSAPSKPE 568
II I ill Mill hill! : h: II lh lhh:| 



Db 507 slhiadlkkpdtgvytciaknedgestwsasltvedhtsnaqfvrmpdpsnfpssptqpi 566 

Qy 569 VTDVSRNTVTLSWQPNLNSGATP-TSYIIEAFSHASGSSWQTVAENVKTETSAIKGLKPN 627 

::!:l!l Ill I I III: M I :| : : I : lllllh 
Db 567 ivnvtdtevelhwnapstsgagpitgyiiqyyspdlgqtwfnipdyvasteyrikglkps 626 

Qy 628 AIYLFLVRAANAYGISDPSQISDPVKT QDVLPTSQGVDHKQVQREL-GNAVLHLH 681

| hh:|l i II II I I I II M :: I :: I 

Db 627 hsymfviraenekgigtpsvssalvttskpaaqvalsdknkmdmaiaekrltseqlikle 686

Qy 682 NPTVLSSSSIEVHWTVDQQSQYIQGYKILYR-PSGANHGESDWLVFEVRTPAKNSVVIPD 740
::|:::': I : : I II I :| I I : I :|: : h : 

Db 687 evktinstavrlfwkkrkleelidgyyikwrgpprtndnq----yvnvtspstenyvvsn 742

Qy 741 LRKGVNYEIKARPF---FNEFQGADSEIKFAKTLEEAPSAPPQGVTVSKNDGNGTAILVS 797

I III ' h : II I I I II lh I : II: :| 

Db 743 lmpftnyeffvipyhsgvhsihgapsnsmdvltaeappslppedvrirml--nlttlris 800

Qy 798 WQPPPEDTQNGMVQEYKVWCLGNETRYHINKTVDGSTFSVVIPFLVPGIRYSVEVAASTG 857

I: I I lh:: ::: :| : I I : II : II h I : III : 
Db 801 wkapkadgingilkgfqivivgqapnnnrnittneraasvtlfhlvtgmtykirvaarsn 860 

Qy 858 AGSGVKSEPQFIQLDAHGNPVSPEDQVSLAQQISDWKQPAFIAGIGAACWI I 910 

I II Ml :l :l : :: : =h h : I 

Db 861 ggvgv , shgtsevimnqdtlekhlaaqqenesflyglinkshvpvivivai 910 

Qy 911 LMVFSIWLYRHRRKRNGLTSTYAGIRKVPSFTFTPTVTYQRGGEAVSSGGRPGLLNISE- 969 

i h:| : : : II I I II : I ::| I :::: 
Db 911 liifwiiiaycywrnsrnsd— -gkdrsf ikindgsvhmasn- - -nlwdvaqn 958 

Qy 970 PAAQPWLADTWPNTGNNHNDCSISCCTAGNGNSDSNLTTYS — RPADCIANYNNQLDN 1025 

II I II I :: I : :| II II : II 

Db 959 pnqnpmyntagrmtmnnrngqalysltpnaqdffnncddysgtmhrpgsehhyhyaqltg 1018 

Qy 1026 RQTNLMLPESTVYGDVDLSNRINEMRTFNSPNLRDGRFVNPSGQPTPYATTQLIQSNLSN 1085 

I I ' Ml h :: . hlllll h II 

Db 1019 gpgnam---stfyg nqyhd -. dpspyatttlvlsn— 1048 

Qy 1086 NMNNGSGDSGEKHWRPLGQQKQEVAPVQYNIVEQNRLNRDYRANDTVPPTIPYNQSYDQN 1145 

II I I I : III I I : 
Db 1049 qq pawln---dkmlrapamptnpvppepp--aryadh 1080 

Qy 1146 TGGSYHSSDRGSS TSGSQ GHKR — G 1168 

II : I I Mill I I 

Db 1081 tagrrsrssrasdgrgtlngglhhrtsgsqrsdspphtdvsyvqlhssdgtgsskertge 1140 

Qy 1169 ARTPKVPRQGGMNWADLLPPPPAHPPPHSNSEEYNISVDESYDQEMPCPVPPARMYLQQD 1228 

III I : "I I :|llh:|ll I = 1 I 

Db 1141 rrtp-pnktlm— dfippppsnppp pgghvy---d 1169 

Qy 1229 ELEEEEDERGPTP 1241 

: IMI 
Db 1170 tatrrqlnrgstp 1182 



RESULT 8
Y08402

ID Y08402 standard; Protein; 1380 AA. 
XX 

AC Y08402;
XX 

DT 24-JUL-1999 (first entry) 
XX 

DE Drosophila sp. ROBO2 extracellular domain protein.
XX 

KW ROBO1; ROBO2; roundabout; nerve guidance; human; murine; cell function;

KW cell morphology; screening assay. 

XX 

OS Drosophila sp.
XX 

PN WO9920764-A1.
XX 

PD 29-APR-1999. 






XX

PF 20-OCT-1998; 98WO-US22164. 
XX 

PR 14-NOV-1997; 97US-0971172. 

PR 20-OCT-1997; 97US-0062921. 
XX 

PA (REGC ) UNIV CALIFORNIA. 
XX 

PI Goodman CS, Kidd T, Mitchell KJ, Tear G;

DR WPI; 1999-312615/26. 

DR N-PSDB; X57251. 
XX 

PT Robo polypeptides, a new immunoglobulin superfamily member 
XX 

PS Claim 1; Page 52-56; 80pp; English. 
XX 

CC This invention describes novel Robo (roundabout) polypeptides involved
CC in nerve guidance, which have been isolated from Drosophila sp.,
CC C. elegans, human and murine samples. The products of the invention can
CC be used to raise anti-Robo antibodies, which can be used to modulate cell
CC function or morphology. The Robo polynucleotides and fragments are useful
CC as probes and primers and for production of the Robo polypeptides. The
CC probes and primers are also useful in screening assays.
XX

SQ Sequence 1380 AA; 



Query Match 17.2%; Score 1498.5; DB 20; Length 1380; 

Best Local Similarity 29.0%; Pred. No. 9.6e-71; 

Matches 410; Conservative 221; Mismatches 526; Indels 259; Gaps 42; 

Qy 68 PRIVEHPSDLIVSKGEPATLNCKAEGRPTPTIEWYKGGERVETDKDDPRSHRMLLPSGSL 127 

111:111 I I I :| I 11:111 |||||:|:| I ::|| |||::M:| I 
Db 4 priiehpmdttvpkndpftfncqaegnptptiqwfkdgrelktdtg---shrimlpaggl 60 

Qy 128 FFLRIVHGRKSRPDEGVYVCVARNYLGEAVSHNASLEVAILRDDFRQNPSDVMVAVGEPA 187 

M::: h II I I |:| I I I 1 1 : 1 : 1 1 : 1 1 1 : 1 1 |:: || || I 
Db 61 fflkvihsrr-esdagtywceaknefgvarsrnatlqvavlrdefrlepantrvaqgeva 119 

Qy 188 VMECQPPRGHPEPTISWKKDGSPLD— DKDERIT IRGGKLMIT YTRKSDAGKYVCVGTN 244 

:H III IN |||:|:| |: :| I : II I I |:|| |:| || 
Db 120 lmecgaprgspepqiswrkngqtlnlvgnkriri-vdggnlaiqearqsddgryqcvvkn 178

Qy 245 MVGERESEVAELTVLERPSFVKRPSNLAVTVDDSAEFKCEARGDPVPTVRWRK--DDGEL 302




Db 179 vvgtresataflkvhvrpflirgpqnqtavvgssvvfqcriggdplpdvlwrrtasggnm 238



Qy 303 P KSRYEIRDDHTLKIRKVTAGDMGSYTCVAENMVGKAEASATLTVQEPPH 352

I I : : :M: 1 III III |:| II |: III II 

Db 239 plrkfswlhsasgrvhvledrslklddvtledmgeytceadnavggitatgiltvhappk 298 

Qy 353 FVVKPRDQVVALGRTVTFQCEATGNPQPAIFWRREGSQNLLFSYQPPQSSSRFSVSQTGD 412

l|::|::|:| :| I hhl hhl ::| II: :|| I I |: I : 
Db 299 fvirpknqlveigdevlfecqanghprptlywsvegnsslll---pgyrdgrmevtltpe 355 

Qy 413 ----LTITNVQRSDVGYYI-CQTLNVAGSIITKAYLEVTDVIADRPPPVIRQGPVNQTVA 467

, hi I I I : I Nil: :: : I I : lll:| lllllll: 

Db 356 grsvlsiarfaredsgkwtcnalnavgsvssrtwsvdtqfelpppiieqgpvnqtlp 414 

Qy 468 VDGTFVLSCVATGSPVPTILWRKDGVLVSTQDSRIKQLEN-GVLQIR-YAKLGDTGRYTC 525 

I II I hill : I lh : h : I : I I I : I I III 
Db 415 vksivvlpcrtlgtpvpqvswyldgipidvqeherrnlsdagaltisdlqrhedeglytc 474 

Qy 526 IASTPSGEATWSAYIEVQEFGVPVQPPRPTDPNL IPSAPSKPEVTDVSRN 575 

:M :|:::ll I: : Ihlh I I lh: : I 

Db 475 vasnrngksswsgylrld tptnpnikffrapelstypgppgkpqmvekgen 525

Qy 576 TVTLSW-QPNLNSGATPTSYIIEAFSHASGSSWQTVAENVKTETSAIKGLKPNAIYLFLV 634 

:IHII : I h: hll I I I I: I II I I lh 
Db 526 svtlswtrsnkvggsslvgyviemfgknetdgwvavgtrvqnttftqtgllpgvnyffli 585 

Qy 635 RAANAYGISDPSQISDPVKTQDVLPTSQGVDHKQVQRE-LGNAVLHLHNPTVLSSSSIEV 693



II h:l':l II :|:|: I : hi : : I h I I :|: hh:: 
Db 586 raenshglslpspmsepi-tvgtryfnsgldlsearasllsgdvvelsnasvvdstsmkl 644 

Qy 694 HWTVDQQSQYIQGYKILYR 712 

I : :h:h : I 

Db 645 twqi-ingkyvegfyvyarqlpnpivnnpapvtsntnpllgststsasasasasalistk 703 

Qy 713 — PSGANHGESDWLVFEVRTP AKNSWIPDLRKGVNYEIKARPF 754 

: lh: II :| I I : II II 

Db 704 pniaaagkrdgetnqsgggaptplntkyrmltilngggassctitglvqytlyeffivpf 763 

Qy 755 FNEFQGADSEIKFAKTLEEAPSAPPQGVTVSKNDGNGTAILVSWQPPPEDTQNGMVQEYK 814 

: :l I :: hllh II I h I :|: : h I ::|:: I 
Db 764 yksvegkpsnsriartledvpseapygmealll-nssavflkwkapelkdrhgvllnyh 821 

Qy 815 VWCLGNETRYHI NKTVDGSTFSWIPFLVPGIRYSVEVAASTGAGSGVKSEPQFI 869 

I I :| :: I hi :: ::|: I h hi III III I : 
Db 822 vivrgidtahnfsriltnvtidaasptlvlanltegvmytvgvaagnnagvgpycvpatl 881 

Qy 870 QLDAHGNPVSPEDQVSLAQQI SDWKQPAF IAG IGAACWI ILMVFS I WLY - - -RHRKKRN 926 

:|| : I :: ::||: II II :|| :::: I :: :| : 
Db 882 rldpitkrldp--finqrdhvndvltqpwfiillgailavlmlsfgamvfvkrkhnimikq 939

Qy 927 GLTSTYAG- -IRKVPSFTFTPTVTYQRGGEAV — SSGG" -RPGLLNISEPAAQ? 974 

:| I •' : hll : I I hll II I : 

Db 940 salntmrgnhtsdvlkmpsls arngngywldsstggmvwrpspggdslemqkd 992 

Qy 975 WLADTWPNTGNNHNDCSISCCTAGNGNSDSNLTTYSRPADCIANYNNQLDNKQTNLMLPE 1034 

:M I I. : : hi : I : I I : hi : 
Db 993 hiadyapvcgapgspagggtssggsggagsg — asggddihgghgsernqqryv — 1044 

Qy 1035 STVYGDV-DLSNKINEMKTFNSPNLKDGRFVNPSGQPTPYATTQLIQSNLSNNMNN--- 1089

I: :: h :| Mill MM: :: : 
Db 1045 ■■--geysniptdyaevssfgkapseygrhgnas-papyatssilsphqqqqqqqpryq 1098 

Qy 1090 GSGDSGEKHWKPLGQQKQEVAPVQYNIVEQNKLNKDYRANDTVPPTIPYNQS— 1141 

I I ■' I I Ihl I ::::::: :||: I I 
Db 1099 qrpvpgyglqfpmh-phyqqqqh— qqqqaqqthqqhqalqqhqqlppsniyqqmstt 1153 

Qy 1142 — YDQNTGGS YNSSD RGSSTSGSQG — 1164 

I III III : II II 

Db 1154 seiyptntgpsrsvyseqyyypkdkqrhihienklsnchtyeaapgakqsspissqfasv 1213 

Qy 1165 -HKKGARTPKVPKQGGMNWADLL PPPPAHP 1193 

:: II : II :|| III 
Db 1214 rrqqlppncsigresarfkvlntdqgknqqnlldldgssmcyngladsgcggspspmaml 1273 

Qy 1194 PPHSNSEEYNISVDESYDQEMPCPVPPARMYLQQDELEEEEDERGPTPPVRGAASSPAAV 1253 

I : .: I I hh: II : : :: I I : 

Db 1274 mshedehalyhtadgdldd merlyvkvdeqqppqqqqqliplvpqhpaeghlq 1326

Qy 1254 SYSHQSTATLTPSPQEELQPMLQDCPEETGHMQHQP 1289

h : 1 1 1 : ■ : 1 1 I : I : : I 

Db 1327 swrnqstrssrkngqe cikepseliyap 1354 



RESULT 9 
Y13564 

ID Y13564 standard; Protein; 1381 AA. 
XX 

AC Y13564; 
XX 

DT 30-JUL-1999 (first entry)
XX 

DE Drosophila Robo 2 polypeptide.
XX 

KW Comm polypeptide; Robo polypeptide; commissureless; roundabout;

KW modulation; nerve cell function. 

XX 

OS Drosophila sp.

XX 

PN WO9925833-A1.
XX 
PF 13-NOV-1998; 98WO-US24327.
XX
PR 14-NOV-1997; 97US-0065543.
XX
PA (REGC ) UNIV CALIFORNIA.
XX
PI Goodman C, Kid T, Mitchell KJ, Russell C, Tear G;
XX
DR WPI; 1999-338008/28.
DR N-PSDB; X55768.
XX
PT Modulation of Robo-Comm polypeptide interactions
XX
PS Disclosure; Page 34-38; 56pp; English.
XX
CC The invention relates to a method for modulating the amount of Comm
CC (commissureless) polypeptide in contact with a cell expressing active
CC Robo (roundabout) on its surface. The method comprises modulating the
CC effective amount of Comm polypeptide in contact with the cell, where the
CC amount of expressed active Robo is specifically modulated inversely with
CC the modulation of the effective amount of Comm in contact with the cell.
CC The method is used to modulate the amount of active Robo expressed on a
CC cell. The method can be used to screen for agents that modulate Robo:Comm
CC interactions. This is particularly useful for modulating nerve cell
CC function.
XX
SQ Sequence 1381 AA;



Query Match 17.2%; Score 1498; DB 20; Length 1381;

Best Local Similarity 28.9%; Pred. No. 1e-70;

Matches 410; Conservative 221; Mismatches 526; Indels 260; Gaps 42; 

Qy 68 PRIVEHPSDLIVSKGEPATLNCKAEGRPTPTIEWYKGGERVETDKDDPRSHRMLLPSGSL 127

llhlll I I I :!! hill 1 1 1 1 1 : 1 : 1 I ::|| I 
Db 4 priiehpmdttvpkndpftfncqaegnptptiqwfkdgrelktdtg---shrimlpaggl 60

Qy 128 FFLRIVHGRKSRPDEGVYVCVARNYLGEAVSHNASLEVAILRDDFRQNPSDVMVAVGEPA 187

M!::: |: I I I hi I I I 1 1 : 1 : 1 1 : II I : I ! |:: I II I 
Db 61 fflkvihsrr-esdagtywceaknefgvarsrnatlqvavlrdefrlepantrvaqgeva 119 

Qy 188 VMECQPPRGHPEPTISWKKDGSPLD— DKDERIT IRGGKLMIT YTRKSDAGKYVCVGTN 244 

:|ll III III III:!:! |: :| II : II I I Ml hi II I 
Db 120 lmecgaprgspepqiswrkngqtlnlvgnkriri-vdggnlaiqearqsddgryqcvvkn 178

Qy 245 MVGERESEVAELTVLERPSFVKRPSNLAVTVDDSAEFKCEARGDPVPTVRWRK--DDGEL 302
:H III I I I II :: I I II hi III: I ||: I : 

Db 179 vvgtresataflkvhvrpflirgpqnqtavvgssvvfqcriggdplpdvlwrrtasggnm 238

Qy 303 P KSRYEIRDDHTLKIRKVTAGDMGSYTCVAENMVGKAEASATLTVQEPPH 352

I I : : :||: II III III hi II h III II 

Db 239 plrkfswlhsasgrvhvledrslklddvtledmgeytceadnavggitatgiltvhappk 298 

Qy 353 FVVKPRDQVVALGRTVTFQCEATGNPQPAIFWRREGSQNLLFSYQPPQSSSRFSVSQTGD 412

!l::|::!:l :| I hhl hhl ;:| lh :ll I I h I : 
Db 299 fvirpknqlveigdevlfecqanghprptlywsvegnsslll---pgyrdgrmevtltpe 355 

Qy 413 ----LTITNVQRSDVGYYI-CQTLNVAGSIITKAYLEVTDVIADRPPPVIRQGPVNQTVA 467

hi I I I : I II lh :: : I I : llhl llllllh 
Db 356 grsvlsiarfaredsgkwtcnalnavgsvssrtwsv-dtqfelpppiieqgpvnqtlp 414 

Qy 468 VDGTFVLSCVATGSPVPTILWRKDGVLVSTQDSRIKQLEN-GVLQIR-YAKLGDTGRYTC 525 

I II I T:lll : I II:: I : : I : I I I Mill 
Db 415 vksivvlpcrtlgtpvpqvswyldgipidvqeherrnlsdagaltisdlqrhedeglytc 474 

Qy 526 IASTPSGEATWSAYIEVQEFGVPVQPPRPTDPNL IPSAPSKPEVTDVSRN 575 

:|| :|:::H h : Ihlh I I lh: : I 

Db 475 vasnrngksswsgylrld tptnpnikffrapelstypgppgkpqmvekgen 525 

Qy 576 TVTLSW-QPNLNSGATPTSYIIEAFSHASGSSWQTVAENVKTETSAIKGLKPNAIYLFLV 634 

: 1 1 1 1 i : I h: |:|| I llhl II I I lh 



Db 526 svtlswtrsnkvggsslvgyviemfgknetdgwvavgtrvqnttftqtgllpgvnyffli 585 

Qy 635 RAANAYGISDPSQISDPVKTQDVLPTSQGVDHKQVQRE-LGNAVLHLHNPTVLSSSSIEV 693 

II I I : I II :hh I : 1:1 : : I h II :|: hh:: 
Db 586 raenshglslpspmsepi-tvgtryfnsgldlsearasllsgdvvelsnasvvdstsmkl 644

Qy 694 HWTVDQQSQYIQGYKILYR 712 

I : :h:h : I 

Db 645 twqi-ingkyvegfyvyarqlpnplvnnpapvtsntnpllgststsasasasasalistk 703 

Qy 713 — PSGANHGESDWLVFEVRTP AKNSWIPDLRKGVNYEIRARPF 754 

:| ih: II :| I I : II II 

Db 704 pniaaagkrdgetnqsgggaptplntkyrmltilngggassctitglvqytlyeffivpf 763 

Qy 755 FNEFQGADSEIKFAKTLEEAPSAPPQGVTVSKNDGNGTAILVSWQPPPEDTQNGMVQEYK 814 

: :l I': hill: II I h I :|: : h I ::|:: I 
Db 764 yksvegkpsnsriartledvpseapygmealll--nssavflkwkapelkdrhgvllnyh 821 

Qy 815 VWCLGNETRYHI NKTVDGSTFSWIPFLVPGIRYSVEVAASTGAGSGVKSEPQFI 869 

I I :| :: I hi :: ::h I h hi III III I : 
Db 822 vivrgidtahhfsriltnvtidaasptlvlanltegvmytvgvaagnnagvgpycvpatl 881 

Qy 870 QLDAHGNPVSPEDQVSLAQQISDWKQPAFIAGIGAACWIILMVFSIWLY— RHRKKRN 926 

:|| : '! :: ::lh II II :|| :::: I :: :| : 
Db 882 rldpitkrldp--finqrdhvndvltqpwfiillgailavlmlsfgamvfvkrkhmiMkq 939 

Qy 927 GLTSTYAG— "IRKVPSFTFTPTVTYQRGGEAV— -SSGG— RPGLLNISEPAAQP 974 

:| |. : hh' : I I |:|| II I : 

Db 940 salntmrgnhtsdvlkmpsls arngngywldsstggmvwrpspggdslemqkd 992 

Qy 975 WLADTWPNTGNNHNDCSISCCTAGNGNSDSNLTTYSRPADCIANYNNQLDNKQTNLMLPE 1034 

:|| I I. : : hi : I : I I : hi : 
Db 993 hiadyapvcgapgspagggtssggsggagsg — asggddihgghgsernqqryv — 1044 

Qy 1035 STVYGDV-DLSNKINEMKTFNSPNLKDGRFVNPSGQPTPYATTQLIQSNLSNNMNN — 1089 

I: ::' h :| =11111 lllh :: : 
Db 1045 ----geysniptdyaevssfgkapseygrhgnas--papyatssilsphqqqqqqqpryq 1098 

Qy 1090 GSGDSGEKHWKPLGQQKQEVAPVQYNIVEQNKLNKDYRANDTVPPTIPYNQS— 1141 

I I I I 11:1 I ::::::: :lh I I 
Db 1099 qrpvpgyglqrpmh- -phyqqqqh-- -qqqqaqqthqqhqalqqhqqlppsniyqqmstt 1153 

Qy 1142 — YDQNTGGS YNSSD RGSSTSGSQG-- 1164 

I III! II : II II 

Db 1154 seiyptntgpsrsvyseqyyypkdkqrhihitenklsnchtyeaapgakqsspissqfas 1213 

Qy 1165 HKKGARTPKVPKQGGMNWADLL PPPPAH 1192 

■ :: II : I I :| III 
Db 1214 vrrqqlppncslgresarfkvlntdqgknqqnlldldgssmcyngladsgcggspspmam 1273 

Qy 1193 PPPHSNSEEYNISVDESYDQEMPCPVPPARMYLQQDELEEEEDERGPTPPVRGAASSPAA 1252 

I : ,:' : I I hh: II : : :: I I : 

Db 1274 lmshedehalyhtadgdldd merlyvkvdeqqppqqqqqliplvpqhpaeghl 1326

Qy 1253 VSYSHQSTATLTPSPQEELQPMLQDCPEETGHMQHQP 1289 

I: :MI :' : II I :| : : I 

Db 1327 qswrnqstrssrkngqe cikepseliyap 1355 



RESULT 10 
Y13567 

ID Y13567 standard; Protein; 434 AA. 
XX 

AC Y13567; 
XX 

DT 30-JUL-1999 (first entry) 
XX 

DE Human Robo 2 polypeptide.
XX 

KW Comm polypeptide; Robo polypeptide; commissureless; roundabout; 

KW modulation; nerve cell function. 

XX 

OS Homo sapiens. 
Key Location/Qualifiers
Misc-difference 285
/label= unknown
/note= "encoded by GTN"
Misc-difference 396
/label= unknown
/note= "encoded by NTT"

WO9925833-A1.
27-MAY-1999.

13-NOV-1998; 98WO-US24327.

14-NOV-1997; 97US-0065543.
(REGC ) UNIV CALIFORNIA.

Goodman C, Kidd T, Mitchell KJ, Russell C, Tear G;

WPI; 1999-338008/28.
N-PSDB; X55771.

Modulation of Robo-Comm polypeptide interactions

Disclosure; Page 49-50; 56pp; English.

The invention relates to a method for modulating the amount of Comm 
(commissureless) polypeptide in contact with a cell expressing active 
Robo (roundabout) on its surface. The method comprises modulating the 
effective amount of Comm polypeptide in contact with the cell, where the 
amount of expressed active Robo is specifically modulated inversely with 
the modulation of the effective amount of Comm in contact with the cell. 
The method is used to modulate the amount of active Robo expressed on a 
cell. The method can be used to screen for agents that modulate Robo:Comm
interactions. This is particularly useful for modulating nerve cell
function.

Sequence 434 AA; 



Query Match 10.5%; Score 913; DB 20; Length 434;

Best Local Similarity 23.2%; Pred. No. 1.2e-40;
Matches 258; Conservative 54; Mismatches 118; Indels 684; Gaps 12; 
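
The summary counts above appear to determine the reported "Best Local Similarity" figure. The following is a minimal sketch, assuming (this definition is inferred from the numbers in this listing, not taken from the search program's documentation) that the percentage is the number of identical matches divided by the total number of aligned columns, where the columns are matches, conservative substitutions, mismatches and indels. The function name is hypothetical.

```python
def best_local_similarity(matches, conservative, mismatches, indels):
    """Percent of aligned columns that are exact matches (assumed definition)."""
    # Total alignment columns: every column falls into exactly one category.
    columns = matches + conservative + mismatches + indels
    if columns == 0:
        raise ValueError("alignment has no columns")
    return 100.0 * matches / columns


# Counts copied from the summary line above:
# Matches 258; Conservative 54; Mismatches 118; Indels 684
print(round(best_local_similarity(258, 54, 118, 684), 1))  # prints 23.2
```

Under the same assumption, the counts reported for the other hits in this listing (for example 350, 144, 420 and 590) also round to their printed percentages.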

Qy 360 QWALGRTVTFQCEATGNPQPAIFWRREGSQNLLFSYQPPQSSSRFSVSQTGDLTITNVQ 419



:M MINI II Mill 



ll::||IHIII II I :ll III 1 1 i I f J 1 1 : 1 



Db 1 qivaqgrtvtfpcetkgnpqparfwqkegsqnllfpnqpqqpnsrcsvsptgdltitniq 60 

Qy 420 RSDVGYYICQTLNVAGSIITKAYLEVTDVIADRPPPVIRQGPVNQTVAVDGTFVLSCVAT 479

III llllll I HIM: II HUM: 11111:1 III llhlllll :l I II 
Db 61 rsdagyyicqaltvagsilakaqlevtdvltdrpppiilqgpanqtlavdgtallkckat 120 



Qy 480 GSPVPTILWRKDGVLVSTQDSRIKQLENGVLQIRYAKLGDTGRYTCIASTPSGEATWSAY 539

I 1:1 II hi :l I II III: :: III llhl:: 1111:111
Db 121 gdplpviswlkegftfpgrdpratiqeqgtlqiknlrisdtgtytcvatsssgeaswsav 180



Qy 540 IEVQEFGVPVQPPRPTDPNLIPSAPSKPEVTDVSRNTVTLSWQPNLNSGATPTSYIIEAF 599 

"•III : : I : :| Mil :|i II II ::|||||l 

Db 181 ldvtesgatis-knydlsdlpgppskpqvtdvtknsvtlswqpgtpgtlpasayiieaf 238 

Qy 600 SHASGSSWQTVAENVKTETSAIKGLKPNAIYLFLVRAANAYGISDPSQISDPVKTQDVLP 659

i : mini mi mim mimi i 

Db 239 sqsvsnswqtvanhvkttlytvrglrpntiylfmvrain 277 

Qy 660 TSQGVDHKQVQRELGNAVLHLHNPTVLSSSSIEVHWTVDQQSQYIQGYKILYRPSGANHG 719 
Db 278 277 



Qy 720 ESDWLVFEVRTPAKNSWIPDLRKGVNYEIKARPFFNEFQGADSEIKFAKTLEEAPSAPP 779

I 

Db 278 pk 279 



Qy 780 QGVTVSKNDGNGTAILVSWQPPPEDTQNGMVQEYKVWCLGNETRYHINKTVDGSTFSWI 839 

II I 

Db 280 vsvtqxk----;- 286 

Qy 840 PFLVPGIRYSVEVAASTGAGSGVKSEPQFIQLDAHGNPVSPEDQVSLAQQISDWKQPAF 899 

II 

Db 287 pq 288 

Qy 900 IAGIGAACWIILMVFSIWLYRHRKKRNGLTSTYAGIRKVPSFTFTPTVTYQRGGEAVSSG 959 

I II 

Db 289 knng 292 

Qy 960 GRPGLLNISEPAAQPWLADTWPNTGNNHNDCSISCCTAGNGNSDSNLTTYSRPADCIANY 1019 
II I 

Db 293 ' stwan 297 

Qy 1020 NNQLDNKQTNLMLPESTVYGDVDLSNKINEMKTFNSPNLKDGRFVNPSGQPTPYATTQLI 1079 

Db 298 - -. 297 

Qy 1080 QSNLSNNMNNGSGDSGEKHWKPLGQQKQEVAPVQYNIVEQNKLNKDYRANDTVPPTIPYN 1139 

Db 298 297 

Qy 1140 QSYDQNTGGSYNSSDRGSSTSGSQGHKKGARTPKVPKQGGMNWADLLPPPPAHPPPHSNS 1199 

II Mill I I : 

Db 298 vp lppppvqplpgtel 313 

Qy 1200 EEYNISVDES -YDQEMPCPVPPARMYLQQ- • -DELEEEEDERGPTPPVRGAASSPAAVSY 1255 

II: I: II : II I : II I III ll:|:| Hill Mil |:|: 
Db 314 ehyaveqqengydsdswcpplpvqtylhqgledel-eedddrvptppvrgvassp-aisf 371 

Qy 1256 SHQSTATLTPSPQEELQPMLQDCPEETGHMQHQPDRRRQPVSPPPPPRPISPPHTYGYIS 1315 

1 1 1 1 1 1 1 1 1 1 : 1 1 : 1 1 1 1 1 I 
Db 372 gqqstatltpspreemqpmlqasp 395 



Qy 1316 GPLVSDMDTDAPEEEEDEADMEVAKMQTRRLLLRGLEQTPASSVGDLESSVTGSMINGWG 1375
Db 396 395



Qy 1376 SASEEDNISSGRSSVSSSDGSFFTDADFAQAVAAAAEYAGLKVARRQMQDAAGRRHFHAS 1435 

I :| 

Db 396 ; xftss 400 

Qy 1436 QCPRPTSPVSTDSNMSAAVMQKTRPAKKLKHQPG 1469 

i mm inn im i n m i 

Db 401 qrprptspfstdsntsaalsqsqrprptkkhkgg 434 



RESULT 11 
Y08405 

ID Y08405 standard; Protein; 434 AA.
XX 

AC Y08405; 
XX 
DT 24-JUL-1999 (first entry)
XX

DE Human partial ROBO2 protein.
XX 

KW ROBO1; ROBO2; roundabout; nerve guidance; human; murine; cell function;

KW cell morphology; screening assay.

XX 



XX 

PN WO9920764-A1.
XX 

PD 29-APR-1999. 
XX 

PF 20-OCT-1998;



XX 

PR 14-NOV-1997; 97US-0971172.
PR 20-OCT-1997; 97US-0062921.
XX
PA (REGC ) UNIV CALIFORNIA.
XX
PI Goodman CS, Kidd T, Mitchell KJ, Tear G;
XX
DR WPI; 1999-312615/26.
DR N-PSDB; X57254.
XX
PT Robo polypeptides, a new immunoglobulin superfamily member
XX
PS Claim 1; Page 72-73; 80pp; English.
XX
CC This invention describes novel Robo (roundabout) polypeptides, involved
CC in nerve guidance, which have been isolated from Drosophila sp.,
CC C. elegans, human and murine samples. The products of the invention can
CC be used to raise anti-Robo antibodies, which can be used to modulate cell
CC function or morphology. The Robo polynucleotides and fragments are useful
CC as probes and primers and for production of the Robo polypeptides. The
CC probes and primers are also useful in screening assays.
XX
SQ Sequence 434 AA;



Query Match 10.5%; Score 913; DB 20; Length 434;

Best Local Similarity 23.2%; Pred. No. 1.2e-40;
Matches 258; Conservative 54; Mismatches 118; Indels 684; Gaps 12; 

Qy 360 QWALGRTVTFQCEATGNPQPAIFWRREGSQNLLFSYQPPQSSSRFSVSQTGDLTITNVQ 419 

1:11 HUH II lllllhll::llllllll II I :|| III lllllllhl 
Db 1 qivaqgrtvtfpcettgnpqpavfwqkegsqnllfpnqpqqpnsrcsvsptgdltitniq 60

Qy 420 RSDVGYYICQTLNVAGSIITKAYLEVTDVIADRPPPVIRQGPVNQTVAVDGTFVLSCVAT 479

iii linn i mil: n mill: iiiim hi iimiiii =i i ii 

Db 61 rsdagyyicqaltvagsilakaqlevtdvltdrpppiilqgpanqtlavdgtallkckat 120 

Qy 480 GSPVPTILWRKDGVLVSTQDSRIKQLENGVLQIRYAKLGDTGRYTCIASTPSGEATWSAY 539 

I 1:1 I I hi :|| II III: :: III I i I : I : : MINI 
Db 121 gdplpviswlkegftfpgrdpratiqeqgtlqiknlrisdtgtytcvatsssgeaswsav 180 

Qy 540 IEVQEFGVPVQPPRPTDPNLIPSAPSKPEVTDVSRNTVTLSWQPNLNSGATPTSYIIEAF 599

"III : : I : :| llll:|lll::|:lllllll ::|||||| 
Db 181 ldvtesgatis--knydlsdlpgppskpqvtdvtknsvtlswqpgtpgtlpasayiieaf 238 

Qy 600 SHASGSSWQTVAENVKTETSAIKGLKPNAIYLFLVRAANAYGISDPSQISDPVKTQDVLP 659 

I : :llllll :lll ::lhll lllhlll I 
Db 239 sqsvsnswqtvanhvkttlytvrglrpntiylfmvrain 277 

Qy 660 TSQGVDHKQVQRELGNAVLHLHNPTVLSSSSIEVHWTVDQQSQYIQGYKILYRPSGANHG 719 

Db 278 277

Qy 720 ESDWLVFEVRTPAKNSWIPDLRKGVNYEIKARPFFNEFQGADSEIKFAKTLEEAPSAPP 779

I 

Db 278 pk 279 

Qy 780 QGVTVSKNDGNGTAILVSWQPPPEDTQNGMVQEYKVWCLGNETRYHINKTVDGSTFSWI 839

II I 

Db 280 vsvtqxk 286 



Qy 840 PFLVPGIRYSVEVAASTGAGSGVKSEPQFIQLDAHGNPVSPEDQVSLAQQISDWKQPAF 899
II

Db 287 pq 288



Qy 900 IAGIGAACWIILMVFSIWLYRHRKKRNGLTSTYAGIRKVPSFTFTPTVTYQRGGEAVSSG 959 
I II 

Db 289 - knng 292 

Qy 960 GRPGLLNISEPAAQPWLADTWPNTGNNHNDCSISCCTAGNGNSDSNLTTYSRPADCIANY 1019
II I 

Db 293 stwan 297 

Qy 1020 NNQLDNKQTNLMLPESTVYGDVDLSNKINEMKTFNSPNLKDGRFVNPSGQPTPYATTQLI 1079 



Db 298 : 297 

Qy 1080 QSNLSNNMNNGSGDSGEKHWKPLGQQKQEVAPVQYNIVEQNKLNKDYRANDTVPPTIPYN 1139

Db 298 297 

Qy 1140 QSYDQNTGGSYNSSDRGSSTSGSQGHKKGARTPKVPKQGGMNWADLLPPPPAHPPPHSNS 1199 

II Mill I I : 

Db 298 • vp lppppvqplpgtel 313 

Qy 1200 EEYNISVDES-YDQEMPCPVPPARMYLQQ— DELEEEEDERGPTPPVRGAASSPAAVSY 1255 

I I : I:, II : II I : II I III 1 1 : 1 : 1 1 1 1 1 1 1 1 III |:|: 
Db 314 ehyaveqqengydsdswcpplpvqtylhqgledel-eedddrvptppvrgvassp-aisf 371 

Qy 1256 SHQSTATLTPSPQEELQPMLQDCPEETGHMQHQPDRRRQPVSPPPPPRPISPPHTYGYIS 1315 

lllllllllhlhlllll I 
Db 372 gqqstatltpspreemqpmlqasp 395 

Qy 1316 GPLVSDMDTDAPEEEEDEADMEVAKMQTRRLLLRGLEQTPASSVGDLESSVTGSMINGWG 1375 

Db 396 : 395 

Qy 1376 SASEEDNISSGRSSVSSSDGSFFTDADFAQAVAAAAEYAGLKVARRQMQDAAGRRHFHAS 1435 

I :| 

Db 396 xftss 400 

Qy 1436 QCPRPTSPVSTDSNMSAAVMQKTRPAKKLKHQPG 1469 

I Hill Hill III: I || ||; I 
Db 401 qrprptspfstdsntsaalsqsqrprptkkhkgg 434 



RESULT 12 
Y41716 

ID Y41716 standard; Protein; 985 AA. 
XX 

AC Y41716; 
XX 

DT 07-DEC-1999 (first entry) 



Human PRO860 protein sequence. 

Human; PRO; EST; expressed sequence tag; PCR primer; hybridisation; 
probe; blood coagulation disorder; cancer; cellular adhesion disorder; 
secreted protein; transmembrane protein. 



XX 

PN WO9946281-A2.
XX 

PD 16-SEP-1999. 
XX 

PF 08-MAR-1999; 
XX 

PR 10-MAR-1998; 

PR 11-MAR-1998;

PR 11-MAR-1998;

PR 11-MAR-1998;

PR 12-MAR-1998; 

PR 13-MAR-1998; 

PR 17-MAR-1998; 

PR 20-MAR-1998; 

PR 20-MAR-1998;

PR 20-MAR-1998; 

PR 20-MAR-1998; 

PR 25-MAR-1998; 

PR 26-MAR-1998; 

PR 27-MAR-1998; 

PR 27-MAR-1998; 

PR 27-MAR-1998; 

PR 27-MAR-1998; 

PR 27-MAR-1998; 

PR 30-MAR-1998; 

PR 30-MAR-1998; 



0077450 
0077632 
0077641 
0077649 
0077791 
0078004 
0040220 



0078910 
0078936 
0078939 
0079294 
0079656 
0079663 
0079664 
0079689 
0079728 
0079786 
0079920, 
0079923 
PR




MAD*1 QQQ 




UUoUlUD , 




r\ 


MAD.1 QQQ 

mak iyy b 


980S 


UUtJUlU/ . 


PR 




MSD.1 QQQ 
MM 1330 




nnflniKS 

UUoUloD . 


DD 




MAD-1 QQQ 
MM 1 3 30 


none 

9 80S 


UUBUI34 , 




ni 


ADD-1 QQQ 




Vvavitl , 




01 


ADD. 1 QQQ 

apk iy y 0 


9 80S 


nnonno 


PR 


01 


ADD. 1 QQQ 


9 80S 


UUBUioo , 


PR 


01 


APR-1998 


980S 


o.ooaooji 


PR 


08 


ADD.1 QQO 


9 80S 


nnoi r\t o 


PR 


n! 


ADD-1 QQQ 

apk lyyo 


980S 


UUt)iU/U, 




08 


APR" 1998 


980S 


UUblU/1 . 


PR 


09 


ADD.1QQQ 
nrn 1330 




UU0113J ■ 


p 


09 


apr-iqqp, 

nrl\ 1330 


9 80S 


flfiRi5m 

UUolluJ . 


PR 


09 


ADD. 1 QQQ 
hrK 1330 




nnm t)q 


PR 


15 


APR-1QQR 

ftrl\ 1330 


9 8 OS 


r»nfli fti 7 

UUOlOl/ . 


PR 


15 


APR-1998 


9 80S 


VvOlOJO 1 


PR 


15 


APR-1998 


9 80S 


VU013JZ . 




15 


APR-IQQP. 

ttrfl 1330 


980S 


UU013JJ • 


1 


21 


APR-1998 


980S 


UUO* Jug • 


1 


21 


APR-1998 


980S 


UUOfi JU3 . 




22 


ftPK 1330 




UUP* /UU • 


PR 


22 


APR-1998 


Qfinq 






22 


&PR-1QQP, 
nrn 1330 


Qftnc 

ll I 


(lflfi?flfU 


PR 


23 


APR-1998 


98US 


UUo27o7 . 


PR 




ADD-1 QQQ 

APK I33O 


980S 


nnonoe 
UUoz/yb. 


PR 


27 


apk ±y y 0 


980S 


UUoiJJb. 


PR 28-APR-1998; 98US-0083322.


PR 


29 


APR-1998 


9 80S 


flooT 001 


PR 


29 


ADD.1 OOO 

APK-J.330 


980S 


onoiiioe 


PR 


29 


ADD.1 OOQ 


9 80S 


nnoiaoe 
UU0J43D, 


PR 


29 


APR-1998 


9 80S 


OOOO jI OO 






ADD-1 QQQ 
APK 1330 




noQisno 


DP 


29 


ADD.1 QQQ 
nrn 197B 


Qflnc 


VUOJ345. 




29 


APR-1 QQQ 
ftrK 1330 


Qflnc 


UUOJ534 . 


PR 


29 


HDD- 1 QQO 
ftrK 1330 


9 80S 


nnRWR 

UUOJJJQ , 


PR 


29 


APR-IQQP. 

nri\ 1330 


9 80S 


UVOJ JJ3 . 


PR 


30 


APR" 1998 


980S 




PR 


05 


MAV-1 QQR 

Mnl 1330 


9 80S 


UU04 JQO . 


PR 


06 


MAY - 1 9 9 8 


980S 




PR 


06 


MAY-1QQR 
Mnl 1330 


980S 


(10RiA41 




07 


MAV-1 QQR 

Mnl 1330 




(IflRARQR 


PR 


07 


-MAV-IQQfi 
Mnl 1330 


Qflrrc 


(IflflAfiflfl 
UUOIOUu . 


DP 


07 


Mlv-1 QQO 
MAI 1330 


Qflnc 




DP 


07 


MAV-1QQR 

max I330 


Qflnc 


UUQ4QJ/ , 


PR 


07 


MAV-IQQfl 
MAI 1330 


3 80S 


flflfl/filQ 






MAV-1 QQO 
MAI 1330 


980S 


UUB404U . 


1 


07 


MAV- 1 QQQ 
MAI 1330 


980S 






13 


MAV. 1 QQO 
1330 




flftOKOTO 




13 


MAV-1 QQR 
MAI 1330 


Qflns 


flnfl^iifl 
UuojJoo . 


DP 


13 


MAV-1QQR 
Mnl 1330 


Qflnc 


UU0JJJ3 , 


PR 


15 


MAV-1 QQR 
Mnl 1330 


QflrK 


UuuJJ/ j . 


PR 


15 


MAV-1 QQQ. 
Mnl 1330 


QRH5 


flftOCCTQ 

UUOjj/3 . 




15 


MAV-1 QQQ 
Mnl 1330 


Qflnc 


UUOJJOU . 






MAV-1 QQO 
MAI 1330 


Qflnc 


UUoDjoz . 


DD 




MAV-1 QQQ 
MAI I33B 


980S 


nnoKcoo 
UU0DDD3. 


P 


} 


MAV- 1 QQO 

MAI 1330 


980S 


uuoDoy / . 


PR 


15 


MAY- 1998 


980S 


UUoD/UO. 






MAV.1 QQQ 
MAI-1330 


980S 


UU03/u4 . 


OP 


18 


MAV. 1 QQO 
MAI-1330 


980S 


nnocmi 


DP 




MAV.1 QQQ 

MAI I330 


9 80S 


flflflKlQ') 


PR 


22 


MAV-1QQR 
Mnl 1330 




AfiflKAIA 
UU0Q114 . 


PR 


22 


MAV-IQQfl 
Mnl 1330 


980S 


0086430. 


PR 


22 


MAV-1 QQO 
Mnl 1330 


980S 


0086486. 


PR 28-MAY-1998; 98US-0087098.
PR 28-MAY-1998; 98US-0087106.
PR 28-MAY-1998; 98US-0087208.
PR 30-JUL-1998; 98US-0094651.
PR 11-SEP-1998; 98US-0100038.
XX










PA (GETH ) GENENTECH INC.
XX
PI Wood WI, Goddard A, Gurney
XX











WPI; 1999-551358/46. 
N-PSDB; Z34069.

New secreted and transmembrane polypeptides and their polynucleotides, 
useful for treating blood coagulation disorders, cancers and cellular 
adhesion disorders - 

Claim 12; Fig 77; 530pp; English.

The present invention describes secreted and transmembrane polypeptides 
and their polynucleotides. The nucleotide sequences are useful as 
sources of probes, primers, for chromosome mapping, and for generation 
of antisense sequences. They can also be used to create transgenic 
animals. The proteins can be used to treat a variety of diseases and 
disorders, depending on their function. Diseases that may be treated 
include blood coagulation disorders, cancers and cellular adhesion 
disorders. They may also be used to raise antibodies. Z33891 to
Z34338, and Y41685 to Y41774 represent polynucleotide and polypeptide
sequences given in the exemplification of the present invention.



Query Match 9.8%; Score 856; DB 20; Length 985; 

Best Local Similarity 23.3%; Pred. No. 3.3e-37;

Matches 350; Conservative 144; Mismatches 420; Indels 590; Gaps 55; 



:| 11:1: III: II ::|:l hi III I \- ■ II III 
qdsppqilvhpqdqlfqgpgparmscqasgqppptirwllngqplsmvppdphh---llp 62 



oy 


64 


Db 


6 


Qy 


124 


Db 


63 


Qy 


178 


Db 


123 


Qy 


238 


Db 


183 


Qy 


298 


Db 


222 


Qy 


358 


Db 


229 


Qy 


418 


Db 


236 


Qy 


478 


Db 


239 


Qy 


535 


Db 


272 


Qy 


595 


Db 


283 


Qy 


655 


Db 


287 


Qy 


715 



1:1 I: 



■ - IVHGRKSRPDEGVYVCVARNYLGEAVSHNASLEVAILRDDFRQNPS 177 

I: I III I II II III I I 1 1 1 1 : 1 1 : I 



I:: III :!! II lllllhll III II : I |: II |: 



M || MM I ::: I 



III I 



II: 

■-lenv 



II 

-til 



RDQWALGRTVTFQCEATGNPQPAIFWRREGSQNLLFSYQPPQSSSRFSVSQTGDLTITN 417 

II II II 

- npdpa eg 235 

VQRSDVGYYICQTLNVAGSIITKAYLEVTDVIADRPPPVIRQGPVNQTVAVDGTFVLSCV 477 

I I 

: pkp 238 

ATGSPVPTILWRKDGVLVSTQDSRIKQLENGVLQIRYARLGDTGRYTCIAST— PSGEA 534 

1:1:11 II : I I I: 

---rpavwlswkvsgpaapaqs ytalfrtqtapggq- 271 

TWSAYIEVQEFGVPVQPPRPTDPNLIPSAPSKPEVTDVSRNTVTLSWQPNLNSGATPTSY 594 

"II II :| 
-gap waeellag 282 

IIEAFSHASGSSWQTVaENVKTETSAIKGLKPNAIYLFLVRAaNAYGISDPSQISDPVKT 654 

. II: 

-wqsa 286 

QDVLPTSQGViJHKQVQRELGNAVLHLHNPTVLSSSSIEVHWTVDQQSQYIQGYKILYRPS 714 

III :|l 
elgg lhw 293 

GANHGESDWLVFEVRTPAKNSWIPDLRKGVNYEIKARPFFNEFQGADSEIKFAKTLEEA 774 

I :M I II :| II : : h 






Db 294 gqdyefkvrpssgrargpdsnvlllrlpekv 324 

Qy 775 PSAPPQGVTVSKNDGNGTAILVSWQPPPEDTQNGMVQEYKVWCLGNETRYHINKTVDGST 834

llllll II: MM : III III : M::: Ml III : II 
Db 325 psappqevtl--kpgngt-vfvswvpppaenhngiirgyqvwslgntslppanwtwgeq 381 

Qy 835 FSWIPFLVPGIRYSVEVAASTGAGSGVKSEPQFIQLDAHGNPVSPEDQVS- ■ -LAQQIS 891 

: I :M I IMII lllhl I I : |: : I :|: 
Db 382 tqleiathmpg-sycvqvaavtgagagepsrpvcllleqameratqepsehgpwtleqlr 440 

Qy 892 DWKQPAFIAGIGAACWIILMVFSIWLYRHRKKRNGLTSTYAGIRKVPSFTFTPTVTYQR 951

:|:| II I I I::|: :: ::l I: I I 
Db 441 atlkrpeviatcgvalwllllgtavcihrrrrarvhl 477 

Qy 952 GGEAVSSGGRPGLLNISEPAA QPWLADTWPNTGNNHNDCS ISCCTAGNGN 1001 

ill : I llllll :■! : I I I : 

Db 478 gpglyrytsedailkhrmdhsdsqwladtwrstsgs-rdlssssslssrlg 527 



Db 



Db 



1002 SDSNLTTYSRPADCIANYNNQLDNKQTNL-MLPE-STVYGDVDLSNKINEMKTFN — S 1055 

:|: | || : : |:: : :||: || || : | |: : 
528 adar dpldcrrsl Is - wdsrspgvpllpdts tf ygs 1 iaelpsstparps 576 



1056 PNLKDGRFVNPSGQPTPYATTQLIQ • SNLSNNMNNGSGDSGEK HWKPLGQQKQ 1107 

I : I: : I II : I::: : I If: II ::|| 
577 pqv pavrrlppqlaqlsspcsssdslcsrrglssprlslapaeawk-akkkq 627 



Qy 1108 EVAPVQYNIVEQNKLNKDYRANDTVPPTIPYNQSYDQNTGGSYNSSDRGSSTSGSQGHKK 1167

I: 11:1: II:: I l::l I 

Db 628 el qhanss-pll rgshs lelracel -gnr gskn 658 

Qy 1168 GARTPKVPKQGGMNWADL LPPPP-AHPPPHSNSEEYNISVDESY 1210 

:::| I : I I III I I I : |:: I 

Db 659 lsqspgavpqalvawralgpkllsssnelvtrhlppaplfphetpptqsqqtqppvapqa 718 

Qy 1211 DQEMPCPVPPARMYLQQDELEEEEDERGPTPPVRGAASSPAAVSYSHQSTATLTPSPQEE 1270

: I I : I I : I h I I |:::|: I |: 

Db 719 pssillpaapipil spcsppspqasslsgpspas-srlssssls-slged 766 

Qy 1271 LQPMLQDCPEETG- ■ -HMQHQPDRRRQPVSPPPPPRPISPPHTYGYISGPLVSDMDTDAP 1327 

:| III : : I II I II III llllll I h II 

Db 767 qdsvl--tpeevalclelsegeetprnsvs--pniprapsppttygylsvptasef-tding 821 

Qy 1328 EEEEDEADMEVAKMQTRRLLLRGLEQTPASSVGDLESSVTGSMINGWGSASEEDNISSGR 1387 

: I I III II: lllllll III :l I 

Db 822 rtgggvgpkggvllcpprpcl tptps egslangwgsas-ednaasar 867 

Qy 1388 SS-VSSSDGSFFTDADFAQAVAAAAEYAGLKVARRQMQDAAGRRHFHASQCPRPTSPVST 1446

:| lllllll II M:|:| I : I : |: I III: 

Db 868 aslvsssdgsfladahfaralavavdsfgfglepre — adcvfidassppsprdeifl 923 

Qy 1447 DSNMSAAVMQKTRPAKKLKHQPGHLRRETYTDDLPPPPVPPPAIKSPTAtSKTQLEVRPV 1506
hi : : II : I :| II II I : ::ll I 

Db 924 tpnlslplwe-wrpdwledmevshtqrl — grgmppwpp- - -dsqissqrsqlhcr- - 973 



Qy 1507 WPK 1510 
:|| 

Db 974 -mpk 976 



RESULT 13
W74152 



ID W74152 standard; Protein; 1257 AA. 



05-MAY-1999 (first entry) 

Human L1 cell adhesion molecule.

Human L1 cell adhesion molecule; L1CAM; neurite growth;
nervous system development; nerve regeneration;
neuronal cell cohesive interaction.

Homo sapiens.

US5872225-A.

16-FEB-1999.

18-NOV-1994; 94US-0341843.

26-JUN-1992; 92US-0904991.
18-NOV-1994; 94US-0341843.

(OYCA-) UNIV CASE WESTERN RESERVE.

Lemmon V;

WPI; 1999-166719/14.
N-PSDB; X01598.

Human L1 cell adhesion molecule - supports neurite outgrowth and is
involved in nervous system development and repair

Claim 1; Fig 3; 45pp; English.

This sequence is the human L1 cell adhesion molecule (L1CAM) of the
invention. L1CAM supports growth of neurites in vitro and is involved in
development of the human nervous system and in nerve regeneration. It is
useful in in vivo and in vitro experiments on nerve growth and
regeneration. L1CAM mediates cohesive interactions of neuronal cells to
each other and to extracellular matrix.

Sequence 1257 AA;



Query Match 8.7%; Score 761; DB 20; Length 1257;

Best Local Similarity 24.9%; Pred. No. 4.3e-32; 

Matches 283; Conservative 159; Mismatches 422; Indels 274; Gaps 45; 

Qy 14 LLSLSPNHLFLAQLIPDPEDVERGNDHGTPIPTSDItDDNSLGYTGSRLRQEDFPPRIVEH 73 

II II ' II II: I : I: I I I hi 

Db 11 lllcsp clliqipeeyeghhvmeppvit eqsprrlvvf 48 

Qy 74 PSDLIVSKGEPATLNCKAEGRPTPTIEWYKGGERVETDKD DPRSHRMLLPSGS 125 

1:1 I :| hi hi I : I : :: I I : : 

Db 49 ptddi slkceasgkpevqfrwtrdgvhfkpkeelgvtvyqsphsgsftitgnn 101 

Qy 127 LFFLRIVHGRKSRPDEGVYVCVARNYLGEAVSHNASLEVAILRDDFRQNPSD VMVA 182 

I :: :hl I I I II hll |: :: : : I : II 

Db 102 snf --aqrfqgiyrcfasnklgtamsh----eirlmaegapkwpketvkpveve 149 

Qy 183 VGEPAVMECQPPRGHPEPT I SWKKDGSPLDDKDERI T I - RGGKLMIT YTRRSD - AGK YVC 240 

II h hll II :|lhh : II II hi 

Db 150 egesvvlpcnpppsaeplriywmnskilhikqdervtmgqngnlyfanvltsdnhsdyic 209 

Qy 241 — VGTNMVGERESEVAELTVLERPSFVKRPSNLAVTVDDSAE FKCEAR 286 

II :'::| :| I I : I I : h :l I 

Db 210 hahfpgtrtiiqkep--idlrvkatnsmidrkprllfptnssshlvalqgqplvleciae 267 

Qy 287 GDPVPTVRWRKDDGELPKSRYEIRD-DHTLKIRKVTAGDMGSYTCVAENMVGKAEASATL 345 

I I lh: : 1:1 I :: : lh: II III hill :l I : : 
Db 268 gfptptikwlfpsgpmpadrvtyqnhnktlqllkvgeeddgeyrclaenslgsarhayyv 327 

Qy 346 TVQEPPHFWKPRDQWALGRTVTFQCEATGNPQPAIFWRREGSQNLLFSYQPPQSSSRF 405 

II: I::: lh : I I h I III : II I : :: 

Db 328 tveaapyvlhkpqshlygpgetarldcqvqgrpqpevtwring ipveelakdqky 382 

Qy 406 SVSQTGDLTITNVQRSDVGYYICQTLNVAGSIITKAYLEVTDVIADRPPPVIRQGPVNQT 465 

: I I I ::lll II hll :: lh I : I :: III 
Db 383 ri-qrgalilsnvqpsdtmvtqcearnrhglllanayiywql — pakiltad--nqt 435 

Qy 466 -VAVDG-TFVI-SCVATGSPVPTILW-RKDGVLVSTQDSRIKQLENGVLQIRYAKLGDTGR 522 

:|| I I : l I I hill:: | :|[ III II I II : Mil 
Db 436 ymavqgs tay llcka f gapvps vqwldedgttv - lqder f f pyangtlg irdlqandtgr 494 
Qy 523 YTCIASTPSGEATWSAYIEVQEFGVPVQPPRPTDPNLIPS-------------------- 562

I hi: I I -I:: I II I ll:| II 

Db 495 yfclaandqnnvtimanlkvkdatqitqgprstiekkgsrvtftcqasfdpslqpsitwr 554 

Qy 563 APSKPEVTD 571 

I :: :| : 

Db 555 gdgrdlqelgdsdkyfiedgrlvihsldysdqgnyscvasteldwesraqllwgspgp 614 

Qy 572 VSRNT VTLSWQPNLNSGATPTSY HE- AFSHASG SSWQTVAENVKTETS 619 

:::: I :|l I : I III : I :: : :|| 
Db 615 vprlvlsdlhlltqsqvrvswspaedhnapiekydiefedkemapekwyslgkvpgnqts 674 

Qy 620 AIKGLKPNAIYLFLVRAANAYGISDPSQISDPVKTQDVLPTSQGVDHKQVQRELGNAVLH 679

I I II I I I II :|| :|: II : I III Ml: 
Db 675 ttlkispyvhytfrvtainkygpgepspvsetvvtpeaapeknpvdvkgegnettnmvi- 733 

Qy 680 LHNPTVLSSSSIEVHWTVDQQSQYIQGYKILYRPSGANHGESDWLVFEVRTPAKNSWIP 739 
: I :| : :| I:: :|| I I | I :|: 
Db 734 twkplrw-mdwnapqvq-yrvqwrpqgt— rgpwqeqivsdp-'flws 776

Qy 740 DLRKGVNYEIKARPFFNEFQGADSEIKFAKTLEEAPSAPP--QGVTVSKNDGNGTAILVS 797 

: I Mil ::::!::: : |: I I I :|: : I :|:|| 
Db 777 ntstfvpyeikvqavnsqgkgpepqvtigysgedypqaipelegieil — nssavlvk 832 

Qy 798 WQPPPEDTQNGMVQEYKV- -WCLGNETRY- ■ -HINR— -TVDGSTFSWIPFLVPGIRYS 849 

1:1 I :: I I I I:: :: Ihl I :| II:: I I I 
Db 833 wrpvdlaqvkghlrgynvtywregsqrkhskrhihkdhvwpanttsvllsglrpyssyh 892 

Qy 850 VEVAASTGAGSGVKSE PQFIQLDAHGN PVS 879 

:|| I I III II I: : I: I |:| 

Db 893 levqafngrgsgpaseftfstpegvpghpealhlecqsntslllrvqpplshngvltgyv 952 

Qy 880 — P EDQVSLAQQI SDWKQPAF IAG IGAACWI ILMVFSIWLYRHRKKRNGLT ST YAGI 935 

I I: hi :: I : : || : 

Db 953 lsyhpldeggkg-qlsfnlrdpel rthnltdlsphl 987 

Qy 936 RKVPSFTFT PTVT YQRG - GEA - VS SGG RPGLL NISEPAAQPWLADTW* PNTG 984 

• I : I I : I III I II I III I : : :| I I 
Db 988 r — yrfqlqattkegpgeaivreggtraalsgisdfgnisatagenyswswvpkeg 1041 



RESULT 14 
W42087 

ID W42087 standard; Protein; 1571 AA. 
XX 

AC W42087;
XX

DT 28-SEP-1998 (first entry)

XX 

DE Human Down syndrome-cell adhesion molecule DS-CAM2. 

XX 

KW DS-CAM2; Down syndrome-cell adhesion molecule; neural cell; 
KW signal transduction; trisomy 21; mental retardation; 
KW holoprosencephaly; corpus callosum agenesis;
KW schizencephaly; diagnosis; assay; human. 
XX 

OS Homo sapiens. 
XX 

PN WO9817795-A1.
XX 

PD 30-APR-1998. 
XX 

PF 23-OCT-1997; 97WO-US19547. 
XX 

PR 25-OCT-1996; 96US-0029322.
XX 

PA (CEDA-) CEDARS SINAI MEDICAL CENT. 
XX 

PI Korenberg JR; 
XX 

DR WPI; 1998-271791/24. 
DR N-PSDB; V31988.
XX 



New isolated Down's Syndrome-cell adhesion molecule - used to
develop products for detection, diagnosis and therapy of
developmental and neurological abnormalities

Claim 2; Page 90-95; 109pp; English. 

This polypeptide comprises Down syndrome-cell adhesion molecule
DS-CAM2, an extracellular soluble protein belonging to a novel
subclass of the Ig superfamily with highest homology to neural cell
adhesion molecules. Its amino acid sequence was deduced from cDNA
clones (see V31982) isolated from a trisomy 21 foetal brain library.
It is a splice variant of membrane-bound DS-CAM1 (see W42086), and
lacks the entire transmembrane domain of DS-CAM1. The invention
provides human and murine DS-CAM nucleic acid sequences (see also
V31981, V31985-87), expression vectors and host cells, transgenic
animals, antibodies, antisense oligonucleotides, and primers 
derived from DS-CAM nucleic acids. DS-CAM polypeptides are associated 
with developmental and neurological processes. They can be used in 
e.g. neural prosthetic devices used in entubulation methods of 
repairing (regenerating) damaged or severed peripheral nerves, and 
also in bioassays to identify agonists and antagonists. The products 
can also be used in detection, diagnosis and therapy of developmental 
and neurological abnormalities such as Down syndrome, mental 
retardation, holoprosencephaly, agenesis of the corpus callosum, 
or schizencephaly. 



SQ Sequence 1571 AA; 



Query Match 8.1%; Score 708.5; DB 19; Length 1571; 

Best Local Similarity 26.8%; Pred. No. 3.2e-29;

Matches 286; Conservative 147; Mismatches 446; Indels 187; Gaps 48;

Qy 64 EDFPPRIVEHPSDLIVSKGEPATLNCKAEGRPTPTIEWYKGGERVETDKDDP RSHR 119



II hi: h :H II :| I :| I III I 
403 edgtpkiisafsekvvspaepvslracnvkgtplptitw-- 



I III III 
-tldddpilkggshr 454 



120 — MLLPSGSLFFLRIVHGRKSRPDEGVYVCVARNYLGEAVSfiNASLEVAILRDDFRQNP 176 

h hi : : I I III I II I I : I : I I I 
455 isqmitsegnvlsylnisssqvr-dggvyrctannsag-vvlyqarinv---rgpasirp 509 

177 SDVMYAV-GEPAVMECQPPRGHPEPTISWKKDGSPLDDKDERITI-RGGKLMITYTRKS- 233 

: h I : h hi :l I h : I :: I I :: :| 
510 mknitaiagrdtyihcr-vigypyysikwyknsnllpfnhrqvafenngtlklsdvqkev 568 

234 DAGKYVCVGTNMVGERE - - - SEVAELTVLERPSFVKRPSNLAVT VDDSAEFKC - EARGDP 289 
I hi I h: : : |: :|| : I h: :: I || 

569 degeytc---nvlvqpqlstsqsvhvtv-kvppfiqpfefprfsigqrvfipcvvvsgdl 624 

290 VPTVRWRKDDGELPKSRYEIRDD----HTLKIRKVTAGDMGSYTCVAENMVGKAEASATL 345

h hll > :l I I: :hl :: hllhl I I : I 
625 pititwqkdgrpipgslgvtidnidftsslrisnlslmhngnytciarneaaavehqsql 684 

346 TVQEPPHFWKPRDQWALGRTVTFQCEATGNPQPAIFWRREGSQNLLFS YQP 398 

h II llhllll hi M I ! I I I: || :|| 

685 ivrvppkfwqprdqdgiygkavilncsaegypvptivwk fskgagvpqfqp 736 

399 PQSSSRFSVSQTGDLTITNVQRSDVGYYICQTLNVAGSIITKA-YLEVTDVIADRPPPVI 457 

: I 'MM I II:!: I h ::h III : M 
737 ialngriqvlsngsllikhvveedsgyylckvsndvgadvsksmyltv kipami 790 

458 RQGPVNQTVAVDG-TFVLSCVATGSPVPTILWRKDGVLVSTQDSR- - IKQLENG V 509 

I I hi'' I :|| I I : I h ::: : :M II 
791 tsyp-nttlatqgqkkemsctahgekpiivrwekedriinpemarylvstkevgeevist 849 

510 LQIRYAKLGDTGRYTCIASTPSGEATWSAYIEVQEFGVPVQPPRPTDPNLIPSAPSKPEV 569

III l ; :l ::l III : III II : h 

850 lqilptvredsgffschainsygedrgiiqltvqe ppdppeiei 893 

570 TDVSRNTVTLSWQPNLNSGATPTSYIIEAFSHASGSSWQTVAENVK TETSAIKGL 624 

II hll : : | | || : || : |:| :: I : 
894 kdvkartitl.rwtmgfdgnspitgydieckn--ksdswds-aqrtkdvspqlnsatiidi 950 
Qy 625 KPNAIYLFLVRAANAYGISDPSQISDPVKTQDVLPTSQGVDHKQVQRELGNAVLHLHNPT 684

I:: I : I I I |:|| : : I : :|| 

Db 951 hpsstysirmyaknrigksepsn-eltitadeaapdgppqe vhle— 994 

Qy 685 VLSSSSIEVHWTVDQ---QSQYIQGYKILYRPSGANHGESDWLVFEVRTPAKNSV-VIPD 740

:M II I I : I: MM II | : : M M : : 

Db 995 pissqsirvtwkapkkhlqngiirgyqigyreystg-gnfqfniisvdtsgdsevytldn 1053 

Qy 741 LRKGVNYEIKARPFFNEFQGADSEIKFAKTLEEAPSAPPQGVTVSKNDGNGTAILVSWQP 800 

Ml:: II: III: II lh I : :| Ml 

Db 1054 lnkf tqyg 1 wqacnr agtgps sqei it tt ledvpsyppenvqa i at- - spes is is ws 1 1111 

Qy 801 PPEDTQNGMVQEYKV--WCLGNETRYHINKTVDGSTFSWIPFLVPGIRYSVEVAASTGA 858 

:: lh:| ::| I : I : : I: : I l|::| I I I 

Db 1112 lskealngilqgfrviywanlrodgelgeiknitttqpsleldglekytnysiqvlaftra 1171 

Qy 859 GSGVKSEPQFIQL ■ - DAHGNPVSPEDQVSLAQQI SDWKQPAFIAGIG AACWI ILMVFSI 916 

I 11:11 1:111 II: II III 

Db 1172 gdgvrseqiftrtkedvpgpp agvkaaaasasmvfvs 1208 



917 WLYRHRKKRNGLTSTYAGIRKVPSFTFTPTVTYQRGGEAVSSGGRPGLINISEPAAQPWL 976 

II I II: III I I I Mil I I 
1209 wl--pplklngi irkytvfcshpyptv isefeasp-- 1241 

977 ADTW PNTGNN" -HNDCSISCCTAGNGNSDSNLTT-- -YSRPADCIANYNNQLDNKQ 1027 

I:: II I :: :: :|l III :| II : 

1242 -dsfsyripnlsrnrqysmvavtsagrgnsseiitveplakapariltfsgtvttpwm 1300 

1028 TNLMLPESTVYGDVDLSNKINEMKTFN-SPNLK-DGR— FVNPS 1067 

:M I I I : II I :|:| III I I I 
1301 kdivlpckav—gdpspavkwmkdsngtpslvtidgrrsifsngs 1343 



RESULT 15 

W42086

ID W42086 standard; Protein; 1910 AA.
XX

AC W42086;
XX

DT 28-SEP-1998 (first entry)
XX

DE Human Down syndrome-cell adhesion molecule DS-CAM1.

XX

KW DS-CAM1; Down syndrome-cell adhesion molecule; neural cell;

KW signal transduction; trisomy 21; mental retardation; 

KW holoprosencephaly; corpus callosum agenesis; 

KW schizencephaly; diagnosis; assay; human. 






Homo sapiens. 



Key 

Peptide 



Region 
Region 



Region 
Region 



Location/Qualifiers
1..23
/label= Sig_peptide
24..1910
/label= Mat_protein
24..887
/label= IG
/note= "immunoglobulin type-C2 domain"
888..1594
/label= FbN
/note= "fibronectin type III domain"
1595..1616
/label= Transmembrane
1617..1910
/label= Cytoplasmic
24..126
/label= Ig1
127..225
/label= Ig2
226..316
/label= Ig3
317..409
/label= Ig4


Region 



Region 
Region 
Region 



410..506
/label= Ig5
507..603
/label= Ig6
604..697
/label= Ig7
698..792
/label= Ig8
793..887
/label= Ig9
Disulfide-bond 46..102
Disulfide-bond 145..197
Disulfide-bond 246..293
Disulfide-bond 335..385
Disulfide-bond 428..484
Disulfide-bond 525..575
Disulfide-bond 617..669
Disulfide-bond 711..766
Disulfide-bond 865
Disulfide-bond 1307..1359
Modified-site 78..80
/note= "Asn is N-glycosylated"
Modified-site 106..108
/note= "Asn is N-glycosylated"
Modified-site 470..472
/note= "Asn is N-glycosylated"
Modified-site 487..489
/note= "Asn is N-glycosylated"
Modified-site 658..660
/note= "Asn is N-glycosylated"
Modified-site 666..668
/note= "Asn is N-glycosylated"
Modified-site 710..712
/note= "Asn is N-glycosylated"
Modified-site 748..750
/note= "Asn is N-glycosylated"
Modified-site 795..797
/note= "Asn is N-glycosylated"
Modified-site 924..926
/note= "Asn is N-glycosylated"
Modified-site 1142..1144
/note= "Asn is N-glycosylated"
Modified-site 1160..1162
/note= "Asn is N-glycosylated"
Modified-site 1250..1252
/note= "Asn is N-glycosylated"
Modified-site 1271..1273
/note= "Asn is N-glycosylated"
Modified-site 1324..1326
/note= "Asn is N-glycosylated"
Modified-site 1341..1343
/note= "Asn is N-glycosylated"
Modified-site 1488..1490
/note= "Asn is N-glycosylated"



23-OCT-1997; 97WO-US19547.
25-OCT-1996; 96US-0029322.



(CEDA-) CEDARS SINAI MEDICAL CENT. 



Korenberg JR; 

WPI; 1998-271791/24. 
N-PSDB; V31981.

New isolated Down's Syndrome-cell adhesion molecule - used to 
develop products for detection, diagnosis and therapy of 
developmental and neurological abnormalities 



Claim 2; Page 73-78; 109pp; English. 

This polypeptide comprises Down syndrome-cell adhesion molecule
DS-CAM1, a cell surface glycoprotein belonging to a novel subclass
of the Ig superfamily with highest homology to neural cell adhesion
molecules. Its amino acid sequence was deduced from cDNA clones
(see V31981) isolated from a trisomy 21 foetal brain library. A
splice variant, DS-CAM2 (see W42087), which is non-membrane bound,
was also identified. The invention also provides human and murine
DS-CAM nucleic acid sequences (see also V31985-88), expression
vectors and host cells, transgenic animals, antibodies, antisense
oligonucleotides, and primers derived from DS-CAM nucleic acid.
DS-CAM polypeptides are associated with developmental and
neurological processes. They can be used in e.g. neural prosthetic
devices used in entubulation methods of repairing (regenerating)
damaged or severed peripheral nerves, and also in bioassays to
identify agonists and antagonists. The products can also be
used in detection, diagnosis and therapy of developmental and
neurological abnormalities such as Down syndrome, mental
retardation, holoprosencephaly, agenesis of the corpus callosum,
or schizencephaly.

Sequence 1910 AA;



Query Match 8.1%; Score 707.5; DB 19; Length 1910; 

Best Local Similarity 26.8%; Pred. No. 4.7e-29; 

Matches 286; Conservative 147; Mismatches 446; Indels 187; Gaps 48;



Qy 64 EDFPPRIVEHPSDLIVSKGEPATLNCKAEGRPTPTIEWYKGGERVETDKDDP— -RSHR 119 

II hi: I: :|l II :| I =1 I MM I III Ml 

Db 403 edgtpkiisafsekvvspaepvslmcnvkgtplptitw tldddpilkggshr 454 

Qy 120 —MLLPSGSLFFLRIVHGRKSRPDEGVYVCVARNYLGEAVSHNASLEVAILRDDPRQNP 176 

I: I:: Ml :l I I 

Db 455 isqmitsegnvvsylnisssqvr-dggvyrctannsag-wlyqarinv-rgpasirp 509 

Qy 177 SDVMVAV-GEPAVMECQPPRGHPEPTISWKKDGSPLDDKDERITI-RGGKLMITYTRKS- 233 

: I: I : I: hi :l I h : I II:: :| 

Db 510 mknitaiagrdtyihcr-vigypyysikwyknsnllpfnhrqvafenngtlklsdvqkev 568 

Qy 234 DAGKYVCVGTNMVGERE- - -SEVAELTVLERPSFVKRPSNLAVTVDDSAEFKC-EARGDP 289 

■ I hi I I:: : : h :ll : I I::, :: I II 
Db 569 degeytc- - -nvlvqpqlstsqsvhvtv-kvppf iqpfefprf sigqrvf ipcvwsgdl 624 

a i 

■y 290 VPTVRWRKDDGELPKSRYEIRDD HTLKIRKVTAGDMGSYTCVAENMVGKAEASATL 345 

W \ -I: hll :| I '. I: :|:| r.) |:|||:| I I : I 

^ Db 625 pititwqkdgrpipgslgvtidnidftsslrisnlslmhngnytciarneaaavehqsql 684 

Qy 346 TVQEPPHFTVK PRDQ WALGRTVTFQCEATG NPQ PAI FWRREGSQNLLFS YQP 398 

I: II Mhlll I: I I I I I I I I: II :|| 

Db 685 ivrvppkfvvqprdqdgiygkavilncsaegypvptivwk fskgagvpqfqp 736 

Qy 399 PQSSSRFSVSQTGDLTITNVQRSDVGYYICQTLNVAGSIITKA-YLEVTDVIADRPPPVI 457 

: I I MM I MM I h ::|: III : I :| 
Db 737 ialngriqvlsngsllikhweedsgyylckvsndvgadvsksmyltv kipami 790 

Qy 458 RQGPVNQT VAVDG - TFVLSCVATGSPVPT ILWRKDGVLVSTQDSR- - IKQLENG V 509 

I I hi I 'II I I : | |: ::: : :l : I I 
Db 791 tsyp-nttlatqgqkkemsctahgekpiivrwekedriinpemarylvstkevgeevist 849 

Qy 510 LQIRYAKLGDTGRYTCIASTPSGEATWSAYIEVQEFGVPVQPPRPTDPNLIPSAPSKPEV 569 

III hi ::| III : III II : h 

Db 850 lqilptvredsgffschainsygedrgiiqltvqe ppdppeiei 893 

Qy 570 TDVSRNTVTLSWQPNLNSGATPTSYIIEAFSHASGSSWQTVAENVK TETSAIKGL 624 

II hll I : : I I II : II : h I :: I : 
Db 894 kdvkartitlrwtmgfdgnspitgydieckn--ksdswds-aqrtkdvspqlnsatiidi 950 



Qy 625 KPNAIYLFLVRAANAYGISDPSQISDPVKTQDVLPTSQGVDHKQVQRELGNAVLHLHNPT 684 

Ml Mil hll : : I : M 
Db 951 hpsstysirmyaknrigksepsn-eltitadeaapdgppqe vhle— 994 

Qy 685 VLSSSSIEVHWTVDQ— QSQYIQG YK I LYRPSGANHGESDWLVFEVRT PAK NSV - VI PD 740 

:|| II I 1 : I: hlhl II I : : I I : I : : 
Db 995 pissqsirvtwkapkkhlqngiirgyqigyreystg-gnfqfniisvdtsgdsevytldn 1053 

Qy 741 LRKGVNYEIKARPFFNEFQGADSEIKFAKTLEEAPSAPPQGVTVSKNDGNGTAILVSWQP 800 

III:': II: III: II lh I : :| :ll 
Db 1054 lnkftqyglvvqacnragtgpssqeiitttledvpsyppenvqaiat--spesisiswst 1111 

Qy 801 PPEDTQNGMVQEYKV--WCLGNETRYHINKTVDGSTFSWIPFLVPGIRYSVEVAASTGA 858 

:: l|::| ::| I : I ; : |; ; I ||;:| I I I 
Db 1112 lskealngilqgfrviywanlmdgelgeiknitttqpsleldglekytnysiqvlaftra 1171 

Qy 859 GSGVKSEPQFIQL--DAHGNPVSPEDQVSLAQQISDWKQPAFIAGIGAACWIILMVFSI 916 

I Ihll I : I II II: II III 

Db 1172 gdgvrseqiftrtkedvpgpp agvkaaaasasmvfvs 1208 

Qy 917 WLYRHRKKRNSLTSTYAGIRKVPSFTFTPTVTYQRGGEAVSSGGRPGLLNISEPAAQPWL 976 

II I lh III I .1 I III I I 

Db 1209 wl--pplklngi irkytvfcshpyptv- isefeasp-- 1241 

Qy 977 ADTW----PNTGNN-HNDCSISCCTAGNGNSDSNLTT— YSRPADCIANYNNQLDNKQ 1027 

I:: II I' :: :: :|| III :| II : 
Db 1242 -dsfsyripnlsrnrqysvwvvavtsagrgnsseiitveplakapariltfsgtvttpwm 1300 

Qy 1028 TNLMLPESTVYGDVDLSNKI NEMKTFN- SPNLK- - DGR — FVNPS 1067 

:::H I I I : II I Ml III III 
Db 1301 kdivlpckav---gdpspavkwmkdsngtpslvtidgrrsifsngs 1343 



Search completed: January 22, 2001, 12:19:37 
Job time: 1734 sec 
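
The PIR run that follows reports a search time of 325.28 seconds, a 1651-residue query, 67900655 database residues, and "344.638 Million cell updates/sec". A minimal sketch, assuming the throughput figure is simply query length times database residues divided by search time (the printout does not state the formula; the function name is illustrative):

```python
def cell_updates_per_sec(query_len: int, db_residues: int, seconds: float) -> float:
    """Dynamic-programming cells filled per second.

    Assumes one cell update per (query residue, database residue) pair,
    which is an interpretation of the printout, not a documented formula.
    """
    return query_len * db_residues / seconds


# Figures copied from the PIR run header.
rate = cell_updates_per_sec(1651, 67_900_655, 325.28)
print(f"{rate / 1e6:.3f} Million cell updates/sec")
```

Under that assumption the computed value agrees with the printed figure to three decimals, which suggests the header counts every query/database residue pair once.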
GenCore version 4.5 
Copyright (c) 1993 - 2000 Compugen Ltd. 

OM protein - protein search, using sw model 

Run on: January 22, 2001, 12:26:10 ; Search time 325.28 Seconds 
(without alignments) 
344.638 Million cell updates/sec 

Title: US-09-540-245A-18 
Perfect score: 8724 
Sequence: 1 MKWKHVPFLVMISLLSLSPN VLGGYERGEDNNEELEETES 1651 

Scoring table: BLOSUM62 
Gapop 10.0 , Gapext 0.5 

Searched: 195891 seqs, 67900655 residues 

Total number of hits satisfying chosen parameters: 

Minimum DB seq length: 0 
Maximum DB seq length: 2000000000 

Post-processing: Minimum Match 0% 
Maximum Match 100% 
Listing first 45 summaries 

Database : PIR.66:* 
pir1:* 
pir2:* 
pir3:* 
pir4:* 

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 

             Query 
 No.  Score  Match  Length  DB  ID      Description 

  1   8315    95.3    1651   2  T14160  transmembrane rece 
  2   8120    93.1    1612   2  T30805  dutt1 protein - mo 
  3   2607.5  29.9    1344   2  T14316  rig-1 protein - mo 
  4   1515    17.4    1273   2  T42405  sax-3 protein - Ca 
  5   852      9.8     423   2  T29549  hypothetical prote 
  6   764.5    8.8    1260   1  S05479  neural cell adhesi 
  7   761      8.7    1257      A41060  neural cell adhesi 
  8   757.5    8.7    1443      150600  neogenin - chicken 
  9   739.5    8.5    1259      S36126  neural cell adhesi 
 10   726.5    8.3    1427      151669  tumor suppressor - 
 11   713      8.2    1259      A43425  Bravo/Nr-CAM cell 
 12   707.5    8.1    1896      T08851  Down syndrome cell 
 13   702.5    8.1    1028      158164  BIG-1 protein - ra 
 14   695.5    8.0    1028      A53449  plasmacytoma-assoc 
 15   693      7.9    1268      A39640  neural cell adhesi 
 16   684.5    7.8     874      T29548  hypothetical prote 
 17   678.5    7.8    2222      T13924  sdk protein - frui 
 18   667.5    7.7    1447      A54100  tumor suppressor p 
 19   653.5    7.5    1040      A49356  transient axonal g 
 20   652      7.5    1040      A34695  axonal glycoprotei 
 21   644.5    7.4    5175      T20992  hypothetical prote 
 22   644.5    7.4    5198      T43290  hemicentin precurs 
 23   637      7.3    1277      T30532  neural cell adhesi 
 24   633.5    7.3    1036      S22383  axonin 1 precursor 
 25   629.5    7.2    1272      S26180  neurofascin - chic 
 26   628      7.2    1239      A32579  neuroglian - fruit 
 27   618.5    7.1    1209   2  T42718  probable neural ce 
 28   613      7.0    1018      JC4211  neural adhesion pr 
 29   596.5    6.8    1018      A54744  contactin 1 precur 
 30   596.5    6.8    1907   2  S50893  protein-tyrosine-p 
 31   589.5    6.8    1010   2  JU0094  F11 protein precur 
 32   589.5    6.8    1091   2  501998  contactin precurso 
 33   585      6.7    1232   2  T43027  neural cell adhesi 
 34   584      6.7    1912   2  A56178  protein-tyrosine-p 
 35   583      6.7    2029   1  TDFFLK  protein-tyrosine-p 
 36   581      6.7    1894   2  C54689  protein-tyrosine-p 
 37   573      6.6    1020   2  S05944  neuronal cell surf 
 38   571.5    6.6    1021   2  A57U2   contactin precurso 
 39   570      6.5    1256   2  T03096  CDO protein - rat 
 40   555.5    6.4    1898   2  S46216  leukocyte antigen- 
 41   551.5    6.3    1240   2  T03097  CDO protein - huma 
 42   549.5    6.3    7962   2  138346  elastic titin - hu 
 43   549      6.3    1897   1  TDHULK  leukocyte antigen- 
 44   548.5    6.3    1265   1  A37967  neural cell adhesi 
 45   548      6.3    1197   2  T30581  neural cell adhesi 



RESULT 1 
T14160 

transmembrane receptor protein Robo1 - rat 
C;Species: Rattus norvegicus (Norway rat) 

C;Date: 20-Sep-1999 #sequence_revision 20-Sep-1999 #text_change 20-Sep-1999 
C;Accession: T14160 

R;Kidd, T.; Brose, K.; Mitchell, K.J.; Fetter, R.D.; Tessier-Lavigne, M.; Goodman, C. 
Cell 92, 205-215, 1998 

A;Title: Roundabout controls axon crossing of the CNS midline and defines a novel sub 
A;Reference number: Z17897; MUID:98117249 
A;Accession: T14160 

A;Status: preliminary; translated from GB/EMBL/DDBJ 
A;Molecule type: mRNA 
A;Residues: 1-1651 <KID> 

A;Cross-references: EMBL:AF041082; NID:g2811215; PID:g2811216; PIDN:AAC39960.1 
C;Function: 

A;Description: appears to function as the gatekeeper controlling midline crossing 
C;Keywords: transmembrane protein 



Query Match 95.3%; Score 8315; DB 2; Length 1651; 

Best Local Similarity 94.5%; Pred. No. 0; 

Matches 1560; Conservative 42; Mismatches 49; Indels 0; Gaps 0; 

Qy 1 MKWKHVPFLVMISLLSLSPNHLFLAQLIPDPEDVERGNDHGTPIPTSDNDDNSLGYTGSR 60 

llllhl llllllhll II lllllllllhllllhlll IIIIIIIIIIMIMI 
Db 1 MKWKHLPLLVMISLLTLSKKHLLLAQLIPDPEDLERGNDNGTPAPTSDNDDNSLGYTGSR 60 

Qy 61 LRQEDFPPRIVEHPSDLIVSKGEPATLNCKAEGRPTPTIEWYRGGERVETDKDDPRSHRM 120 

IIIIIIIIIMIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII 
Db 61 LRQEDFPPRIVEHPSDLIVSKGEPATLNCKAEGRPTPTIEWYKGGERVETDKDDPRSHRM 120 

Qy 121 LLPSGSLFFLRIVHGRKSRPDEGVYVCVARNYLGEAVSHNASLEVAILRDDFRQNPSDVM 180 

rillllllllllllllllllllllMIIIMIIIIIIIIIIIIIIIIMIIIIIIIIIII 
Db 121 LLPSGSLFFLRIVHGRKSRPDEGVYICVARNYLGEAVSHNASLEVAILRDDFRQNPSDVM 180 

Qy 181 VAVGEPAVMECQPPRGHPEPTISWKKDGSPLDDKDERITIRGGKLMITYTRKSDAGKYVC 240 

llllllllllllllllllllllllllllllllllllllllllllllllllllllllllll 
Db 181 VAVGEPAVMECQPPRGHPEPTISWKKDGSPLDDKDERITIRGGKLMITYTRKSDAGKYVC 240 

Qy 241 VGTNMVGERESEVAELTVLERPSFVKRPSNLAVTVDDSAEFKCEARGDPVPTVRWRKDDG 300 

llllllllll|:||::MIIIIIIIIIIMIIIIIIIIIIIIIIIIIIIIII MINI 
Db 241 VGTNMVGERESRVADVTVLERPSFVKRPSNLAVTVDDSAEFKCEARGDPVPTFGWRKDDG 300 

Qy 301 ELPKSRYEIRDDHTLKIRKVTAGDMGSYTCVAENMVGKAEASATLTVOEPPHFWKPRDQ 360 

IIIIIIIIIIIIMIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII 
Db 301 ELPRSRYEIRDDHTLKIRKVTAGDMGSYTCVAENMVGKAEASATLTVQEPPHFVYKPRDQ 360 

Qy 361 WALGRTVTFQCEATGNPQPAIFWRREGSQNLLFSYQPPQSSSRFSVSQTGDLTITNVQR 420 

IIIIIIIIIIIIIIIIIIIIIIIIMIIIIIIIIIIIIIIIIIIIIIIIIIIIhlllll 
Db 361 WALGRTVTFQCEATGNPQPAIFWRREGSQNLLFSYQPPQSSSRFSVSQTGDLTVTNVQR 420 
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• 

Qy 421 SDVGYYICOTLNVAGSIITKAYLEVTDVIADRPPPVIRQGPVNQTVAVDGTFVLSCVATG 480 

Db 421 SDVGYYICQTLNVAGSIITKAYLEVTDVIADRPPPVIRQGPVNQIVAVDGILTLSCVATG 480 

Qy 481 SPVPTILWRKDGVLVSTQDSRIRQLENGVLQIRYAKLGDTGRYTCIASTPSGEAIWSAYI 540 

Db 481 SPVPTILWRKDGVLVSTQDSRIKQLESGVLQIRYAKLGDIGRYTCTASTPSGEATWSAYI 540 

Qy 541 EVQEFGVPVQPPRPTDPNLIPSAPSRPEVTDVSRNTVTLSWQPNLNSGATPTSYIIEAFS 600 

Db 541 EVOEFGVPVQPPRPTDPNLIPSAPSKPEVTDVSKNTVTLLWQPNLNSGATPTSYIIEAFS 600 

Qy 601 HASGSSWQTVAENVKTETSAIKGLKPNAIYLFLVRAANAYGISDPSQISDPVKTQDVLPT 660 

Db 601 HASGSSWQTVAENVKTETFAIKGLKPNAIYLFLVRAANAYGISDPSQISDPVKTQDVPPT 660 

Qy 661 SQGVDHKQVQRELGNAVLHLHNPTVLSSSSIEVHFTVDQQSQYIQGYKILYRPSGANHGE 720 

Db 661 TQGVDHKQVQRELGNWLHLHNPTILSSSSVEVHWTVDQQSQYIQGYKILYRPSGASHGE 720 

Qy 721 SDWLVFEVRTPAKNSWIPDLRKGVNYEIKARPFFNEFQGADSEIKFAKTLEEAPSAPPQ 780 

Db 721 SEWLVFEVRTPTKNSVVIPDLRKGVNYEIKARPFFNEFQGADSEIKFAKTLEERPSAPPR 780 

Qy 781 GVTVSKNDGNGTAILVSWOPPPEDTQNGMVQEYKVWCLGNETRYHINKTVDGSTFSWIP 840 

Db 781 SVTVSKNDGNGTAILVTWOPPPEDTQNGHVQEYKVWCLGNETRYHINKTVDGSTFSWIP 840 

Qy 841 FLVPGIRYSVEVAASTGAGSGVKSEPQFIQLDAHGNPVSPEDQVSLAQQISDWKQPAFI 900 

Db 841 FLVPGIRYSVEVAASTGAGPGVKSEPQFIQLDSHGNPVSPEDQVSLAQQISDWKQPAFI 900 

Qy 901 AGIGAACWIILMVFSIWLYRHRKKRNGLTSTYAGIRRVPSFTFTPTVTYQRGGEAVSSGG 960 

Db 901 AGIGMCWIILMVFSIWLYRHRKKRNGLSSTYAGIRKVPSFTFTPTVTYQRGGEAVSSGG 960 

Qy 961 RPGLLNISEPAAQPWLADTWPNTGNNHNDCSISCCTAGNGNSDSNLTTYSRPADCIANYK 1020 

Db 961 RPGLLNISEPATQPWLADTWPNTGNSHNDCSINCCTASNGNSDSNLTTYSRPADCIANYN 1020 

Qy 1021 NQLDNKQTNLMLPESTVYGDVDLSNKINEMKTFNSPNLRDGRFVNPSGQPTPYATTQLIO 1080 
I I II I III II I MMII III I III II I II Ml I III III I II 1 11 ! Ill III III Mllll 
Db 1021 NQLDNKQTNLMLPESTVYGDVDLSNKINEMRTFNSPNLKDGRFVNPSGQPTPYATTQLIQ 1080 

Qy 1081 SNLSNNMNNGSGDSGEKHWKPLGQQKQEVAPVQYNIVEQNKLNKDYRANDTVPPTIPYNO 1140 

Db 1081 ANLINlMNGGGDSSEKHWKPPGQQRQEVAPIQYIIIIffiQNKLNKDYRANDTILPTIPYNH 1140 

Qy 1141 SYDQNTGGSYNSSDRGSSTSGSQGHKKGARTPKVPKQGGMNWADLLPPPPAHPPPHSNSE 1200 

Db 1141 SYDQNTGGSYNSSDRGSSTSGSQGHKRGARTPRAPRQGGMNWADLLPPPPAHPPPHSNSE 1200 

Qy 1201 EYNISVDESYDQEMPCPVPPARMYLQQDELEEEEDERGPTPPVRGAASSPAAVSYSHQST 1260 

1::IMMMMMMII!IIMIMIMMI! M I III II I III IM III III III 
Db 1201 EYSMSVDESYDQEMPCPVPPARMYLQQDELEEEEAERGPTPPVRGAASSPAAVSYSHQST 1260 

Qy 1261 ATLTPSPQEELQPMLQDCPEETGHMQHQPDRRRQPVSPPPPPRPISPPHTYGYISGPLVS 1320 

Db 1261 ATLTPSPQEELQPMLQDCPEDLGHMPHPPDRRRQPVSPPPPPRPISPPHTYGYISGPLVS 1320 

Qy 1321 DMDTDAPEEEEDEADMEVAKMQTRRLLLRGLEQTPASSVGDLESSVTGSMINGWGSASEE 1380 

Db 1321 DMDTDAPEEEEDEADMEVAKMQTRRLLLRGLEQTPASSVGDLESSVTGSMINGWGSASEE 1380 

Qy 1381 DNISSGRSSVSSSDGSFFTDADFAQAVAAAAEYAGLKVARRQMQDAAGRRHFHASQCPRP 1440 

Db 1381 DNISSGRSSVSSSDGSFFTDADFAOAVAAAAEYAGLKVARRQMQDAAGRRHFHASQCPRP 1440 

Qy 1441 TSPVSTDSNMSAAVMQKTRPAKKLRHQPGHLRRETYTDDLPPPPVPPPAIKSPTAQSKTQ 1500 

Db 1441 TSPVSTDSNMSAAVIQRARPTKKOKHQPGHLRREAYTDDLPPPPVPPPAIKSPSVQSRAQ 1500 



Qy 1501 LEVRPVWPRLPSMDARTDRSSDRRGSSYKGREVLDGROWDMRTNPGDPREAQEQQNDG 1560 



Db 1501 LEARPIMGPRLASIEARAD^SSD^RGGSYKG^ 1560 

Qy 1561 KGRGNKAARRDLPPAKTHLIQEDILPYCRPTFPTSNNPRDPSSSSSMSSRGSGSRQREOA 1620 

Db 1561 KARGTKTARRDLPPARTHLIPEDILPYCRPTFPTSNNPRDPSSSSSMSSRGSGSRQREQA 1620 

Qy 1621 NVGRRNIAEMQVLGG YERGEDNNEELEET ES 1651 

Db 1621 NVGRRNMAEMQVLGGFERGDENNEELEETES 1651 



RESULT 2 
T30805 

dutt1 protein - mouse 

N;Alternate names: transmembrane receptor protein Robo1 homolog 
C;Species: Mus musculus (house mouse) 

C;Date: 22-Oct-1999 #sequence_revision 22-Oct-1999 #text_change 22-Oct-1999 
C;Accession: T30805 

R;Wu, M.C.; Lowe, N.; Fordham, R.; Rabbitts, P. 
submitted to the EMBL Data Library, July 1998 

A;Description: The mouse homologue of human DUTT1/H-robo1 gene: protein sequence and 
A;Reference number: Z20879 
A;Accession: T30805 

A;Status: preliminary; translated from GB/EMBL/DDBJ 
A;Molecule type: mRNA 
A;Residues: 1-1612 <WUM> 

A;Cross-references: EMBL:Y17793; NID:e1329712; PID:e1329713; PIDN:CAA76850.1 

A; Experimental source: brain 

C; Genetics: 

A; Gene: duttl 

A; Map position: 16 - 



Query Match 93.1%; Score 8120; DB 2; Length 1612; 

Best Local Similarity 95.6%; Pred. No. 0; 

Matches 1525; Conservative 33; Mismatches 37; Indels 0; Gaps 0; 

Qy 57 TGSRLRQEDFPPRIVEHPSDLIVSKGEPATLNCKAEGRPTPTIEWYKGGERVETDKDDPR 116 

Db 18 SGSRLRQEDFPPRIVEHPSDLIVSRGEPATLNCKAEGRPTPTIEWYKGGERVETDRDDPR 77 

Qy 117 SHRMLLPSGSIiFFLRIVHGRRSRPDEGVYVCYARNYLGEAVSHNASLEVAILRDDFRQNP 176 

Db 78 SHRMLLPSGSLFFLRIVHGRRSRPDEGVYICVARNYLGEAVSHNASLEVAILRDDFRQNP 137 

Qy 177 SDVMVAVGEPAVMECQPPRGHPEPTISWRRDGSPLDDKDERITIRGGRLMITYTRKSDAG 236 

Db 138 SDVMVAVGEPAVMECQPPRGHPEPTISWKRDGSPLDDRDERITIRGGKLMITYTRKSDAG 197 

Qy 237 KYVCVGTNMVGERESEVAELTVLERPSFVKRPSNLAVTVDDSAEFKCEARGDPVPTVRWR 296 

Db 198 RYVCVGTNMVGERESEVAELTVLERPSFVKRPSNLAVTVDDSAEFKCEARGDPVPTVRWR 257 

Qy 297 KDDGELPRSRYEIRDDHTLRIRKVTAGDMGSYTCVAENMVGRAEASATLTVQEPPHFWK 356 

Db 258 KDDGELPKSRYEIRDDHTLRIRRVTAGDMGSYTCVAENMVGRAEASATLTVQEPPHFWK 317 

Qy 357 PRDQWALGRTVTFQCEATGNPQPAIFWRREGSQNLLFSYQPPQSSSRFSVSQTGDLTIT 416 

Db 318 PRDQWALGRTVTFQCEATGNPQPAIFWRREGSQNLLFSYQPPQSSSRFSVSQTGDLTIT 377 

Qy 417 NVQRSDVGYYICQTLNVAGSIITRAYLEVTDVIADRPPPVIRQGPVNQTVAVDGTFVLSC 476 

Db 378 NVQRSDVGYYI.CQTLNVAGSIITRAYLEVTDVIADRPPPVIRQGPVNQTVAVDGTLILSC 437 

Qy 477 VATGSPVPTILWRRDGVLVSTQDSRIRQLENGVLQIRYARLGDTGRYTCIASTPSGEATW 536 

Db 438 VATGSPAPTILWRRDGVLVSTQDSRIRQLESGVLQIRYARLGDTGRYTCTASIPSGEATW 497 

Qy 537 SAYIEVQEFGVPVQPPRPTDPNLIPSAPSRPEVTDVSRNTVTLSWQPNLNSGATPTSYII 596 

Db 498 SAYIEVQEFGVPVQPPRPTDPNLIPSAPSRPEVTDVSKNTVTLSWQPNLNSGATPTSYII 557 
Qy 597 EAFSHASGSSWQTVAENVKTETSAIKGLKPNAIYLFLVRAANAYGISDPSQISDPVKTQD 656 

Db 558 EAFSHASGSSWQTAAENVKTETFAIKGLKPNAIYLFLVRAANAYGISDPSQISDPVKTOD 617 

Qy 657 VLPTSQGVDHKQVQRELGNAVLHLHNPTVLSSSSIEVHWTVDQQSQYIQGYKILYRPSGA 716 

Db 618 VPPTSQGVDHKQVQRELGNWLHLHNPTILSSSSVEVHWTVDQQSQYIQGYKILYRPSGA 677 

Qy 717 NHGESDWLVFEVRTPAKNSWIPDLRKGVNYEIKARPFFNEFQGADSEIKFARTLEEAPS 776 

Db 678 SHGESEWLVFEVRTPTKNSWIPDLRKGVNYEIKARPFFNEFQGADSEIKFAKTLEEAPS 737 

Qy 777 APPOGVTVSKNDGNGTAILVSWQPPPEDTQNGMVQEYKVWCLGNETRYHINKTVDGSTFS 836 

Db 738 APPRSVTVSKNDGNGTAILVTKQPPPEDTQNGMVQEYKWCLGNETKYHINM 797 

Qy 837 WIPFLVPGIRYSVEVAASTGAGSGVKSEPQFIQLDAHGNPVSPEDQVSLAQOISDWKQ 896 

Db 798 WIPSLVPGIRYSVEVAASTGAGPGVKSEPQFIQLDSHGNPVSPEDQVSLAQQISDWRQ 857 

Qy 897 PAFIAGIGMCWIILMVFSIWLYRHRKKRNGLTSTYAGIRKVPSFTFTPTVTYQRGGEAV 956 

Db 858 PAFIAGIGAACWIILMVFSIWLYRHRKKRNGLTSTYAGIRKVPSFTFTPTVTYQRGGEAV 917 

Qy 957 SSGGRPGLLNISEPAAQPWLADTWPNTGNNHNDCSISCCTAGNGNSDSNLTTYSRPADCI 1016 

Db 918 SSGGRPGLLNISEPATQPWLADTWPNTGNNHNDCSINCCTAGNGNSDSNLHYSRPADCI 977 

Qy 1017 ANYNNQLDNKQTNLMLPESTVYGDVDLSNKINEMKTFNSPNLKDGRFVNPSGOPTPYATT 1076 

Db 978 ANYNNQLDNKQTNLMLPESTVYGDVDLSNKINEMKTFNSPNLKDGRFVNPSGQPTPYATT 1037 

Qy 1077 QLIQSNLSNNMKNGSGDSGEKHWKPLGQQKQEVAPVQYNIVEQNKLNKDYRANDTVPPTI 1136 

Db 1038 QLIQANLSNNMNNGAGDSSEKHWKPPGQQKPEVAPIQYNIMEQNKLNKDYRANDTIPPTI 1097 

Qy 1137 PYNQSYDQNTGGSYNSSDRGSSTSGSQGHKKGARTPKVPKQGGMHWADLLPPPPAHPPPH 1196 

Db 1098 PYNQSYDQNTGGSYNSSDRGSSTSGSOGHKKG^ 1157 

Qy 1197 SNSEEYNISVDESYDQEMPCPVPPARMYLQQDELEEEEDERGPTPPVRGAASSPAAVSYS 1256 

Db 1158 SNSEEYNMSVDESYDQEMPCPVPPAPMYLQQDELQEEEDERGPTPPVRGAASSPAAVSYS 1217 

Qy 1257 HQSTATLIPSPQEELQPMLQDCPEETGHMQHQPDRRRQPVSPPPPPRPISPPHTYGYISG 1316 

Db 1218 HQSTATLTPSPQEELQPMLQDCPEDLGHMPHPPDRRRQPVSPPPPPRPISPPHTYGYISG 1277 

Qy 1317 PLVSDMDTDAPEEEEDEADMEVAKMQTRRLLLRGLEQTPASSVGDLESSVTGSMINGWGS 1376 

Db 1278 PLVSDMDTDAPEEEEDEADMEVAKMQTRRLLLRGLEQTPASSVGDLESSVTGSMINGWGS 1337 

Qy 1377 ASEEDNISSGRSSVSSSDGSFFTDADFAQAVAAAAEYAGLKVARRQMQDAAGRRHFHASQ 1436 

Db 1338 ASEEDNISSGRSSVSSSDGSFFTDADFAQAVAAAAEYAGLKVARRQMQDAAGRRHFHASQ 1397 

Qy 1437 CPRPTSPVSTDSNMSAAVMQKTRPAKKLKHQPGHLRREIYTDDLPPPPVPPPAIKSPTAQ 1496 

Db 1398 CPRPTSPVSTDSNMSAWIQKARPAKKQRHQPGHLRREAYADDLPPPPVPPPAIKSPTVQ 1457 

Qy 1497 SKTQLEVRPVWPKLPSMDARTDRSSDRKGSSYKGREVLDGRQWDMRTNPGDPREAQEQ 1556 

Db 1458 SKAQLEVRPVMVPKLASIEARTDRSSDRKGGSYKGREALW^ 1517 

Qy 1557 QNDGKGRGNKMKRDLPPMTHLIQEDILPYCRPTFPTSNNPRDPSSSSSMSSRGSGSRQ 1616 

Db 1518 PNDGKGRGTRQPKRDLPPAKTHLGQEDILPYCRPTFPTSNNPRDPSSSSSMSSRGSGSRQ 1577 

Qy 1617 REQANVGRRNIAEMQVLGGYERGEDNNEELEETES 1651 

Db 1578 REQANVGRRNMAEMQVLGGFERGDENNEELEETES 1612 



RESULT 3 
T14316 

rig-1 protein - mouse 

C; Species: Mus musculus (house mouse) 

C;Date: 20-Sep-1999 #sequence_revision 20-Sep-1999 #text_change 20-Sep-1999 
C;Accession: T14316 

R;Yuan, S.S.F.; Cox, L.A.; Dasika, G.K.; Lee, E.Y.H.P. 
submitted to the EMBL Data Library, April 1998 
A;Reference number: Z17975 
A;Accession: T14316 

A;Status: preliminary; translated from GB/EMBL/DDBJ 
A;Molecule type: mRNA 
A;Residues: 1-1344 <YUA> 

A;Cross-references: EMBL:AF060570; NID:g4206385; PID:g4206386; PIDN:AAD11628.1 

Query Match 29.9%; Score 2607.5; DB 2; Length 1344; 

Best Local Similarity 38.2%; Pred. No. 3.2e-115; 

Matches 594; Conservative 204; Mismatches 456; Indels 299; Gaps 31; 
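
The "Best Local Similarity" figure printed with each result appears to be the share of identical positions among all alignment columns. A minimal sketch, assuming columns = Matches + Conservative + Mismatches + Indels (inferred from the counts in this printout, not from program documentation; the function name is illustrative):

```python
def best_local_similarity(matches: int, conservative: int,
                          mismatches: int, indels: int) -> float:
    """Percentage of alignment columns that are exact matches."""
    columns = matches + conservative + mismatches + indels
    return 100.0 * matches / columns


# Counts printed for RESULT 3 (T14316).
print(round(best_local_similarity(594, 204, 456, 299), 1))
```

The same arithmetic applied to the counts printed for RESULT 4 (401, 164, 469, 164) gives the 33.5% shown there.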

Qy 58 GSRLRQEDFPPRIVEHPSDLIVSKGEPATLNCKAEGRPTPTIEWYKGGERVETDKDDPRS 117 

Db 32 GSRVGPEDAMPRIVEQPPDLWSRGEPATLPCRAEGRPRPNIEWYKNGARVATAREDPRA 91 

Qy 118 HRMLLPSGSL.FFLRIVHGRKSRPDEGVYVCVARNYLGEAVSHNASLEVAILRDDFRONPS 177 

Db 92 HRLLLPSGALFFPRIVHGRRSRPDEGVYTCVARNYLGAAASRNASLEVAVLRDDFRQSPG 151 

Qy 178 DVMVAVGEPAVMECQPPRGHPEPTISWKKDGSPLDDKDERITIRGGKLMITYTRKSDAGK 237 

Db 152 NVWAVGEPAVMECVPPKGHPEPLVTWKKGKIKLKEEEGRITIRGGKLMMSHTFKSDAGM 211 

Qy 238 YVCVGTNMVGERESEVAELTVLERPSFVKRPSNLAVTVDDSAEFKCEARGDPVPTVRWRK 297 

1:11 :N lllll III lllllll::ll III Ml :||| | : I 
Db 212 YMCVASNMAGERESGAAELWLERPSFLRRPINQWLADAPVNFLCEVQGDPQPNLHWRK 271 

Qy 298 DDGELPKSRYEIRDDHTLKIRKVTAGDMGSYTCVAENMVGKAEASATLTVQEPPHFWKP 357 

Illlll lllll 11:1 I M: I hlllllll MM :|:| II II II 
Db 272 DDGELPAGRYEIRSDHSLWIDQVSSEDEGIYTCVAENSVGRAEASGSLSVHVPPQFVTKP 331 

Qy 358 RDQWALGRTVTFQCEATGNPQPAIFWRREGSQNLLFSYQPPQSSSRFSVSQTGDLTITN 417 

"I II I hllll III IMII::IIII lllll I II I Ml 
Db 332 QNQTVAPGAKVSFQCETKGNPPPAIFWQKEGSQVLLFPSQSLQPMGRLLVSPRGQLNITE 391 

Qy 418 VQRSDVGYYICQTLNVAGSIITKAYLEVTDVIADRPPPVIRQGPVNQTVAVDGTFVLSCV 477 

I: I llOll ::|llll: II lh I Ihl III III: : ; I I 
Db 392 VKIGDGGYYVCQAVSVAGSILAKALLEIKGASIDGLPPIILQGPANQTLVLGSSVWLPCR 451 

Qy 478 ATGSPVPTILWRKDGVLVSTQDSRIKQLENGVLQIRYAKLGDTGRYTCIASTPSGEATWS 537 

Ul I IM:II : II: ::|l II : II hhl : |||||: 
Db 452 VIGNPQPNIQUKKDERWLQGDDSQFNLMDNGTLHIASIQEMDMGFYSCVAKSSIGEATWN 511 

Qy 538 AY IEVQE - FGVPVQPPRPTDPNLIPSAPSKPEVTDVSRNTVTLSWQPNLNSGATPTS Y I I 596 

::: II :'l I Ihl Ihl Ihh Mlhhll Ml MM 

Db 512 SWLRKQEDWG ■ ■ ASPGPAIGPSHPPGPPSQPIVTEVTANSITLTWKPNPQSGATATSYVI 569 

Qy 597 EAFSHASGSSWQTVAENVKTETSAIKGLKPNAIYLFLVRAANAYGISDPSQISDPVKTQD 656 

Mil I :l = : 1 : 1 1 1 : h II I Ihll llllllll MM: : |:|| : ||| 
Db 570 EAFSQAAGNTWRTVADGVQLETYTISGLQPNTIYLFLVRAVGAWGLSEPSPVSEPVQTQD 629 

Qy 657 VLPTSQGVDHSQVQRELGNAVLHLHNPTVLSSSSIEVHWTVDQQSQYIQGYKILYRPSGA 716 

: I-: Ill : : Mil :::| Mil I :||::: :| :| 
Db 630 SSLSRPAEDP'^KGQRGLAEVAVRMQEPTVLGPRTLQVSWTVDGPVQLVQGFRVSWRIAGL 689 

Qy 717 NHGESDWLVFSVRTPAKNSWIPDLRKGVNYEIKARPFFNEFQGADSEIRFAKTLEEAPS 776 

: I I : ::M II h I I :|| : I Ihl lllll 
Db 690 DQG-SWTMLDLQSPHKQSTVLRGLPPGAQIQIKVQVQGQEGLGAESPFVTRSIPEEAPS 747 

Qy 777 APPQGVTVSKNDGNGTAILVSWQPPPEDTQNGMVQEYKWCLGNETRYHINKTVDGSTFS 836 

lllll h ' ::: MM :||:: I ! : : 1 1 II 1 1 : 1 : ! : I : : I I 
Db 748 GPPQGVAVALGGDRNSSVTVSWEPPLPSQRNGVITEYQIWCLGNESRFHLNRSAAGWARS 807 
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Qy 837 WIPFLVPGIRYSVEVAASTGAGSGVKSEPQFIQL-DAHGNPVSPEDQVSLAQQISDW 894 

■ I hll I III:! II I! I I :U I II II:::: h 

Db 808 VTFSGLLPGQIYRALVAAATSAGVGVASAPVLVQLPFPPAAEP-GPEVSEGLAERLAKVL 866 

Qy 895 KQP AF IAG IG AACWI I LMVFS IWLY RHRKKRNGLTSTYAG I RKVPSFTFT PTVTYQRGGE 954 

;:|||:|l III :|: I III :|:| |: I il :|| h: 
Db 867 RKPAFLAGSSAACGALLLGFCAALYRRQKQRKELSHYTA SFAYTPAVSFPHSEG 920 

Qy 955 AVSSGGRP- -GLLNISEPAAQPWLADTWPNTGNNHN- -DCSISCCTAGNGNSDSNLTTYS 1010 

I II II III llllhll: : : : III : II 
Db 921 LSGSSSRPPMGir-GPAAYPWIADSWPHPPRSPSAQEPRGSCCPS— NPD 966 

Qy 1011 RPADCIANYNNQLDNKQTNLMLPESTVYGDVDLSNKINEMKTFNSPNLKDGRFVNPSGQP 1070 

|: I : :| : : I I ||: 

Db 967 PDDRYYNEAGISLYL AQTARGANASGEG 994 

Qy 1071 TPYATTQLIQSNLSNNMNNGSGDSGEKHWKPLGQQKQEVAPVQYNIVEQNKLNKDYRAND 1130 

hi |:|:: I 
Db 995 PVYSTID PVGEELQ 1008 

Qy 1131 TVPPTIPYNQSYDQNTGGSYNSSDRGSSTSGSQGHKKGARTPKVPRQGGMNWADLLPPPP 1190 
I I : I I :| I : lh:l :l I : ::| : lllll 
Db 1009 TFHGGFPQHSSGDPSTWSQYAPPEWSEGDSGARG-GQGKLLGKPVQMPSLSWPEAIPPPP 1067 

Qy 1191 AHPPPHSNSEEYNISVDESYDQEMPCPVPPARMYLQQDELEEEEDERGPTPPVR 1244 

I: I! I ::lh I III 
Db 1068 P SCELSCPEGP EEELKGSSDLEEWCPPVPEKSHLV 1102 



Qy 1245 GAASSPAAV SYSHQSTATLTPSPQEELQPMLQDCPEETGHMQHQPDR 1291 

|::|| I : II llllllllll : II I : |: I 
Db 1103 GSSSSGACMVAPAPRDTPSPTSSYGQQSTATLTPSPPDPPQP PTDIPHLHQMP" 1155 

Qy 1292 RRQPVSPPPPPRPISPPHTYGYISGPLVSDMDTDAPEEEEDEADMEVAKMQTRRLLLRGL 1351 

II I: I I 

Db 1156 RRVPLGPSSP 1165 

Qy 1352 EQTPASSVGDLESSVTGSMINGWGSASEEDNISSGRSSVSSSDGSFFTDADFAQAVAAAA 1411 

:| : ::M II 

Db 1166 LSVSQPALSSHDG 1178 

Qy 1412 EYAGLKVARRQMQDAAGR-RHFHASQCPRPTSPVSTDSNMSAAVMQKTRP AK 1462 

I : II :lll I h: I : I I I 

Db 1179 RPVGLGAGPVLSYHASPSPVPSTASSAPGRTRQVTGEMTPPLHGHRARIRR 1229 

Qy 1463 KLKHQPGHLRRETYTDDLPPPPVPPPAIKSPTAQSKTQLEVRPWVPKLPSMDARTDRSS 1522 

III III 111111:111 :: I III: II 

Db 1230 KPKALP- -YRREHSPGDLPPPPLPPPELRDKLALGSA- -GSRQHVFPR ARAQWGE 1280 

Qy 1523 DR-KGSSYKGREVLDGRQWDMRTNPGDPREAQEQQNDGKGRGNKAAKRDLPP 1574 

: II: : II :|:| hill :| : I 
Db 1281 ESGAGSASRG PTSSQRGPHPDGKESQ GRGRGLEACRSPNSP 1321 



RESULT 4 
T42405 

sax-3 protein - Caenorhabditis elegans 
C;Species: Caenorhabditis elegans 

C;Date: 03-Dec-1999 #sequence_revision 03-Dec-1999 #text_change 21-M-2000 
C;Accession: T42405 

R;Zallen, J.A.; Yi, B.A.; Bargmann, C.I. 
Cell 92, 217-227, 1998 
A;Title: The conserved immunoglobulin superfamily member SAX-3/Robo directs multiple asf 
A;Reference number: 222X60; MUID:98117250 
A;Accession: T42405 

A; Status: preliminary; translated from GB/EMBL/DDBJ 
A; Molecule type: mRNA 
A;Residues: 1-1273 <ZAL> 

A; Cross -references: EMBL:AF041053; NID:g2804779; PIDN:AAC38848.1; PID:g2804780 
C; Genetics: 
A; Note: sax-3 
C; Function: 

A; Description: sax-3 function is required at the time of axon guidance 
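
The annotation lines in each RESULT block follow a `letter;field: value` pattern (for example `C;Species:`, `A;Molecule type:`), though the scan has left stray spaces after some semicolons. A minimal sketch of a tolerant parser for such lines; the regular expression and function name are illustrative assumptions, not part of any PIR tool:

```python
import re

# One capital letter, ';', a field name up to the first ':', then the value.
FIELD = re.compile(r"^([A-Z])\s*;\s*([^:]+?)\s*:\s*(.*)$")


def parse_annotations(lines):
    """Return (tag, field, value) tuples for lines that match the pattern."""
    records = []
    for line in lines:
        m = FIELD.match(line.strip())
        if m:
            tag, name, value = m.groups()
            records.append((tag, name, value.strip()))
    return records


sample = [
    "A; Status: preliminary; translated from GB/EMBL/DDBJ",
    "A; Molecule type: mRNA",
    "A;Residues: 1-1273 <ZAL>",
    "C; Function:",
]
for record in parse_annotations(sample):
    print(record)
```

Lines that carry no tag, such as the journal citation under an `R;` author line, are skipped by this sketch and would need separate handling.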



Query Match 17.4%; Score 1515; DB 2; Length 1273; 

Best Local Similarity 33.5%; Pred. No. 8.2e-64; 

Matches 401; Conservative 164; Mismatches 469; Indels 164; Gaps 34; 

Qy 68 PRIVEHPSDLIVSRGEPATLNCKAEGRPTPTIEWYRGGERVETDRDDPRSHRMLLPSGSL 127 

I hill I : .: f I : I llllll h I I III h I hh Ml:: :||| 
Db 31 PVIIEHPIDVWSRGSPATLNCGAK-PSTAKITWYKDGQPVITNKEQVNSHRIVLDTGSL 89 

Qy 128 FFLRIVHGRKSR-PDEGVYVCVARNYLGEAVSHNASLEVAILRDDFRQNPSDVMVAVGEP 186 

I h: h : III III I II h lh:hlhlll I I II 
Db 90 FLLKVNSGKNGKDSDAGAYYCVASNEHGEVKSNEGSLKLAMLREDFRVRPRTVQALGGEM 149 

Qy 187 AVMECQPPRGHPEPTISWKKDGSPLDDKD-ERITIRG-GKLMITYTRKSDAGKYVCVGTN 244 

1:1 1111,111 :ll:ll I :| I |: I hi :lhl I II I 
Db 150 AVLECSPPRGFPEPWSWRKDDKELRIQDMPRYTLHSDGNLIIDPVDRSDSGTYQCVANN 209 

Qy 245 MVGERESEVAELTVLERPSFVKRPSNLAVTVDDSAEFKCEARGDPVPTVRWRKDDGELPK 304 

lllll I 11:1 hi I : I :: I I : I I III I : |:: : :| 
Db 210 MVGERVSNPARLSVFERPRFEQEPKDMTVDVGAAVLFDCRVTGDPQPQITWRRRNEPMPV 269 

Qy 305 SR-YEIRDDHTLKIRRVTAGDMGSYTCVAENMVGKAEASATLTVQEPPHFWKPRDQWA 363 

:| I :|:< hi :| I I I I I I I Nil I II II I II II I 
Db 270 TRAYIARDNRGLRIERVQPSDEGEYVCYARNPAGTLEASAHLRVQAPPSFQTKPADQSVP 329 

Qy 364 LGRTVTFQCEATGNPQPAIFWRREGSQNLLF-SYQPPQSSSRFSVSQTGDLTITNVQRSD 422 

I I 11:1 llllll :ll hill I- : I II II III I- I 
Db 330 AGGTATFECTLVGQPSPAYFWSKEGQQDLLFPSY--VSADGRTKVSPTGTLTIEEVRQVD 387 

Qy 423 VGYY ICQTLNVAGSI ITKAYLEVTDVIAD RPPPVIRQGPVNQTVAVDGTFVLSCV 477 

I hi :| III ::H hll MM I I llh I : :| I 

Db 388 EGAYVCAGMNSAGSSLSKAALKVTTKAVTGNTPAKPPPTIEHGHQNQTLMVGSSAILPCQ 447 

Qy 478 ATGSPVPTILWRKDGVLVSTQDSRIKQLENGVLQIRYAKLGDTGRYTCIASTPSGEATWS 537 

|:| II I i :lh : INI I III I III lllll Ihlll 
Db 448 ASGKPTPGISHLRDGLPIDITDSRISQHSTGSLHIADLKRPDTGVYTCIAKNEDGESTWS 507 

Qy 538 AYIEVQEFGVPVQPPRPTDPNLIPSAPSKPEVTDVSRNTVTLSWQPNLNSGATP-TSYII 596 

I : h: II II: 1 1 : 1 : : I : :|: I I I lllll III 

Db 508 ASLTVEDHTSNAQFVRMPDPSNFPSSPTQPIIVNVTDTEVELHWNAPSTSGAGPITGYII 567 

Qy 597 EAFSHASGSSWQTVAENVKTETSAIRGIRPNAIYLFLVRAANAYGISDPSQISDPVKT-- 654 

: :| I :i : : I : llllll: hh:ll I II II I I I 
Db 568 QYYSPDLGQTflFNIPDYVASTEYRIRGLKPSHSYMFVIRAENEKGIGTPSVSSALVTTSK 627 

Qy 655 ---QDVLPTSQGVDHKQVQREL-GNAVLHLHNPTVLSSSSIEVHWTVDQQSQYIQGYKIL 710 

II :| :: | :: | ::|::: : I : : I II I 
Db 628 PAAQVALSDKNKMDMAIAEKRLTSEQLIKLEEVKTINSTAVRLFWKKRKLEELIDGYYIK 687 

Qy 711 YR-PSGANHGESDWLVFEVRTPAKNSWIPDLRKGVNYEIKARPF— FNEFQGADSEIR 766 

:| I I : I :h : h :l III h : II I 
Db 688 WRGPPRTNDNQ — YVNVTSPSTENYVYSNLMPFTNYEFFVIPYHSGVHSIHGAPSNSM 743 

Qy 767 FAKTLEEAPSAPPQGVTVSRNDGNGTAILVSWQPPPEDTQNGMVQEYKVWCLGNETRYHI 826 

I I ll'H: I : II: :lh I I II::: ::: :| = 
Db 744 DVLTAEAPPSLPPEDVRIRML-NLTTLRISWRAPKADGINGILKGFQIVIVGQAPNNNR 801 

Qy 827 NKTVDGSTFSWIPFLVPGIRYSVEVAASTGAGSGVKSEPQFIQLDAHGNPVSPEDQVSL 886 

II: II : II I: I : III : I II :ll =1 :l 

Db 802 NITTNERAASVTLFHLVTGMTYKIRVAARSNGGVGV SHGTSEVIMNQDTL 851 

Qy 887 AQQISDWKQPAFIAGIGAACWI ILMVFSIWLYRHRKKRNGLTSTYAGIRKVP 939 

: :: : :|: |: : ||::| : : : II I I 
Db 852 ERHLAAQQENESFLYGLINKSHVPVIVIVAILIIFWIIIAYCYWRNSRNSD GKDR 907 

Qy 940 SFTFTPTVTYQRGGEAVSSGGRPGLLNISE-PAAQPWLADTWPNTGNNHNDCSISCCTAG 998 

II : . I ::| I :::: II I II I :: I 

Db 908 SF IKINDGSVHMASN— NLWDVAQNPNQNPMYNTAGRMTMNNRNGQALYSLTPN 959 

Qy 999 NGNSDSNLTTYS — RPADCIANYNNQLDNKQTNLMLPESTVYGDVDLSNRINEMRTFN 1054 

: :| II II : II II II II h :: 
Db 960 AQDFFNNCDDYSGTMHRPGSEHHYHYAQLTGGPGNAM---STFYG NQYHD 1006 
Qy 1055 SPNLKDGRFVNPSGQPTPYATTQLIQSNLSNNMNNGSGDSGEKHWKPLGQQKQEVAPVQY 1114 

hlllll I: II II | 
Db 1007 DPSPYATTTLVLSN QQ PAWL 1026 

Qy 1115 NIVEQNKLNKDYRANDTVPPTIPYNQSYDQNTGGSYNSSDEGSS TSGS 1162 

I I : IN 1 :l | : | | | |||| 
Db 1027 N- - -DKMLRAPAMPTNPVPPEPP- -ARYADHTAGRRSRSSRASDGRGTINGGLHHRTSGS 1081 

Qy 1163 Q GHKK — GARTPKVPKQGGMNWADLLPPPPAHPPP 1195 

I I I III I : I I :|lll::lll 

Db 1082 QRSDSPPHTDVSYVQLHSSDGTGSSKERTGERRTP--PNKTLM — DFIPPPPSNPPP 1134 



RESULT 5 
T29549 

hypothetical protein ZK377.3 - Caenorhabditis elegans 
C;Species: Caenorhabditis elegans 

C;Date: 15-Oct-1999 #sequence_revision 15-Oct-1999 #text_change 18-Feb-2000 
C;Accession: T29549 
R;Nhan, M.; Hawkins, J. 
submitted to the EMBL Data Library, February 1997 
A;Description: The sequence of C. elegans cosmid ZK377. 
A;Reference number: Z20639 
A;Accession: T29549 

A;Status: preliminary; translated from GB/EMBL/DDBJ 
A;Molecule type: DNA 
A;Residues: 1-423 <HHA> 

A;Cross-references: EMBL:088183; PIDN:AAB52658.1; GSPDB:GN00028; CESP:ZK377.3 

A;Experimental source: strain Bristol N2; clone ZK377 

C;Genetics: 

A;Gene: CESP:ZK377.3 

A;Map position: X 

A;Introns: 24/1; 142/3; 229/3; 284/2; 408/3 



Query Match 9.8%; Score 852; DB 2; Length 423; 
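
The "Query Match" percentage appears to be the hit's score divided by the run's "Perfect score" (8724 for this query), i.e. the score the query would get against itself. A minimal sketch under that assumption (the function name is illustrative):

```python
PERFECT_SCORE = 8724  # "Perfect score" from the run header for this query


def query_match_percent(score: float, perfect: float = PERFECT_SCORE) -> float:
    """Hit score as a percentage of the query's self-alignment score."""
    return 100.0 * score / perfect


# Scores printed for RESULT 5, RESULT 3 and RESULT 1.
for score in (852, 2607.5, 8315):
    print(score, round(query_match_percent(score), 1))
```

Rounded to one decimal, these reproduce the 9.8%, 29.9% and 95.3% printed for the corresponding results.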

Best Local Similarity 46.1%; Pred. No. 3e-33; 

Matches 177; Conservative 55; Mismatches 144; Indels 8; Gaps 

Qy 68 PRIVEHPSDLIVSKGEPATLNCKAEGRPTPTIEWYRGGERVETDRDDPRSHRMLLPSGSL 127 

I hill h:||:| llllll h I I III h I hh lll::l :||| 
Db 30 PVIIEHPIDVVVSRGSPATLNCGAK-PSTAKITWYKDGQPVITNKEQVNSHRIVLDTGSL 88 

Qy 128 FFLRIVHGRKSR-PDEGVYVCVARNYLGEAVSHNASLEVAILRDDFRQNPSDVMVAVGEP 186 
I lh: h : II I III I II I: ll::|:||:||| I I II 

Db 89 FLLKVNSGKNGKDSDAGAYYCVASNEHGEVRSNEGSLRLAMLREDFRVRPRTVQALGGEM 148 

Qy 187 AVMECQPPRGHPEPTISWKKDGSPLDDKD-ERITIRG-GKLMITYTRKSDAGKYVCVGTN 244 
W Ihll Mil fll :||:M I :| I I: I hi : ! 1 : 1 I II I 

Db 149 AVLECSPPRGFPEPWSWRRDDRELRIQDMPRYTLHSDGNLIIDPVDRSDSGTYQCVANN 208 

Qy 245 MVGERESEVAELTVLERPSFVKRPSNLAVTVDDSAEFKCEARGDPVPTVRWRKDDGELPK 304 

Mill I I hi hi I : I I I : I I III I : |:: : :| 
Db 209 KVGERVSNPARLSVFEKPKFEQEPKDMTVDVGAAVLFDCRVTGDPQPQITWKRKNEPMPV 268 

Qy 305 SR-YEIRDDHTLKIRKVTAGDMGSYICVAENMVGKAEASATLTVQEPPHFWKPRDQWA 363 

:| I :!: hi :| I I I I I I I III! I II II I I I 
Db 269 TRAYIAKDNRGLRIERVQPSDEGEYVCYARNPAGTLEASAHLRVQAPPSFQTRPADQSVP 328 

Qy 364 LGRTVTFQCEATGNPQPAIFWRREGSQNLLF-SYQPPQSSSRFSVSQTGDLTITNVQRSD 422 

I I Ihl I I II II :ll hill II : I II II III h: I 
Db 329 AGGTATFECTLVGQPSPAYFWSKEGQQDLLFPSY-VSADGRTKVSPTGTLTIEEVRQVD 386 

Qy 423 VGYYICQTLNVAGSIITKAYLEVT 446 

I hi :| III h I 
Db 387 EGAYVCAGMNSAGSSLSKAALKAT 410 



RESULT 6 
S05479 

neural cell adhesion molecule L1 precursor - mouse 
C;Species: Mus musculus (house mouse) 

C;Date: 10-Sep-1999 #sequence_revision 10-Sep-1999 #text_change 10-Sep-1999 

C;Accession: S05479; B60850; S22167 

R;Moos, M.; Tacke, R.; Scherer, H.; Teplow, D.; Frueh, K.; Schachner, M. 
Nature 334, 701-703, 1988 

A;Title: Neural adhesion molecule L1 as a member of the immunoglobulin superfamily wi 
A;Reference number: S05479; MUID:88318924 
A;Accession: S05479 
A;Molecule type: mRNA 
A;Residues: 1-1260 <MOO> 

A;Cross-references: EMBL:X12875; NID:g53336; PIDN:CAA31368.1; PID:g53337 
A;Note: the authors translated the codon CCT for residue 166 as Leu, ACT for r^i.,. 
A;Note: part of this sequence, including the amino end of the mature protein, -kz - 
R;Rathjen, F.G.; Wolff, J.M.; Frank, R.; Bonhoeffer, F. ; Rutishauser, U. 
J. Cell Biol. 104, 343-353, 1987 

A;Title: Membrane glycoproteins involved in neurite fasciculation. 
A;Reference number: A60850; MUID:87109457 
A;Accession: B60850 
A;Molecule type: protein 
A;Residues: 20-28, 'XX', 31-36 <RAT> 

R;Kohl, A.; Giese, R.P.; Mohajeri, M.H.; Montag, D.; Moos, M.; Schachner, M. 
submitted to the EMBL Data Library, December 1991 

A;Description: Analysis of promoter activity and 5' genomic structure of the neural c 
A;Reference number: S22167 
A;Accession: S22167 
A;Molecule type: DNA 

A;Residues: 1-165,'L', 167-189,'E',191-281,'S',283-395,'S', 397-514, 'APEKNPVDV, 524, Gl 
A;Cross-references: EMBL:X63511 
C; Genetics: 

A;Introns: 26/1; 31/1; 66/2; 133/1; 174/1; 231/1; 268/2; 330/1; 374/1; 422/1; 4S:/2 
A; Note: the list of introns may be incomplete 

C;Superfamily: neural cell adhesion molecule L1; fibronectin type III repeat homology 

C;Keywords: alternative splicing; cell adhesion; duplication; glycoprotein; transmemb 

F;1-19/Domain: signal sequence #status predicted <SIG> 

F;20-1260/Product: neural cell adhesion molecule #status experimental <MAT> 

F;256-313/Domain: immunoglobulin homology <IMM1> 

F;440-498/Domain: immunoglobulin homology <IMM2> 

F;531-592/Domain: immunoglobulin homology <IMM3> 



Query Match 8.8%; Score 764.5; DB 1; Length 1260; 

Best Local Similarity 23.3%; Pred. No. 1.8e-28; 

Matches 307; Conservative 177; Mismatches 510; Indels 325; Gaps 51; 

Qy 56 YTGSRLRQEDFPPRIVEH-PSDLIVSKGEPATLNCKAEGRPTPTIEWYKGGERVETDK-- 112 

I I : : II I I I hi : :| 1:1 III ill:: 
Db 26 YKGHHVLE— PPVITEQSPRRLWFPIDDISLKCEARGRPQVEFRWTKDGIHFKPKEEL 82 

Qy 113 DDPRSHRMLLPSGSLFFLRIVHGRKSRPDEGVYVCVARNYLGEAVSHNASLEVAI 167 

: I I' • : : I I :|:| I I Ml |:|| ' |: : 

Db 83 GVWHEAPYSGSFTIEGNNSFAQRF QG I YRCY ASNKLGTAMSH — E IQL 129 

Qy 168 LRDDFRQNPSD- • - -VMVAVGEPAVMECQPPRGHPEPTISWKKDGSPLDD- -KDERITI ■ 220 

: : : I :". I I II h I II III I : I :|||::: 
Db 130 VAEGAPKWPKETVKPVEVEEGESWLPCNPPPSAAPPRIYWM--NSKIFDIKQDERVSMG 187 

Qy 221 RGGKLMITYTRKSD-AGKYVCVGTNMVGER--ESEVAELTVLERPSFVKRPSNLAVTVD 276 

: I ! II hi : I I : I :| I I : I I : 
Db 188 QNGDLYFANVLTSDNHSDYIC-NAHFPGTRTIIQKEPIDLRVKPTNSMIDRKPRLLFPTN 246 

Qy 277 DSAE • -FKCEARGDPVPTVRWRKDDGELPKSRYEIRDDH- -TLKIRKVTAGD 324 

h :| I I I lh:! I I I :l ||:: I I 

Db 247 SSSRLVALQGQSLILECIAEGFPTPTIKWLHPSDPMPTDRV-IYQNHNKTLQLLNVGEED 305 

Qy 325 MGSYTCVAENMVGKAEASATLTVQEPPHFWRPRDQWALGRTVTFQCEATGNPQPAIFW 384 

.1 llhlll i\ I : :lh h:: lh : II hi III I I 
Db 306 DGEYTCLAENSLGSARHAYYVTVEAAPYWLQKPQSHLYGPGETARLDCQVQGRPQPEITW 365 

Qy 385 RREGSQNLLFSYQPPQSSSRFSVSQTGDLTITNVQRSDVGYYICQTLNVAGSIITRAYLE 444 

I I f.: :: : I I I ::||| :| h I I :: lh 
Db 366 RING M3METVMDQKYRIEQ-GSLILSNVQPTDTMVTQCEARNQHGLLLANAYIY 419 

Qy 445 VTDVIADRPPPVIRQGPVNQT -VAVDG -TFVLSCVATGSPVPTILWRKDGVLVSTQDSRI 502 

I : I :: : III :lhl I I I I hllh: I : III 
Db 420 WQL — PARILTRD--NQTYMAVEGSTAYLLCRAFGAPVPSVQWLDEEGTTVLQDERF 473 
[Alignment rows Qy 503 to 1096 and Db 474 to 1196: column layout lost in extraction. Only the rows below are recoverable; gap runs are approximate.]

Db 534 VTFTCQASPDPSLQASITWRGDGRDLQERGDSDKYFIEDGRLVIQSLDYSDQGNYSCVAS 593

Qy 560 --IPSAPSKPEVTD--VSRNTVTLSWQPNLNSGATPTSYIIE-AF 599

Qy 780 --QGVTVSKNDGNGTAILVSWQPPPEDTQNGMVQEYKV--WCLGNETRY--HINKT-- 829

Qy 830 VDGSTFSWIPFLVPGIRYSVEVAASTGAGSGVKSE--

RESULT 7
A41060 

neural cell adhesion molecule L1 precursor - human
N;Alternate names: L1CAM
C;Species: Homo sapiens (man)

C;Date: 10-Sep-1999 #sequence_revision 10-Sep-1999 #text_change 21-Jul-2000
C;Accession: A41060; S18454; A35331; S21971; S21972; A60223; A31072; G02506
R;Hlavin, M.L.; Lemmon, V.
Genomics 11, 416-423, 1991
A;Title: Molecular structure and functional testing of human L1CAM: an interspecies comp
A;Reference number: A41060; MUID: 92120663
A;Accession: A41060
A;Molecule type: mRNA
A;Residues: 1-1257 <HLA>

A;Cross-references: GB:M64296; NID:g186053; PIDN:AAC14352.1; PID:g3068548



R;Kobayashi, M.; Miura, M.; Asou, H.; Uyemura, K.
Biochim. Biophys. Acta 1090, 238-240, 1991

A;Title: Molecular cloning of cell adhesion molecule L1 from human nervous tissue: a
A;Reference number: S18454; MUID: 92031698
A;Accession: S18454
A;Molecule type: mRNA

A;Residues: 1-3,'V',5-215,'I',217-249,'T',251-275,'SV',278-356,'E',358-625,'V',627-12

A;Cross-references: EMBL:X59847; NID:g35009; PIDN:CAA42508.1; PID:g35010

A;Note: the authors translated the codon GAA for residue 27 as Gly

R;Djabali, M.; Mattei, M.G.; Nguyen, C.; Roux, D.; Demengeot, J.; Denizot, F.; Moos,

Genomics 7, 587-593, 1990

A;Title: The gene encoding L1, a neural adhesion molecule of the immunoglobulin famil

A;Reference number: A35331; MUID:90353957

A;Accession: A35331

A;Molecule type: DNA

A;Residues: 332-371 <DJA>

A;Cross-references: GB:M55271

R;Rosenthal, A.; MacKinnon, R.N.; Jones, D.S.C.

Nucleic Acids Res. 19, 5395-5401, 1991

A;Title: PCR walking from microdissection clone M54 identifies three exons from
A;Reference number: S21971; MUID:92020233
A;Accession: S21971
A;Molecule type: DNA
A;Residues: 1082-1176 <ROS>

A;Cross-references: EMBL:X58775; NID:g29642; PIDN:CAA41576.1; PID:g29643
A;Accession: S21972

A;Status: nucleic acid sequence not shown; translation not shown

A;Molecule type: mRNA

A;Residues: 353-935,'V',937-1176 <RO2>

A;Cross-references: EMBL:X58776; NID:g29644; PIDN:CAB37831.1; PID:g4467833
R;Harper, J.R.; Prince, J.T.; Healy, P.A.; Stuart, J.K.; Nauman, S.J.; Stallcup, W.B.
J. Neurochem. 56, 797-804, 1991

A;Title: Isolation and sequence of partial cDNA clones of human L1: homology of human
A;Reference number: A60223; MUID:91132183
A;Accession: A60223

A;Status: not compared with conceptual translation
A;Molecule type: mRNA

A;Residues: 1030-1115,'WLC',1118-1176,1181-1257 <HAR>

R;Wolff, J.M.; Frank, R.; Mujoo, K.; Spiro, R.C.; Reisfeld, R.A.; Rathjen, F.G.

J. Biol. Chem. 263, 11943-11947, 1988

A;Title: A human brain glycoprotein related to the mouse cell adhesion molecule L1.

A;Reference number: A31072; MUID: 88298876

A;Accession: A31072

A;Molecule type: protein

A;Residues: 'Q',21-36 <WOL>

R;Platzer, M.; Bauer, D.; Drescher, B.

submitted to the EMBL Data Library, March 1995

A;Reference number: H01368

A;Accession: G02506

A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: DNA
A;Residues: 1-1257 <PLA>

A;Cross-references: EMBL:U52112; NID:g1302657; PIDN:AAC51746.1; PID:g1302658
C;Genetics:

A;Gene: GDB:L1CAM

A;Cross-references: GDB:120133; OMIM:303350; OMIM:308840
A;Map position: Xq28-Xq28

A;Introns: 26/1; 31/1; 66/2; 134/1; 175/1; 232/1; 269/2; 331/1; 375/1; 423/1; 460/2;

C;Superfamily: neural cell adhesion molecule L1; fibronectin type III repeat homology

C;Keywords: alternative splicing; cell adhesion; duplication; glycoprotein; transmemb

F;1-19/Domain: signal sequence #status predicted <SIG>

F;20-1257/Product: neural cell adhesion molecule L1 #status predicted <MAT>

F;257-314/Domain: immunoglobulin homology <IMM1>

F;532-593/Domain: immunoglobulin homology <IMM2>



Query Match 8.7%; Score 761; DB 1; Length 1257;

Best Local Similarity 24.9%; Pred. No. 2.6e-28;

Matches 283; Conservative 159; Mismatches 422; Indels 274; Gaps 45; 



Qy 
Db 11 LLLCSP CLLIQIPEEYEGHHVMEPPVIT EQSPRRLWF 48

Qy 74 PSDLIVSKGEPATLNCKAEGRPTPT IEWTKGGERVETDKD DPRSHRMLLPSGS 126
hi I :| 1:1 1:1 I : I : :: II:: 

Db 49 PTDDI SLKCEASGKPEVQFRWTRDGVHFKPKEEIOTVYQSPHSGSFTITGNN 101

Qy 127 LFFLRIVHGRKSRPDEGVYVCVARNYLGEAVSHNASLEVAILRDDFRQNPSD VMVA 182

I :: :|:l Mill hll |: :: : : I : II 
Db 102 SNF AQRFQGIYRCFASNKLGTAMSH EIRLMAEGAPKWPKETVKPVEVE 149

Qy 183 VGEPAVMECQPPRGHPEPT ISWKKDGSPIDDKDERIT I - RGGKLMITYTRKSD - AGKY VC 240
II I: I II II :MI:|: : I I I |:| 

Db 150 EGESWLPCNPPPSAEPLRIYWMNSKILHIKQDERVTMGQNGNLYFANVLTSDNHSDYIC 209

Qy 241 --VGTNMVGERESEVAELTVLERPSFVKRPSNLAVTVDDSAE FKCEAR 286

II : ft\ :l I I : I I : |: :| | 

Db 210 HAHFPGTRTIIQKEP-- IDLRVKATNSMIDRKPRLLFPTNSSSHLVALQGQPLVLECIAE 267

Qy 287 GDPVPTVRWRKDDGELPKSRYEIRD-DHTLKIRKVTAGDMGSYTCVAENMVGKAEASATL 345

I I IIM i hi I :: : lh: II III |:||| :| I : : 
Db 268 GFPIPTIKWLRPSGPMPADRVTYQNHNKTLQLLKVGEEDDGEYRCLAENSLGSARHAYYV 327

Qy 346 TVQEPPHFWKPRDQWALGRTVTFQCEATGNPQPAIFWRREGSQNLLFSYQPPQSSSRF 405

II: I::: lh : II |: I III : || I : :: 

Db 328 TVEAAPYWLHKPQSHLYGPGETARLDCQVQGRPQPEVTWRING IPVEELAKDQRY 382

Qy 406 SVSQTGDLTITNVQRSDVGYYICQTLNVAGSIIIKAYLEVTDVIADRPPPVIRQGPVNQT 465

: I I I ::MI II hll :: lh I : I :: III 
Db 383 RI-QRGALILSNVQPSDTMVTQCEARNRHGLLLANAYIYWQL--PAKILTAD--NQT 435

Qy 466 -VAVDG-TFVLSCVATGSPVPTILW-RKDGVLVSTQDSRIKQLENGVLQIRYAKLGDTGR 522
:H I I Ml 1 : 1 1 1 : : I :|| I II I till: II 
Db 436 YHAVQGSTAYLLCKAFGAPVPSVQWLDEDGTTV-LQDERFFPYANGTLGIRDLQANDTGR 494

Qy 523 YICIASIPSGEATWSAYIEVQEFGVPVQPPRPT DPHLIPS -- 562

I hh I I ::h: I II I lh II 

Db 495 YFCLAANDQNNVTIMANLKVKDATQITQGPRSTIEKKGSRVTFTCQASFDPSLOPSITWR 554

Qy 563 APSKPEVTD 571

I :: :| : 

Db 555 GDGRDLQELGDSDKYFIEDGRLVIHSLDYSDQGNYSCVASTELDWESRAQLLWGSPGP 614

Qy 572 -- VSRNTVTLSWQPNLNSGATPTSYIIE-AFSHASGSSWQTVAENVKTETS 619

:::: Mil : I I II : I :: : :|| 
Db 615 VPRLVLSDLHLLIQSQVRVSWSPAEDHNAPIEKYDIEFEDREMAPEKWYSLGKVPGKQTS 674

Qy 620 AIKGLKPNAIYLFLVRMNAYGISDPSQISDPVKTQDVLPTSQGVDHKQVQRELGNAVLH 679

II I I I II II :M :h I I : I III I |; 
Db 675 TTLKLSPYVHYTFRVTAINKYGPGEPSPVSETWTPEAAPEKNPVDVKGEGNETTNMVI 733

Qy 680 LHNPTVLSSSSIEVHWTVDQQSQYIQGYKILYRPSGANHGESDWLVFEVRTPAKNSWIP 739
: : I :| : :| I:: :|| I I II :|: 

Db 734 T WKPLRW -MDWNAPQVQ - YRVQWRPQGT - "RGPWQEQIVSDP- *FLWS 776

Qy 740 DLRKGVNYEIKARPFFNEFQGADSEIKFAKTLEEAPSAPP--QGVTVSKNDGNGTAILVS 797

I llll : :: :| : :: : |: I I I :|: : I :|:|| 
Db 777 NTSTFVPYEIKVQAVNSQGKGPEPQVTIGYSGEDYPQAIPELEGIEIL NSSAVLVK 832



Qy 798 WQPPPEDTQNGMVQEYRV--WCLGNETRY" -HINK- "TVDGSTFSWIPFLVPGIRYS 849 

hi I :: I I I h: :: Ihl I :| lh: I | | 
Db 833 WRPVDLAQVKGHLRGYNVTYWREGSQRKHSKRHIHKDHWVPANTTSVILSGLRPYSSYH 892 



Qy 850 VEVAASTGAGSGVKSE PQFIQLDAHGN PVS 879

:|| I I III II I: : I: I hi 

Db 893 LEVQAFNGRGSGPASEFTFSTPEGVPGHPEALHLECQSNTSLLLRWQPPLSHNGVLTGYV 952

Qy 880 -- PEDQVSLAQQISDWKQPAFIAGIGAACWIILMVFSIWLYRHRKKRNGLTSTYAGI 935

I h hi :: I : : || : 

Db 953 LSYHPLDEGGKG -QLSFNLRDPEL RTHNLTDLSPHL 987



Qy 936 RKVPSFTFTPTVTYQRG-GEA-VSSGGRPGLL NI SEPAAQPWLADTW - PNTG 984

I : I 1:11111111 III I : : :| M 
Db 988 R — YRFQLQATTKEGPGEAIVREGGTMALSGISDFGNISATAGENYSWSWVPKEG 1041 



RESULT 8 
I50600

neogenin - chicken (fragment)
C;Species: Gallus gallus (chicken)

C;Date: 13-Sep-1996 #sequence_revision 13-Sep-1996 #text_change 21-M-2000
C;Accession: I50600

R;Vielmetter, J.; Kayyem, J.F.; Roman, J.M.; Dreyer, W.J.
J. Cell Biol. 127, 2009-2020, 1994

A;Title: Neogenin, an avian cell surface protein expressed during terminal neuronal
A;Reference number: A55193; MUID:95105243
A;Accession: I50600

A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: mRNA
A;Residues: 1-1443 <VIE>

A;Cross-references: EMBL:U07644; NID:g641965; PIDN:AAC59662.1; PID:g641966



Query Match 8.7%; Score 757.5; DB 2; Length 1443;

Best Local Similarity 23.2%; Pred. No. 4.5e-28;

Matches 378; Conservative 207; Mismatches 681; Indels 361; Gaps 64; 

Qy 57 TGSRLRQEDFPP-RIVEHPSDLIVSKGEPATLNCKAEGRPTPTIEWYKGGERVETDKDDP 115 

III :| M : I h: :| :|| : I III I I : II 
Db 9 TGSWR--TFTPFYFLVEPMDILSVRGASVIMNCSSYCETPPKIEWKKDGTLLNLVSDD- 65 

Qy 116 RSHRMLLPSGSLFFLRIVHGRKSRPDEGVYVCVAR-NYLGEAVSHNASLEVAILRDDFRQ 174 

I III III :ll : -Mil I III llll I I II I I 
Db 66 "RRQLLPDGSLLINSWHSKHNKPDEGYYQCVATVESLGSIVSRTAKLTVAGL-PRFTS 122 

Qy 175 NPSDVMVAVGEPAVMECQPPRGHPEPTISWKKDGSPLDDKDERITIRGGKLMITYTRKSD 234 

I I' !■ h: h I : h:| II I : I |:| :| 

Db 123 QPELSSVYKGNSAILNCE-VNVDLAPFVRWEQDRQPLSLDDRVFKLPSGALLIGNATDTD 181 

Qy 235 AGKYVCVGTNMVGERESEVAELTVLERPS FVKRPSNLAVTVDDSAEFKCEARGD 288 

Mil: Ml III :| I ll::lhl :| I II I 
Db 182 GGFYRCVIES3GTPKYSEEAELKILPDPEEPQSLVFVRQPSSLTKVTGQNAVFPCVAGGF 241 

Qy 289 PVPTVRWRKDDGEL—PKSRYEIRDDHTLKIRKVTAGDMGSYTCVAENMVGKAEASATL 345 

I I III h II h :| :| I II hhllhhl II I I 
Db 242 PTPYVRWTKN3EELITEDSERFALRAGGSLLISDVTEEDVGTYTCIADNENETIEAQAEL 301 

Qy 346 TVQEPPHFWKPRDQWALGRTVTFQCEATGNPQPAIFWRREGSQNLLFSYQPPQSSSRF 405 

II II hM : : hll II II : I : I : I I 

Db 302 AVQVPPEFLKRPANIYAHESMDIVFECEVTGKPTPTVKWVKNGDWIPSDY F 353 

Qy 406 SVSQTGDLT ITNVQRSDVGY Y ICQTLNYAGS I ITKAYLEVTDVI ADRP • • PPVIRQGPVN 463 

: : :l : :|| hi I I h II : |: III | 
Db 354 KIVKEHSLQVLGLVKSDEGFYQCIAENDVGNAQAGAQLIILDLDVAIPTLPPTSLTSATN 413 

Qy 464 QTVAVDGTFVLSCVATGSPVPTILMRKDGVLVSTQDSRIKQLENGVLQIRYAKLGDTGRY 523 

:| , II hll llll h :| 
Db 414 DHLA PATTGPLPTAPRDWATLVST RFIRL 443 

Qy 524 TC I AST PSGEATWSAY IEVQEFGVPVQPPRPTDPNLI PS APSKPEVTDVSRNTVTLSWQP 583 

■] II II hill I ::| I : :| 

Db 444 "TWR TPVSDPQ- - GDNLTYS IFYTKE - -GINRERVENTSRP 479 

Qy 584 NLNSGATPTSYIIEAFSHASGSSWQTVAENVKTETSA-IKGLKPNAIYLFLVRAANAYGI 642

II IMI :|:| II I :| 
Db 480 G - ETQVMIQNLMPETVYVFRWAQNKHGH 507 

Qy 643 SDPSQISDPVKTQDVLPTSQGVDHKQVQRELGNAVLHLHNPTVLSSSSIEVHW-TVDQQS 701 

: I hi Mil: I : :ll IM I I : 
Db 508 GESSA— PLK — VATQPEV- -QLPGPAPNIRAYAGSPT SVTVTWETPLSGN 552 

Qy 702 QYIQGYKILYSPSGANHGESDWLVFEVRTPAKNSWIPDLRKGVNYEIKARPFFNEFQGA 761

fc, '* i - II MM I : I I I I I I hi I :: I 

Db 553 GEIQNYKLYYMEKGQD- SEQDVDV AGLSYT ITGLKKYT EY SFRWAYNKHGPGV 605 

Qy 762 DSEIKFAKTLSEAPSAPPQGVTVSKNDGNGTAILVSWQPPPEDTQNGMVQEYKVWCLGNE 821 

" :|| : III II M I :h: Mill I :| : lh 
Db 606 STQDVWRTLSDVPSAAPQNLTLEAR--NSRSIMLHWQPPPAGTHSGQITGYKIRYRKVS 663

Qy 822 TRYHINKTVDGSTFSWIPFLVPGIRYSVEVMSTGAGSG VKSEPQFIQLD — 872 

: : ::| |: :| I I I: :|| I hi I :| II 
Db 664 RKSDVTESVGGTQLFQLIEGLERGTEYNFRIAAMTVTIGTGPATDWVSAETFESDLDESRV 723 

Qy 873 AHGNPV SPEDQVSLAQQISDWKQPAFIAGIGAACWIILMVFSIW 917 

! I: 11:1 : II: I 111= : I 
Db 724 PEVPSSLHVRPLVTSIWSWTPPENQ NIWRGYAIGYGIGSPHAQTIKVD— 773 

Qy 918 LYRHRKKRNGLTSTYAGIRKV-PSFTFTPTV-TYQRGGEAV— -SSGGRP GL 964 

1:1 I I : II : |: II : h II I 

Db 774 -YKQR YYTIEHLDPSSHYVITLKAFNNVGEGIPLYESAVTRPHSDTSEVDL 823 

Qy 965 LNISEP ' AAQPWLADTWPNTG NNHNDCSISCCTAGNGNSDSNLTTYSRPADCIAN 1018 

I: I I : ' I I :|: h :|::| : I I 

Db 824 FVINAPYTPVPDPSPMMPPVGVQASILSHDTIRITW ADNSLPKNQKITD - - AR 874 

Qy 1019 YNNQLDNKQTNLMLPESTVYGDVDLSN KINEMRIFNSPNLKDGRFVNPSGQP 1070

I :M^I :| I : : I I : h II:: 

Db 875 YYTV-RWKTN-IPANTKYKTANATTLSYLVTGLKPNTLYEFSVMVTKGRR-SSTWSM 928

Qy 1071 TPYATT-QLIQSNLSNNMNNGSGDSGEK--HWKPLGQQKQEVAP-VQY--NIVEQNKLNK 1124
r | : || :|: :: :: | : : :|:| : :: : I : I 

Db 929 TAHGTTFELVPTSPPKDVTWSKEGKPRTIIVNWQPPSEANGKITGYIIYYSTDVNAEIH 988 

Qy 1125 DYRANDTVPPTIPYN-QSYDQNTGGSYNSSDRGSSTSGSQGHKKGARTPKVPKQGGMNWA 1183 

I: I : : I :| : I I I |||| : 
Db 989 DWVIEPWGNRLTHQIQELTLDTPYYFRIQARNSRGMGPMSEAVQFRTPRAES S 1042 

Qy 1184 DLLPPP PAHPPPHSNSEEYNISVDESYDQEMPCP 1217 

I :| I : II I I : I II 

Db 1043 DKMPNDQASGSAGKGSRPVDVGPDYKPPLSGSNSPHGSPTSPLDSNMLLVIIVSVGVITI 1102 

Qy 1218 --VPPARMYLQQDELEEEEDERGPIPPVRGAASSPAAVSYSHQSTATLTPSPQEELQPML 1275 

I II |: :: : I : 

Db 1103 VIWIVAVFCTRRTTSHQRRKRAACRSVNG SHRYRGNSRDVTPPDLWI — 1150 

Qy 1276 QDCPEETGHMQHQPDRRRQPVSPPPPPRPI— SPPHTYGYISGPLVSDMDTDAPE— ■ 1328 

I :|: I I II :| I: : II- : 

Db 1151 HHERLELKPIDRSPDPNPIMTDTPIPRNSQDITPVDNSMDSNIHQRRNS 1199 



Qy 1329 EEEDEADMEVAKMQTRRLLLRGLEQTPASSV GDLES-- 1364 

III : I :: : I I III 
Db 1200 YRGHESEDSMSTLAGRRGMRPRMMMPFDSQPPQPVISAHPIHSLDNPHHHFHSGSLASPT 1259 

Qy 1365 -SVTGSMINGW- 'GsksEEDNISSGRSSVSSSDGSFFTDADFAQAVAAAAE- - -YAGLKV 1418 

I :: I |:|: : :: II :: I I :| I : :l : 

Db 1260 RSYLHHQVSPWPVGISMSHSDRANSTESVRNTPSSDTMPASSSQPCADHQDPDSSSGAYL 1319 

Qy 1419 ARRQMQDAAGRRHFHASQCPRPTSPVSTDSNMSAAVMQKTRPAKKLKHQPGHLRRETYTD 1478
W I MM :: I :: : I I I 

Db 1320 GSAQEEDAA QSLPTAHVRPSHPLKSFAVPAVPAAGSAYDP 1359 

Qy 1479 DLPPPPV PPPAIKSPTAQSKTQLEVR- - -PVWPKLPSMDARTDRSSDRKGS 1527 

II I: I ::| II I I Hill I : II : I 

Db 1360 TLPSTPLLTQQAPSHPVHSVK--TASIGTLGRTRPPMPVWPSAPDVQ-ETTRMLEDSES 1416 

Qy 1528 SYKGREV 1534 

II: I: 
Db 1417 SYEPDEL 1423 



RESULT 9 
S36126 

neural cell adhesion molecule L1 - rat
N;Alternate names: nerve growth factor-inducible large external glycoprotein; NILE glyco
C;Species: Rattus norvegicus (Norway rat)

C;Date: 13-Jan-1995 #sequence_revision 13-Jan-1995 #text_change 20-Aug-1999
C;Accession: S36126; S17655; A60917; A30326
R;Miura, M.; Kobayashi, M.; Asou, H.; Uyemura, K.
FEBS Lett. 289, 91-95, 1991
A;Title: Molecular cloning of cDNA encoding the rat neural cell adhesion molecule L1. T



A;Reference number: S17655; MUID: 91372414

A;Accession: S36126

A;Status: preliminary

A;Molecule type: mRNA

A;Residues: 1-1259 <MIU>

A;Cross-references: EMBL:X59149

A;Accession: S17655

A;Status: preliminary

A;Molecule type: mRNA

A;Residues: 1-1178,1183-1259 <MI2>

A;Cross-references: EMBL:X59149; NID:g56740; PIDN:CAA41860.1; PID:g56741
R;Prince, J.T.; Milona, N.; Stallcup, W.B.
J. Neurosci. 9, 1825-1834, 1989

A;Title: Characterization of a partial cDNA clone for the NILE glycoprotein and ident
A;Reference number: A60917; MUID: 89257627
A;Accession: A60917

A;Status: not compared with conceptual translation
A;Molecule type: mRNA

A;Residues: U59-1199,'G',1201-1235,'K',1237 <PRI> 

A;Note: this paper appeared earlier, with printing errors, as reference A30326 

R;Prince, J.T.; Milona, N.; Stallcup, W.B. 

J. Neurosci. 9, 876-883, 1989 

A;Title: Characterization of a partial cDNA clone for the NILE glycoprotein and ident 
A;Reference number: A30326; MUID: 89177485
A;Contents: annotation

A;Note: this paper was reprinted as reference A60917 to correct the omission of sever
C;Comment: This sequence of this surface-accessible glycoprotein differs at only two
accessible only after treatment of cells with detergent and is assumed to be cytopla
C;Superfamily: neural cell adhesion molecule L1; fibronectin type III repeat homology
C;Keywords: cell adhesion; duplication; glycoprotein; membrane protein
F;531-592/Domain: immunoglobulin homology <IMM>



Query Match 8.5%; Score 739.5; DB 2; Length 1259;

Best Local Similarity 23.7%; Pred. No. 2.6e-27;

Matches 273; Conservative 159; Mismatches 435; Indels 287; Gaps 42; 

Qy 1 MKWKHVPFLVMISLLSLSPNHLFLAQLIPDPEDVERGNDHGTPIPTSDNDDNSLGYTGSR 60 

I I :| I:: I : III II 

Db 4 MLWYVLPLLLCSPCLLIQ IPDE YKGHH 30 

Qy 61 LRQEDFPPRIVEH-PSDLIVSRGBPATLNCKAEGRPTPTIEWYRGGERVETDK 112 

: : II 1 II 1:1 : :| hi III III:: 
Db 31 VLE- - -PPVITEQSPRRLWFPTDDISLKCEARGRPQVEFRWTKDGIHFKPKEELGVWH 87

Qy 113 DDPRSHRMLLPSGSLFFLRIVHGRKSRPDEGVYVCVARNYLGEAVSHNASLEVAILRDDF 172 

: I I : : I I :|:| I I I II hll h :: : 
Db 88 EAPYSGSFTIEGNNSFAQRF QGIYRCYASNNLGTAMSH EIQLVAEGA 134 

Qy 173 RQNPSD— -VMVAVGEPAVMECQPPRGHPEPTISWKKDGSPLDDKDERITI-RGGKLMI 227 

: I : I I II |: III II :llh:: : I I 

Db 135 PKWPKETVRPVEVEEGESWLPCNPPPSAAPLRIYWMNSKILHIKQDERVSMGQNGDLYF 194 

Qy 228 TYTRKSD-AGEYVCVGTNMVGER- - -ESEVAELTVLERPSFVKRPSNLAVTVDDSAE- • - 280 

II hi : I I : I :| I I : I I : h 
Db 195 ANVLTSDNHSDYIC-NAHFPGTRTIIQKEPIDLRVKPTNSMIDRRPRLLFPTNSSSHLVA 253 

Qy 281 FKCEARGDPVPTVRWRKDDGELPKSRYEIRDDH-TLKIRKVTAGDMGSYTCV 331 

:'M I I lh:| :| II :| lh: I I I Mh 
Db 254 LQGQSLILECIAEGFPTPTIKWLHPSDPMPTDRV-IYQNHNKTLQLLNVGEEDDGEYTCL 312 

Qy 332 AENMVGKAEASATLTVQEPPHFWRPRDQWALGRTVTFQCEATGNPQPAIFWRREGSQN 391 

III :| I! : :||: |::: lh : I I hi III = II I 
Db 313 AENSLGSARHMYVTVEAAPYWLQKPQSHLYGPGETARLDCQVQGRPQPEVTWRING--- 363 

Qy 392 LLFSYQPPQSSSRFSVSQTGDLTITNVQRSDVGYY ICQTLNVA6SI ITKAYLEVTDVI AD 451 

I : ; :: : I I I ::IM II |: I I :: |h I : 
Db 370 -MSIERVEDQKYRIEQ-GSLILSNVQPSDTMVTQCEARNQHGLLLANAYIYWQL--- 423 

Qy 452 RPPPVIRQGPTOQT-VAVDG-TFVLSCVATGSPVPTILWRRDGVLVSTQDSRIKQLENGV 509 

I :: : III :|hl I I I I Mil:: | : Ml II 
Db 424 -PARILTKD-NQTYMAVEGSTAYLLCKAFGAPVPSVQWLDEEGTTVLQDERFFPYANGH 480 
Qy 510 LQIRYAKLGDTGRYTCIASTPSGEATWSAYIEVQEFGVPVQPPRPT 555

I II : Mill II: II I II I 

Db 481 LG I RDLQ ANDTGRYFCQAANDQNNVT I LANLQVKEATQI TQG PRS T I EKKG ARVTFTCQA 540 

Qy 556 --DPNL 559 

11:1 

Db 541 SFDPSLQASITWRGDGRDLQERGDSDKYFIEDGQLVIKSLDYSDQGDYSCVASTELDEVE 600 

Qy 560 IPSAPSKPEVTD- - -VSRNTVTLSWQPNLNSGATPTSYIIE-AFSHASGSS 606 

I I I::| : :: I III I : : I II : 

Db 601 SRAQLLVVGSPGPVPHLELSDRHLLKQSQVHLSWSPAEDHNSPIEKYDIEFEDKEMAPEK 660 

Qy 607 WQTVAENVKTETSAIKGLKPNAIYLFLVRAANAYGISDPSQISDPVKTQDVLPTSQGVDH 666 

: I : :ll I I I I I I I II :|| :|: I I : I I 

Db 661 WFSLGKVPGNQTSTTLKLSPYVHYTFRVTAINRYGPGEPSPVSETWTPEAAPEKNPVDV 720 

Qy 667 KQVQRELGNAVLHLHNPTVLSSSSIEVHWTVDQQSQYIQGYKILYRPSGANBGESDWLVF 726 
III: : : I :| : || |:: :|| | : | 

721 RGEGNETNNMVI TWKPLRW-MDWNAPQIQ -YRVQWRPLGK- ■ -QETW— - 762 

Qy 727 EVRTPAKNSWIPDLRRGVNYEIKARPFFNEFQGADSEIKFAKTLEEAPSAPP--QGVTV 784 

: :| : :|: : I Mil : |: :| : :: : |: I I : :|: 
Db 763 KEQTVSDPFLWSNTSTFVPYEIKVQAVNNQGKGPEPQVTIGYSGEDYPQVSPELEDITI 822 

Qy 785 SKNDGNGTAILVSWQPPPEDTQNGMVQEYKV--WCLGNETRY— HINKT— VDGSTFS 836 

I : :ll 1:1 I :: I I I |:: :: |::|: I :| I 
Db 823 F NSSTVLVRWRPVDLAQVKGBLRGYNVTYWWKGSQRKHSKRHVHKSHMWPANTTS 878 

Qy 837 WI PFLVPG IRYSVEVAASTGAGSGVKSE PQFIQLDAHG 875 

:: I I I III I I I I II I: : I: 

Db 879 AILSGLRPYSSYHVEVQAFNGRGLGPASEWTFSTPEGVPGHPEALHLECQSDTSLLLHWQ 938 

Qy 876 NPVSPEDQVSLAQQISDWKQPAFIAGIGAACWIILMVFSIWLY 919 

:|: I : I :|| 

Db 939 PPLSHNGVLTGYLLSYHPLDGESKEQLFFNLSD 971 

Qy 920 RHRKKRNGLTSTYAG IRKVPSFTF -TPTVTYQRGGEAV SSGGRPGLLNISEPA 971 

: : II: :: : I |:| III: : |:| III I 
Db 972 -PELRTHNLTNLNPDLQ — YRFQLQATTHQGPGEAIVREGGTMALFGKPDFGNISVTA 1026 

Qy 972 AQPWLADTW-PNTG 984 

: : :| I I 

Db 1027 GENYSWSWVPREG 1040



RESULT 10
I51669

tumor suppressor - African clawed frog
C;Species: Xenopus laevis (African clawed frog)
C;Date: 13-Sep-1996 #sequence_revision 13-Sep-1996 #text_change 21-Jul-2000
C;Accession: I51669

R;Pierceall, W.E.; Reale, M.A.; Candia, A.F.; Wright, C.V.; Cho, K.R.; Fearon, E.R.
Dev. Biol. 166, 654-665, 1994
A;Title: Expression of a homologue of the deleted in colorectal cancer (DCC) gene in the
A;Reference number: I51668; MUID: 95113183
A;Accession: I51669

A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: mRNA
A;Residues: 1-1427 <P$>

A;Cross-references: EMBL:U10986; NID:g606873; PIDN:AAA70168.1; PID:g606874
C;Genetics:
A;Gene: XDCCa



Query Match 8.3%; Score 726.5; DB 2; Length 1427;

Best Local Similarity 20.6%; Pred. No. 1.3e-26;

Matches 348; Conservative 201; Mismatches 584; Indels 553; Gaps 55;

Qy 71 VEHPSDLIVSKGEPATLNCKAEG-RPTPTIEWYKGGERVETDKDDPRSHRMLLPSGSLFF 129 

: III : :l III |: I Ihl II : I: I Mill 
Db 43 LSEPSDAVTMRGGNWLNCSAQSDRGAPIIKWKKDGVYLNLVIDE— RRQQLPSGSFLI 99 

Qy 130 LRIVHGRKSRPDEGVYVCVAR-NYLGEAVSHNASLEVA-— ILRDDFRQNPSDVMVAVG 184 



Ml I III.1III I I : :| III : II II I II
Db 100 QNWHSRHHRPDEGVYQCEASLDSVGTIVSRTAKVLVAGPLRIL----SQTESVTAFVG 154



Qy 185 EPAVMECQPPRGHPEPTISWKKDGSPLDDKDERIT IRGGKLMITYTRKSDAG 236 

: I:: |: I I MM: :::| ::| : | | |: : :| | 

Db 155 DTALLRCE-ITGEPMPTISWQK NEEDLKVTPGDPRLLVLPSGTLQISRLQTADGG 208 

Qy 237 KYVCVGTNMVGERESEVAELTVLERPS FVKRPSNLAVTVDDSAEFKCEARGDPV 290 

I I: I .1 III :| |::|l||: I :| I I 

Db 209 VYRCLAKNPGSARVGNEAELRILSESGLHRQQVFLQRPSNWAIEGQDAVLECAVSGYPT 268 

Qy 291 PTVRWRKDDGELP--KSRYEIRDDHTLKIRKVTAGDMGSYTCVAENMVGKAEASATLTVQ 348 

II: I : I ::1 :| : I I III hlllll II III 

Db 269 PTIVWMQGDEPVPIRTRKYSVLGGSNLLISNVTDDDAGAYTCVAIYKNENTSFSADLTVM 328 

Qy 349 EPPHFWKPRDQWALGRTVTFQCEATGNPQPAIFWRREGSQNLLFSYQPPQSSSRFSVS 408 

111:1: : 1:1 :| I I : I : I : I I : 

Db 329 VPPQFLNHPAfJLYAYESMDIEFECAVSGKPSPTVKWTKNGEWIPSDY FQIV 380 

Qy 409 QTGDLTITNVQRSDVGYYICQTLNVAGSIITKAYLEVTDVIADRPPPVIRQGPVNQTVAV 468 

:| I : :'ll III I I Ihl I I I : I 
Db 381 DGSNLRILGLVKSDEGYYQCIAENEAGNIQTYAQLIIPD 419 

Qy 469 DGIFVLSCVAIGSPVPTILWRKDGVLVSTQDSRIKQLENGVLQIRYAKLGDTGRYTCIAS 528 

Db 420 - »:i 

Qy 529 TPSGEATWSAYIEVQEFGVPVQPPRPTDPNLIPSAPSKPEVTDVSRNTVTLSWQPNLNSG 588 

I :::INI II I llhl : I 
Db 420 PAVPSSSILPSAPRDWPVLVSSRFVRLSWRPPVESK 456 

Qy 589 ATPTS YI I EAFSHASGS SWQTVAENVKTETS AIKGLKPNAIYLFLVRAANAYG 641 

:l : ' Ml II : I I I I I I I :| 

Db 457 GNIQTYTVY- - - FSKQGVQRERAVNTSQPISLQITVGNLTPEETYNFRWAYNEWG 509 

Qy 642 ISDPSQISDPVK--IQDVLPTSQGVDHKQVQRELGNAVLHLHNPTVLSSSSIEVHWTVDQ 699 

I : I II II I I:: II :|| ; :: 
Db 510 - - - PGES SQE VKWTQPELQVPG PVENLQWST APT SVLI SWDPPAYANGP 557 

Qy 700 QSQYIQGYKILYRP-SGANHG-ESDWLVFEVRTPAKNSWIPDLRKGVNYEIKARPFFN 756 

:|||:: ' II I I :|: : III I |: : 

Db 558 — -VQGYRLFCAETFSGREQNIEVDGIVYR LEGLRKFTEYSIRVLAYNR 603 

Qy 757 EFQGADSEIKFAKTLEEAPSAPPQGVTVSRNDGNGTAILVSWQPPPEDTQNGMVQEYKVW 816 

III : II : III II I:: I :| III III MM : l|: 
Db 604 YGPGVSSEEHTWTLSDVPSAMPQNVSLEV-ANSRSIKVSWLPPPPGTQNGFITGYKIR 661 

Qy 817 CLGNETRYHINKTVDGSTFSWIPFLVPGIRYSVEVAASTGAGSGVKS EPQFIQL 871 

I : :h: : MiM Ml I I 

Db 662 HRKTTRRGEL-ETLEPNNLWYLFTGLEKGSQYSFQVAAMTVNGTGPSSDWYTAETPENDL 720 

Qy 872 D -AHGNPVSPEDQVSLAQQISDWKQPAFIAGIGAAC 907

I M I:: :| :: : :| I I 

Db 721 DESQVPDQPS3LHVRPLTTSIIMSWTPPLNPNIWRGYIIGYGVGSPYAETVRVDSKQRY 780 

Qy 908 -WIILMVFS IWLYRHRKKRN GLTSTYA 933 

I I I: : II I: : I: 

Db 781 YSIENLEPSSHYVISLKAFNNAGEGVPLYESATTRSQTVPDMSTPMLPPVGVQAVALTHD 840 

Qy 934 GIR - KVPSFTFTPTVTYQRGGEAVS SGGRPG-LLNI 967 

:| .', :| :l :| : I :| :| : 

Db 841 AVRVSWADNSVLKNQKTTEVRFYTIRWRTSYSASSKYKSADTTSLSHTVTGLKPNTMYEF 900 



Qy 968 SEPAAQPWLADTWPNTGNNHNDCSISCCIAGNGNSDSNLTTYSR--PADCIANYNNQLD 1024

I : : ll I : : :: :|| :| I I :: :: 
Db 901 SVMVTKGRRSSTWSMTAH ATTYETAPTSAPKDLTVITRERKPRAVIVSWQPPIE 954

Qy 1025 --NKQTNLMLPE--STVYGDVDLSNKINEMKT--FNSPNLKDGRFVN 1065
- Ill: I: II l:::l :: : :: : : 
Db 955 ANGKIIDFILFYTLDRNLQLDDWIMVTITGD-RLTHEILDLNLDIAYYFRIQARNAKGLG 1013 

Qy 1066 PSGQPTPYATTQLIQSNLSN1MNNGSGDSGEKHWKPLGQQKQEVAPVQYNIVEQNKLNKD 1125 
I :| : I':: : I II I :| I |::::: II: 
Db 1014 PLSEPI TFRTPKVEHPDKMANDQGRHGDGG--YWS----VDTNLIDRSSLNEP 1060



Qy 1126 YRANDTVPPTIPYNQSYDQN TGGSY NSSDRGSSTSGS 1162 

II : I II: :: I : I 

Db 1061 -PIGQMHPPHGSVTPQKNSNLLVIIWTVGAITILVWIVAVICTRRSSAQQRKKRATHS 1119 

Qy 1163 QGHRKGARTPKVPRQGGMNWADLLPPPPAHPPPHSNSEEYNISVDESYDQEMPCPVPPAR 1222 

I :lh: II II 
Db 1120 AGKRKGSQ KDLRPPD 1134 

Qy 1223 MYLQQDELEEEEDERGPTPPVRGAASSPAAVSYSHQSTATLTPSPQEELQPMLQDCPEET 1282 

::: :|:| : |: :| I II :|| : : I : I 

Db 1135 LWIHHEEMEMKNIEKPSGSDTQGRDS PRQSCQDITPVSHSQSESQLGS-KST 1185 

Qy 1283 GHMQHQPDRRRQPVSPPPPPRPISPPHTYGYISGPLVSDMDTDAPEEEEDEADME— VA 1339 

I llllll:: :| | 
Db 1186 SH SGP DADEVGSNISTLERTLAA 1208 

Qy 1340 RMQTRRLLLRGLEQTPASS VGDLESS V 1366 
: II I: :: hh I III: 

Db 1209 RRATRAKLMIPMDSQPSSNPPWSAIPVPTLESAQYPGILPSPTCGYPHPQFTLRPVPFP 1268
Qy 1367 TGSMINGWGSASEEDNISSGRSSVSSSDGSFFTDADPAQAVAAAAEYAGLKVARRQMQDA 1426
j I :: 1:1:: : :| =111 I : :|| 

Db 1269 ILTVDRGFGTSRVTEVPASQQSSVLSHP QPEHSISEDA 1306 

Qy 1427 AGRRHFHASQCPRPTSPVSTDSNMSAAVMQKTRPAKKLKHQPGHLRRETYTDDLPPPP- 1484 

I : I III I: : I III 

Db 1307 PSRT--IPTACVRPTHPL RSFANPLLPPPMT 1335 

Qy 1485 VPPPAIKSPTAQSRTQLEVR PWVPKLPSMDARTDRSSDR 1524 

II : I I :: :|: Mill::: :| 

Db 1336 AMEPKVPYTTLLSQTGSGLSKAQVKTASLGLAGKARSPLLPVSVPTAPEVSEESHKHTDD 1395 

Qy 1525 KGSSYK 1530 

I I: 

Db 1396 PSSVYE 1401 



RESULT 11 
A43425 

Bravo/Nr-CAM cell adhesion molecule L1 homolog - chicken (fragment)
C;Species: Gallus gallus (chicken)

C;Date: 27-Apr-1993 #sequence_revision 18-Nov-1994 #text_change 16-Jul-1999
C;Accession: A43425

R;Kayyem, J.F.; Roman, J.M.; de la Rosa, E.J.; Schwarz, U.; Dreyer, W.J.
J. Cell Biol. 118, 1259-1270, 1992

A;Title: Bravo/Nr-CAM is closely related to the cell adhesion molecules L1 and l
A;Reference number: A43425; MUID: 92381110

A;Accession: A43425
A;Status: preliminary; not compared with conceptual translation
A;Molecule type: nucleic acid; protein
A;Residues: 1-1259 <KAY>
A;Experimental source: cerebellum
A;Note: sequence extracted from NCBI backbone (NCBIP:112026)
C;Superfamily: neural cell adhesion molecule L1; fibronectin type III repeat homology;
F;237-294/Domain: immunoglobulin homology <IMM>



Query Match 8.2%; Score 713; DB 2; Length 1259;

Best Local Similarity 22.9%; Pred. No. 4.7e-26;

Matches 298; Conservative 190; Mismatches 513; Indels 302; Gaps 50;

Qy 59 SRLRQE-DFPPRIVEH-PSDLIVSKGEPATLNCKAEGRPTPTIEWYRGGERVETDKDDPR 116

hi :| 111:1111 I : hhhl I: I : I : III 
Db 7 SKLLEELSQPPIITQQSPKDYIVDPRENIVIQCEAKGKPPPSFSWTRNGTHFDIDKD— 63 

Qy 117 SHRMLLPSGSLFFLRIVHGRKSRPDEGVYVCVARNYLGEAVSHNASLEVAILRDDFRQNP 176 

: : I: : Ml I: Mil I III I IMI : = 
Db 64 AQVTMKPNSGTLWNIMNGVKAEAYEGVYQCTARNEFGAAISNNIVIXXSXSPLWTKEKL 123 

Qy 177 SDVMVAVGEPAVMECQPPRGHPEPTISWKKDGSPLDDKDERITIRG-GKLMITYTRKSD 234 
I I: I: hll Mill: : II:: :| I I : : I 



Db 124 EPNHVREGDSLVLNCRPPVGLPPPIIFWMDNAFQRLPQSERVS-QGLNGDLYFSNVQPED 182 

Qy 235 AGK-YVCVG--TNMVGERESEVAELTVLERPSFVKRPSNLAVTVDDSAE F 281 

: |:| :.::::: : | r|| | : :: 
Db 183 TREDYICYARFNETIQQKQKQPISVKVFSTKPVTERPPVLLTPMGSTSNKVELRGNVLLL 242 

Qy 282 KCEARGDPVPTVRWRKDDGELPKSRYEIRD-DHTLKIRKVTAGDMGSYTCVAENMVGKAE 340 

:| I I I IMI I: Mil :| : III |: I |:| I I I :| 
Db 243 ECIAAGLPTPVIRWIKEGGELPANRTFFENFKKTLKIIDVSEADSGNYKCTARNTLGSTH 302 

Qy 341 ASATLTVQEPPHFWKPRDQWALGRTVTFQCEATGNPQPAIFWRREGSQNLLFSYQPPQ 400 

::||: !::: II: |:: I III ll|:|:| I I : h 
Db 303 HVISVTVKAAPYWITAPRNLVLSPGEDGTLICRANGNPKPSISWLTNGVPIAI — APE 358 

Qy 401 SSSRFSVSQTGDLTI-TNVQRSDVGYYICQTLNVAGSIITRAYLEVTDVIADRPPPVIRQ 459 

II II I : ll MM:: :: :|:|: || :: 
Db 359 DPSR— KVDGDTIIFSAVQERSSAVYQCNASNEYGYLLANAFV— NVLAE-PPRILT- 410 

Qy 460 GPVNQ--TVAVDGTFVLSCVATGSPVPTILWRKDGVLVS-TQDSRIKQLENGVLQIRYAK 516 

I I: I I. :: I Ml II : II I : : M hi h 
Db 411 -PANKLYQVIADSPALIDCAYFGSPKPEIEWFR-GVKGSILRGNEYVFHDNGILEIPVAQ 468 

Qy 517 LGDTGRYTCIASTPSGEATWSAYIEVQEFGVPVQPPR PTDPNLI 560 

II III:!; I: :lh: : :: h II II 

Db 469 KDSTGIYTCVARNKLGKTQNEVQLEVKDPTMIIKQPQYKVIQRSAQASFECVIKHDPTLI 528 

Qy 561 P - SA 563 

I :l 
Db 529 PTVIWLKDNNRLPDDERFLVGRDNLT IMNVTDKDDGT YTC IVNTTLDSVSASAVLT WAA 588

Qy 564 PSRP EVTDVSRNTVTLSWQPNLHSGATPTSYIIEAFS--HASGSSWQ 608 

I I '• hi :: III I : : MM I I I 

Db 589 PPTPAIIYARPNPPLDLELTGQLERSIELSWVPGEENNSPITNFVIEYEDGLHEPG-VWH 647 

Qy 609 TVAENVKTETSAIKGLKPNAIYLFLVRAANAYGISDPSQISDPVKTQDVLPTSQGVDHKQ 668 

I . ::h ' I I I I I I I I Ml: I: I: : : 

Db 648 YQIEVPGSQTTVQLRLSPYVNYSFRVIAVNEIGRSQPSEPSEQYLTKSANPDENPSNVQG 707 

Qy 669 VQRELGNAVLHLHNPTVLSSSSIEVHWTVDQQSQYIQGYRILYRPSGANHGESDWLVFEV 728 

: I I M : I: : lh :| : :l 

Db 708 IGSEPDNLVITWESLRGFQSNGPGLQ YKVSWRQKDV— DDEW 747 

Qy 729 RTPARNSWIPDLRR GVNYEIRARPFFNEFQGADSEIRFARTLEEAPSAPPQ 780 

III:-:-: I I Ml : : : : hi I 

Db 748 TSVWANVSRYIVSGTPTFVPYEIRVQALNDLGYAPEPSEVIGHSGEDLPMVAPG 802 

Qy 781 GVTVSKNDGNGTAILVSWQPPPEDTQNGMVQEYKVW CLGNETRYHINK— TVDG 832 

1:11 II II : I : I III: I :: h I II 
Db 803 NVQV-HVINSTLARVHWDPVPLRSVRGHLQGYKVYYWKVQSLSRRSRRHVERRILTFRG 860 

Qy 833 STFSWIPFLVPGIRYSVEVAASTGAG SGVKSEPQFIQ LDA- 873 

: ::| M- I M II III I M lh 

Db 861 NKTFGMLPGLEPYSSYKLNVRWNGKGEGPASPDRVFRTPEGVPSPPSFLRITNPTLDSL 920 

Qy 874 — HGNPVSPE"-- DQVSLAQQISDV VRQPAFIAGIGAACWIILMVFSIWL 918 

1:1 I * : I h: :: II : :: : 

Db 921 TLEWGSPTHPNGVLTSYILKFQPINNTHELGPLVEIRIPANESS LILNNLN 971 

Qy 919 YRHRKR-RNGLTSTYAGIRRVPSFTFTPTVTYQRG--GEAVSSGG — RPGLLNISEPA 971 

II I I M :l : 1:1 M h: I 

Db 972 YSTRYRFYNAQTSVGSGSQITEE--AVTIMDEAGILRPAVGAGKVQPLYPRIRNVTTAA 1028 

Qy 972 AQPWLADTWPNTGNNHNDCSISCCTAG NGNSDSNLTTYSRP 1012 

I: : :| I :| : : II lh : I 

Db 1029 AETYANISWEYEGPDHANFYVEYGVAGSKEDWKKEIVNGSRSFFVLRGLTPGTAYKVRVG 1088 

Qy 1013 ADC IANYNNQLDNRQTNLMLPESTVYGDVDLSNR INEMKTFNSPNLKDGRFVNPSGQPT P 1072 

|: :: : : : t :| : lh: : I h I 

Db 1089 AEGLSGFRSSEDLFETGPAMASR QVDIATQ GWFI — GLMCA 1127 

Qy 1073 YATTQLIQSNLSNNMNNGSGDSGERHWKPLGQQKQEVAPVQYNIVEQNKLNKDYRANDIV 1132 

I II i II :l : h :l h : 

Db 1128 VALLILILLIVCFIRRNKGG KYPVKEK — EDAHADPEI 1163 
Qy 1133 PPTIPYNQSYDQNTGGSYNSSDRGSSTSGSQGH--KKGARTP 1172

I Mill: I :: I MM 
Db 1164 QP MKEDDGTFGEYRSLE SDAEDHKPLRRGSRTP 1196 



RESULT 12 
T08851 

Down syndrome cell adhesion protein 1 - human (fragment)
N;Alternate names: Down syndrome cell adhesion molecule
C;Species: Homo sapiens (man)

C;Date: 11-Jun-1999 #sequence_revision 11-Jun-1999 #text_change 11-Jun-1999

C;Accession: T08851

R;Yamakawa, K.; Huo, Y.K.; Haendel, M.A.; Hubert, R.; Chen, X.N.; Lyons, G.E.; Korenben
submitted to the EMBL Data Library, September 1997
A;Description: DSCAM: A novel member of the immunoglobulin superfamily maps in a Down sy
A;Reference number: Z16495

A;Accession: T08851
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: mRNA
A;Residues: 1-1896 <YAM>

A;Cross-references: EMBL:AF023449; NID:g3169765; PID:g3169766
A;Experimental source: brain; developmental stage: 14 weeks; fetal
C;Genetics:
A;Gene: DSCAM
A;Map position: 21q22

A;Note: derived from alternately-spliced mRNA
C;Function:

A;Description: involved in nervous system development
C;Keywords: alternative splicing



Query Match 8.1%; Score 707.5; DB 2; Length 1896; 

Best Local Similarity 26.8%; Pred. No. 1.5e-25; 

Matches 286; Conservative 147; Mismatches 446; Indels 187; Gaps 48; 

Qy 64 EDFPPRIVEHPSDLIVSKGEPATLNCKAEGRPTPTIEWYKGGERVETDKDDP---RSHR 119 

II hh I: :ll II :l I :l |l III I I III III 
Db 389 EDGTPRIISAFSERWSPAEPVSLMCNVRGTPLPTITW TLDDDPILKGGSHR 440 

Qy 120 — MLLPSGSLFFLRIVHGRKSRPDEGVYVCyARNYLGEAVSHNASLEVAILRDDFRQNP 176 

I: I:: : : II 1 1 II I I I I : I : I I I 
Db 441 ISQMITSEGNWSYLNISSSQVR-DGGVYRCTANNSAG-WLYQARINV- - -RGPASIRP 495 

Qy 177 SDVMVAV-GEPAVMECQPPRGHPEPTISWKKDGSPLDDKDERITI-RGGKLMITYTRKS- 233 
. : h I : h 1:1 :| I I: : I :: I I :: :| 

Db 496 MKNITAIAGRDTYIHCR-VIGYPYYSIKWYRNSNLLPFNHRQVAFENNGTLKLSDVQKEV 554

Qy 234 DAGKYVCVGTNMVGERE - - - SEVAELTVLERPSFVKRPSNLAVTVDDSAEFKC * EARGDP 289 
I 1:1 I I" : : |: :|| : I |:: :: I II 
Db 555 DEGEYTC---NVLVQPQLSTSQSVHVTV-KVPPFIQPFEFPRFSIGQRVFIPCVWSGDL 610

Qy 290 VPTVRWRKDDGELPKSRYEIRDD— -HTLKIRKVTAGDMGSYTCVAENMVGKAEASATL 345 

I: hll :| I h :|:| :: f : 1 1 1 : 1 I I : I 
Db 611 PITITWQKDGRPIPGSLGVTIDNIDFTSSLRISNLSIiMHNGNYTCIARNEAAAVEHQSQL 670 

Qy 346 TVQEPPHFWKPRDQV7ALGRTVTFQCEATGNPQPAIFWRREGSQNLLFS YQP 398 

I: II Mhllll I: I I I I I I I I: II :|| 

Db 671 IVRVPPKFWQPRDQDGIYGKAVILNCSAEGYPVPTIVWK FSKGAGVPQFQP 722 

Qy 399 PQSSSRFSVSQTGDLTITNVQRSDVGYYICQTLWAGSIITKA-YLEVTDVIADRPPPVI 457 

i MM I MM: I |: Ill : M 
Db 723 IALNGRIQVLSNGSLLIKHWEEDSGYYLCRVSNDVGADVSKSMYLTV KIPAMI 776 

Qy 458 RQGPVNQTVAVDG - TFVLSCVATGSPVPTILWRKDGVLVSTQDSR - - IKQLENG V 509 

I hi I Mill : | |: :::::!: M 
Db 777 TSYP - NTTLATQGQKKEMSCTAHGEKPI IVRWEREDRI INPEMARYLVSTREVGEEVIST 835 

Qy 510 LQIRYAKLGDTGRYTCIASTPSGEATWSAYIEVQEFGVPVQPPRPTDPNLIPSAPSRPEV 569 

III 1:1 ::l III : III II : |: 

Db 836 LQILPIVREDSGFFSCHAINSYGEDRGIIQLTVQE PPDPPEIEI 879 

Qy 570 TDVSRNTVTLSWQPNLNSGATPTSYIIEAFSHASGSSWQTVAENVK TETSAIRGL 624 



II hll I : : | MM II : IM - I : 

Db 880 KDVKARTITLRWTMGFDGNSPITGYDIECKN--KSDSWDS-AQRTKDVSPQLNSATIIDI 936 

Qy 625 KPNAIYLFLVRAANAYGISDPSQISDPVKTQDVLPTSQGVDHKQVQRELGNAVLHLHNPT 684 

I:: I 'MM hll : M : :|| 
Db 937 HPSSTYSIRMYAKNRIGKSEPSN-ELTITADEAAPDGPPQE VHLE — 980 

Qy 685 VLSSSSIEVHHTVDQ— QSQYIQGYKILYRPSGANHGESDWLVFEVRTPAKNSV-VIPD 740 

:M II I I : I: 1 : 1 1 : 1 II ! : : I I : I : : 
Db 981 PISSQSIRVTWKAPKKHLQNGIIRGYQIGYREYSIG-GNFQFNIISVDTSGDSEVYTLDN 1039 

Qy 741 LRKGVNYEIKARPFFNEFQGADSEIKFAKTLEEAPSAPPQGVTVSKNDGNGTAILVSWQP 800 

MM: ||: |||: || lh I : M =11 

Db 1040 LNKFTQYGLVVQACNRAGTGPSSQEI ITTTLEDVPSYPPENVQAIAT - ■ SPESIS ISWST 1097 

Qy 801 PPEDTQNGMVQEYKV--WCLGNETRYHINKTVDGSTFSWIPFLVPGIRYSVEVAASTGA 858 

:: II:': I ::| I : I : : |: :| M : : I I I I 
Db 1098 LSKEALNGILQGFRViyWANLMDGELGEIKNITTTQPSLELDGLEKYTNYSIQVLAFTRA 1157 

Qy 859 GSGVKSEPQFIQL- - DAHGNPVSPEDQVS LAQQ I SDWKQP AF I AGIGAACWI I LMVFSI 916 

I Ihll T : III II: II III 

Db 1158 GDGVRSEQIFTRTKEDVPGPP AGVKAAAASASMVFVS 1194 

Qy 917 WLYRHRRKRNGLTSTYAGIRRVPSFTFTPTVTYQRGGEAVSSGGRPGLLNISEPAAQPWL 976 

II I lh III I I I III I I 

Db 1195 WL--PPLKLNGI IRRYTVFCSHPYPTV ISEFEASP" 1227 



Qy 977 ADTW---PNTGNN--HNDCSISCCTAGNGNSDSNLTT--YSRPADCIANYNNQLDNKQ 1027

I:: 1.1 I :: :: :|| III :| II : 
Db 1228 -DSFSYRIPNLSRNRQYSVWWAVTSAGRGNSSEIIIVEPLAKAPARILTFSGTVTTPWM 1286

Qy 1028 TNLMLPESTVYGDVDLSNKINEMKTFN-SPNLK-DGR---FVNPS 1067

:::M J II MM MM III Ml 
Db 1287 RDIVLPCRAV--GDPSPAVRWMKDSNGTPSLVTIDGRRSIFSNGS 1329
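
The "Best Local Similarity" figure printed for each result can be cross-checked from the counts on the following line. The sketch below assumes a definition that the report itself does not state: Matches divided by the total number of alignment columns (Matches + Conservative + Mismatches + Indels). That assumption reproduces the percentages printed for RESULT 12 through RESULT 15; the function name is illustrative and is not part of GenCore.

```python
def best_local_similarity(matches, conservative, mismatches, indels):
    """Percent identical columns over all alignment columns (assumed definition)."""
    columns = matches + conservative + mismatches + indels
    return round(100.0 * matches / columns, 1)


# Counts copied from the RESULT 12 statistics line:
# Matches 286; Conservative 147; Mismatches 446; Indels 187
print(best_local_similarity(286, 147, 446, 187))  # 26.8
```

The same call with the RESULT 13 counts (243, 120, 361, 209) gives 26.0, which also agrees with the report.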



RESULT 13
I58164
BIG-1 protein - rat
C;Species: Rattus norvegicus (Norway rat)
C;Date: 26-Jul-1996 #sequence_revision 26-Jul-1996 #text_change 24-Sep-1999
C;Accession: I58164
R;Yoshihara, Y.; Kawasaki, M.; Tani, A.; Tamada, A.; Nagata, S.; Kagamiyama, H.; Mori
Neuron 13, 415-426, 1994
A;Title: BIG-1: a new TAG-1/F3-related member of the immunoglobulin superfamily with
A;Reference number: I58164; MUID: 94338697
A;Accession: I58164
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: mRNA
A;Residues: 1-1028 <RES>
A;Cross-references: EMBL:U11031; NID:g563132; PIDN:AAA63607.1; PID:g563133
C;Genetics:
A;Gene: BIG-1
C;Superfamily: contactin; fibronectin type III repeat homology; immunoglobulin homolo

Query Match 8.1%; Score 702.5; DB 2; Length 1028;
Best Local Similarity 26.0%; Pred. No. 1.1e-25;
Matches 243; Conservative 120; Mismatches 361; Indels 209; Gaps 28;

Qy 68 PRIVEHPSDLIV— SKGEPATLNCRAEGRPTPTIEWYKGGERVETDKDDPRSHRMLLPS 124 

I h MM |: : MM I hi I I :M I II I 
Db 26 PVFVKEPSNSIFPVGSEDKKITLNCEARGNPSPHYRWQLNGSDIDTSLD — HRYKLNG 81 

Qy 125 GSLFFLRIVHGRKSRPDEGVYVCVARNYLGEAVSHNASLEVAILRDDFRQNPSDVMVAVG 184 

hi II' I II II Mill I IM I : M II I 
Db 82 GNL IVINPNRNWDTGSYQCFATNSLGTIVSREARLQFAYLENFKSRMRSRVSVREG 137 

Qy 185 EPAVMECQPPRGHPEPTISWRKDGSPL- • -DDRDERITIRGGRLMITYTRKSDAGKYVCV 241 

MM M I : :M I M :: I II III Ml 
Db 138 QGWLLCGPPPHSGELS YAWVFNEYPSFVEEDSRRFVSQETGHLY IARVEPSDVG NYTCV 197 

Qy 242 GTNMV— ;-' GERESEVAELTVLERPSFVKRPSNLAVTVDDSAEF 281 
I: I III :: :: I I : : 
Db 198 VTSTVTNARVLGSPTPLVLRSDGVMGEYEPKIE----LQFPETLPAAKGSTVKL 247



Qy 282 KCEARGDPVPTVRWRKDDGELPKSRYEIRD-DHTLKIRKVTAGDMGSYTCVAENMVGKAE 340

:| I 1:111 : II: II :: ::l : l:| I III hill II 
Db 248 ECFALGNPVPQINWRRSDGMPFPTKIRLRRFNGVLEIPNFQQEDTGSYECIAENSRGRNV 307 

Qy 341 ASATLTVQEPPHFVVKPRDQVVALGRTVTFQCEATGNPQPAIFWRREGSQNLLFSYQPPQ 400

I II I::| :| h :: ::| hi hh I : I =1 
Db 308 ARGRLTYYAKPYWVQLLKDVETAVEDSLYWECRASGKPKPSYRWLKNGDALVL 360 

Qy 401 SSSRFSVSQTGDLTITNVQRSDVGYYICQTLNVAGSIITKAYLEVTDVIADRPPPVIRQG 460

I :: I III I: II I : I I I I : I I: hi I : 
Db 361 -EERIQI -ENGALTIANLNVSDSGMFQCIAENKHGLIYSSAELR- ■ -VLASAPD- -FSRN 413 

Qy 461 PVNQ--TVAVDGTFVLSCVATGSPVPTILWRKDGVLVSIQDSRIKQLENGVLQIRYAKLG 518 

h : I I :| I : II hi :l I ill I :l hi 
Db 414 PMKKMIOVQVGSLVILDCKPSASPRALSFWKKGDTWREQ-ARISLLNDGGLKIMNVTKA 472 

Qy 519 DTGRYTCIASTPSGEATWSAYIEVQEFGVPVQPPRPTDPNLIPS 562 

. I I Mill hi : : I I II I II 

Db 473 DAGIYTCIAENQFGKANGTTQLWTE PTRI ILAPSNMDVAVGESIILPCQV 523



Db 524 QHDPLLDIMFAWYFNGTLTDFKKDGSHFEKVGGSSSGDLMIRNIOLKHSGKYVCMVQTGV 583 

Qy 563 APSKPE- • -VTDVSRNTVTLSWQPNLNSGATPTSYIIEAFSHASGSS 606 

:| II I ::: I III :l : II ::| : I 
Db 584 DSVSSAAELIVRGSPGPPENVKVDEITDTTAQLSWTEGTDSHSPVISYAVQARTPFS-VG 642

Qy 607 WQ---TVAENV-KTETSAIKGLKPNAIYLFLVRAANAYGISDPSQISDPVKTQDVLPTS 661 

II III: Ml:: It I I I hi I :|| |: hh: 
Db 643 WQNVRTVPEAIDGKTRTATWELNPWVEYEFRWASNKIGGGEPSLPSEKVRTEEAAPE- 701 

Qy 662 QGVDHKQVQRELGNAVLHLHNPTVLSSSSIEVHWTVDQQSQYIQ GYKILYRPSGA 716

I :| I | | : ; M : :| || : :|| | 

Db 702 - -VAPSEVSGGGG SRSELVITW- -DPVPEELQNGGGFGYWAFRPLGV 745 

Qy 717 NHGESDWLVFEVRTP AKNSWIPDLRKGVNYEIKARPFFNEFQGADSEIKFAKT 770 

: I: I :| :| ::| Ihl = h :| I : : 
Db 746 — TTWIQTWTSPDNPRYVFRNESIVP — FSPYEVKVGVYNNRGEGPFSPVTTVFS 797 

Qy 771 LEEAPSAPPQGVTVSKNDGNGTAILVSWQPPPEDTQNGMVQEYKV--WCLGNETRYHINK 828

III: I :| : : : I III I : II : hi I I I 
Db 798 AEEEPTVAPS-HISAHSLSSSEIEVSWNTIPWKSSNGRLLGYEVRYWNNGGEEESSSKV 855 

Qy 829 TVDGSTFSVVIPFLVPGIRYSVEVAASTGAGSG 861
II: II: i : I I I I: 
Db 856 KVAGNQTSAVLRGLKSNLAYYTAVRAYNTAGAG 888



RESULT 14
A53449
plasmacytoma-associated neuronal glycoprotein PANG - mouse
C;Species: Mus musculus (house mouse)
C;Date: 25-Aug-1995 #sequence_revision 25-Aug-1995 #text_change 24-Sep-1999
C;Accession: A53449
R;Connelly, M.A.; Grady, R.C.; Mushinski, J.F.; Marcu, K.B.
Proc. Natl. Acad. Sci. U.S.A. 91, 1337-1341, 1994
A;Title: PANG, a gene encoding a neuronal glycoprotein, is ectopically activated by inti
A;Reference number: A53449; MUID: 94151325
A;Accession: A53449
A;Status: preliminary
A;Molecule type: mRNA
A;Residues: 1-1028 <CON>
A;Cross-references: GB:L01991; NID:g200056; PIDN:AAA17403.1; PID:g200057
C;Superfamily: contactin; fibronectin type III repeat homology; immunoglobulin homology
C;Keywords: glycoprotein

Query Match 8.0%; Score 695.5; DB 2; Length 1028;
Best Local Similarity 25.5%; Pred. No. 2.4e-25;
Matches 238; Conservative 125; Mismatches 361; Indels 209; Gaps 28;

Qy 68 PRIVEHPSDLIV- - -SKGEPATLNCKAEGRPTPTIEWYKGGERVETDKDDPRSHRMLLPS 124 

I :: lh I h : Mil: I hi I I ::| I II I 
Db 26 PVFIKEPSNSIFPVDSEDKKITLNCEARGNPSPHYRWQLNGSDIDTSLD— -HRYKLNG 81 

Qy 125 GSLFFLRIVHGRKSRPDEGVYVCVARNYLGEAVSHNASLEVAILRDDFRQNPSDVMVAVG 184 

hi II ' 111111111111:11: : I I I I 
Db 82 GNL IVHIPNRNWDTGSYQCFATNSLGTIVSREAKLQFAYLENFKTRMRSTVSVREG 137 

Qy 185 EPAVMECQPPRGHPEPT ISWKKDGSPL - - - DDRDERIT IRGGRLMITY T RKSDAGK YVCV 241 

: I: I II | : :| : | :| :: | | | MINI 
Db 138 QGWLLCGPPPHSGELSYAWVFNEYPSFVEEDSRRLVSQETGHLYIAKVEPSDVGNYTCV 197 

Qy 242 GTNMV : GERESEVAELTVLERPSFVKRPSNLAVTVDDSAEF 281 

hi' III:: h I I : 

Db 198 VTSTVTNTRVLGSPTPLVLRSDGVMGEYEPRIE VQFPETLPAAKGSTVRL 247 

Qy 282 KCEARGDPVPTVRWRKDDGELPKSRYEIRD-DHTLRIRRWAGDMGSYTCVAENMVGRAE 340 

:| I hill,: II: II :: ::| : hh I III :||l II 
Db 248 ECFAIXJNPVPQINWRRSDGMPFPNKIRLRRFNGMLEIQNFQQEDTGSYEGIAENSRGKNV 307 

Qy 341 ASATLTVQEPPHFWKPRDQWALGRTVTFQCEATGNPQPAIFWRREGSQNLLFSYQPPQ 400 

I II ' j::: II :|: :: ::| hi hh I : I :| 
Db 308 ARGRLTYYAKPYWLQLLRDVEIAVEDSLYWECRASGKPKPSYRWLKNGDALVL 360 

Qy 401 SSSRFSVSQTGDLTITNVQRSDVGYYICQTLNVAGSHTRAYLEVTDVIADRPPPVIRQG 460 

I : : I Mill: :| I : I M I : I h hi I : 
Db 361 -EERIQI-ENGALTITNLNVTDSGMFQCIAENKHGLIYSSAELK—WASAPD-FSRN 413 

Qy 461 PVNQTVAVD--GTFVLSCVATGSPVPTILWRKDGVLVSTQDSRIKQLENGVLQIRYAKLG 518 

I: : I I Ml II hi ::l I :h I M M 
Db 414 PfflKMVQVQVGSLVILDCRPRASPRALSFWKRGDMMVREQ-ARVSFLNDGGLRIMNVTKA 472 

Qy 519 DTGRYTCI AST PSGEATWSAY IEVQEFGVPVQPPRPTDPNLIPS 562 

I I III I ' hi : :: I I II I II 

Db 473 DAGTYTCTAENQFGKANGTTHLWTE PTRIILAPSNMDVAVGESVILPCQV 523 

Qy 563 562 

Db 524 QHDPLLDIMFAWYFNGALTDFKKDGSHFERVGGSSSGDLMIRNIQLRHSGRYVCMVQTGV 583 

Qy 563 - - -APSKPE- : -VTDVSRNTVTLSWQPNLNSGATPTSYIIEAFSHASGSS 606 

M II I ::: I III M : II ::| : I 
Db 584 DSVSSAAELIVRGSPGPPENVRVDEITDTTAQLSWTEGTDSHSPVISYAVQARTPFS-VG 642 

Qy 607 WQ- - -TVAEW- -KTETSAIKGLKPNAIYLFLVRAANAYGISDPSQISDPVKTQDVLPTS 661 

II 11.1 : II I: : I I I I : hi I M h |:|:: | 

Db 643 WQSVRTVPEVIDGRTHTATWELNPWVEYEFRIVASNRIGGGEPSLPSERVRTEEAAPE- 701 

Qy 662 QGVDHKQVQRELGNAVLHLHNPTVLSSSSIEVHWTVDQQSQYIQ GYKILYRPSGA 716 

: :h ' I | | : : | | : :| II : Ml I 

Db 702 --IAPSEVSGGGG SRSELVITW- -DPVPEELQNGGGFGYWAFRPLGV 745 

Qy 717 NHGESDWLVFEVRTP ARNSWIPDLRKGVNYEIKARPFFNEFQGADSEIKFAKT 770 

: I: ; l :| :| ::| Ihl : h :| I : : 
Db 746 TTWIQTWTSPDNPRYVFRNESIVP — FSPYEVKVGVYNNKGEGPFSPVTTVFS 797 

Qy 771 LEEAPSAPPQGVTVSKNDGNGTAILVSWQPPPEDTQNGMVQEYKV--WCLGNETRYHINK 828 

II I: I :| : : : I III I II : hi I I I 
Db 798 AEEEPTVAPS-HISAHSLSSSEIEVSWNTIPWKLSNGHLLGYEVRYWNNGGEEESSRRV 855 

Qy 829 TVDGSTFSWIPFLVPGIRYSVEVAASTGAGSG 861 

I I: I M I : I II Ihl 
Db 856 KVAGNQTSAVLRGLRSNLAYYTAVRAYNSAGAG 888 



RESULT 15
A39640
neural cell adhesion molecule Nr-CAM precursor - chicken
C;Species: Gallus gallus (chicken)
C;Date: 10-Sep-1999 #sequence_revision 10-Sep-1999 #text_change 10-Sep-1999
C;Accession: A39640; S16451
R;Grumet, M.; Mauro, V.; Burgoon, M.P.; Edelman, G.M.; Cunningham, B.A.
J. Cell Biol. 113, 1399-1412, 1991
A;Title: Structure of a new nervous system glycoprotein, Nr-CAM, and its relationship tc
A;Reference number: A39640; MUID: 91258407
A;Accession: A39640
A;Status: preliminary; not compared with conceptual translation
A;Molecule type: mRNA
A;Residues: 1-1268 <GRD>
A;Cross-references: GB:X58482; NID:g63706; PIDN:CAA41391.1; PID:g63707
C;Superfamily: neural cell adhesion molecule L1; fibronectin type III repeat homology;
C;Keywords: alternative splicing; cell adhesion; duplication; glycoprotein; membrane pre
F;261-318/Domain: immunoglobulin homology <IMM>

Query Match 7.9%; Score 693; DB 1; Length 1268;
Best Local Similarity 24.1%; Pred. No. 4.1e-25;
Matches 267; Conservative 162; Mismatches 445; Indels 232; Gaps 40;

Qy 59 SRLRQE-DFPPRIVEH-PSDLIVSKGEPATLNCKAEGRPTPTIEWYKGGERVETDKDDPR 116
W hi :| II I : I I || I : |:|:|:| |: | : | : ||| 

Db 31 SKLLEELSQPPTITQQSPKDYIVDPRENIVIQCEAKGRPPPSFSWTRNGTHFDIDRD--- 87 

Qy 117 SHRMLLPSGSLFFLRIVHGRKSRPDEGVYVCVARNYLGEAVSHNASLEVAILRDDFRQNP 176 

: : I: : h Mil I III I hhl : : :: 
Db 88 AQVTMRPNSGTLWNIMNGVKAEAYEGVYQCTARNERGAAISNNIVIRPSRSPLWTKEKL 147 

Qy 177 SDVMVAVGEPAVMECQPPRGHPEPTISWKKDGSPLDDKDERITIRG--GKLMI1YTRRSD 234 

I I: I: hll Mill: : II:: :| I I : : I 
Db 148 EPNHVREGDSLVLNCRPPVGLPPPIIFWMDNAFQRLPQSERVS-QGLNGDLYFSNVQPED 206 

Qy 235 AG-KYVCVG TNMVGERESEVAEL TVLERPSFVKRP SNLAVTVDDSAEF 281 

Db 207 TRVDYICYARFNHTQTIQQKQPISVW^ L 266 

Qy 282 KCEARGDPVPTVRWRRDDGELPKSRYEIRD-DHTLKIRKVTAGDMGSYTCVAENMVGKAE 340 

:l I I I I :|| |: I F 1 1 :| : II h I hi I I I :| 
Db 267 ECIAAGLPTPVIRWIKEGGELPANRIFFENFKKTLKIIDVSEADSGNYRCTARNTLGSTH 326 

Qy 341 ASATLTVQEPPHFWKPRDQWALGRTVTFQCEATGNPQPAIFWRREGSQNLLFSYQPPQ 400 

::N: h" lh I:: I III lll:|:| I I : |: 
Db 327 HVISVTVKAAPYWITAPRNLVLSPGEDGTLICRANGNPKPSISWLTNGVPIAI— -APE 382 

Qy 401 SSSRFSVSQTGDLTI-TNVQRSDVGYYICQTLNVAGSIITRAYLEVTDVIADRPPPVIRQ 459 
II llhll II II:: |:: :|:|: II :: 
Db 383 DPSR---RVDGDTIIFSAVQERSSAVYQCNASNEYGYLLANAFV---NVLAE-PPRILT- 434

Qy 460 GPVNQ--TVAVDGTFVLSCVATGSPVPTILWRKDGVLVS-TQDSRIKQLENGVLQIRYAK 516

I I: I I :: I III I I I : III : : :ll hi h 
Db 435 -PANKLYQVIADSPALIDCAYFGSPRPEIEWFR-GVKGSILRGNEYVFHDNGTLEIPVAQ 492

Qy 517 LGDTGRYTCIASTPSGEATWSAYIEVQEFGVPVQPPR PTDPNLI 560

II UN I: :||:: : :: |: II II 
Db 493 KDSTGTYTCVARNRLGKTQNEVQLEV1DPTMIIKQPQYKVIQRSAQASFECVIKHDPTLI 552

i :i 
Db 553 PTVIWLRDNNELPDDERFLVGKDNLTIMNVTDRDDGTYTCIVNTTLDSVSASAVLTWAA 612

Qy 564 PSKP EVTDVSRNTVTLSWQPNLNSGATPTSYIIEAFS--HASGSSWQ 608

I I hi :: III I : : h::|| I I I 
Db 613 PPTPAIIYARPNPPLDLELTGQLERSIELSWVPGEENNSPITNFVIEYEDGLHEPG-VWH 671

Qy 609 TVAENVKTETSAIKGLKPNAIYLFLVRAANAYGISDPSQISDPVKTQDVLPTSQGVDHKQ 668

I : h I I I I I I I I I II: h !: I ; : 
Db 672 YQTEVPGSHTTVQLKLSPYVNYSFRVIAVNEIGRSQPSEPSEQYLTKSANPDENPSNVQG 731

Qy 669 VQRELGNAVLHLHNPTVLSSSSIEVHWTVDQQSQYIQGYKILYRPSGANHGESDWLVFEV 728

: I I h : h : lh :| : :| 
Db 732 IGSEPDNLVITWESLKGFQSNGPGLQ YKVSWRQKDV--DDEW 771

Qy 729 RTPAKNSVVIPDLRK GVNYEIKARPFFNEFQGADSEIKFAKTLEEAPSAPPQ 780

III: :: I I INI : : : : h I I 
Db 772 TSVWANVSRYIVSGTPTFVPYEIKVQALNDLGYAPEPSEVIGHSGEDLPMVAPG 826

Qy 781 GVTVSKNDGNGTAILVSWQPPPEDTQNGMVQEYKVW CLGNETRYHINK---TVDG 832

II : I I I I I : ! :| III: I :: |: I II 
Db 827 NVQV--HVINSTLAKVHWDPVPLKSVRGHLQGYKVYYWRVQSLSRRSKRHVEKKILTFRG 884

Qy 833 STFSVVIPFLVPGIRYSVEVAASTGAG SGVKSEPQFIQ LDA 873

: ::l I I I : I II Mill:: ||: 
Db 885 NRTFGMLPGLEPYSSYRLNVRVVNGRGEGPASPDRVFRTPEGVPSPPSFLKITNPTLDSL 944

Qy 874 ---HGNPVSPE---DQVSLAQQISDV---VKQPAFIAGIGAACWIILMVFSIWL 918

1:1- l ; : I h: :: II : :|| : : 
Db 945 TLEWGSPTHPNGVLTSYILKFQPIMTHELGPLVEIRIPANESS LILKNLN-YS 997

Qy 919 YRHRKKRNGLTSTYAGIRRVPSFTFTPTVTYQRGGEAVSSGGR PGLLNISEPAAQ 973

I:: I IK :l I III: I : I:: lh 
Db 998 IRYKFYFNAQTSVGSG SQITEEAVTIMDEVQPLYPRIRNVTTAAAE 1043

Qy 974 PWLADTWPNTGNNHNDCSISCCTAGN 999

: :| I :| : : lh 
Db 1044 TYANISWEYEGPDHANFYVEYGVAGS 1069



Search completed: January 22, 2001, 12:27:16
Job time: 2113 sec



Mon Jan 22 13:04:41 2001 



us-09-540-245a-18.rpr 



Page 14 






Mon Jan 22 13:04:42 2001 



us-09-540-245a-18.rsp 



Page 1 



GenCore version 4.5
Copyright (c) 1993 - 2000 Compugen Ltd.

OM protein - protein search, using sw model

January 22, 2001, 12:28:41 ; Search time 162.41 Seconds
(without alignments)
328.290 Million cell updates/sec

Title:          US-09-540-245A-18
Perfect score:  8724
Sequence:       1 MKWKHVPFLVMISLLSLSPN VLGGYERGEDNNEELEETES 1651

Scoring table:  BLOSUM62
                Gapop 10.0 , Gapext 0.5

Searched:       88757 seqs, 32294092 residues

Total number of hits satisfying chosen parameters:

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries

Database :      SwissProt_39:*

Pred. No. is the number of results predicted by chance to have a
score greater than or equal to the score of the result being printed,
and is derived by analysis of the total score distribution.
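
The header names the search model ("sw", Smith-Waterman local alignment) and its gap parameters (Gapop 10.0, Gapext 0.5). As a rough illustration only, the sketch below computes a local alignment score with affine gaps using the standard Gotoh recurrences. Several things here are assumptions and not taken from the report: the function names are hypothetical, the substitution scores are a toy match/mismatch scheme standing in for BLOSUM62, and the gap convention (the first gapped residue costs `gap_open`, each further one costs `gap_ext`) may differ from the one GenCore applies.

```python
def toy_score(a, b):
    """Stand-in for a substitution matrix lookup (NOT BLOSUM62)."""
    return 5.0 if a == b else -4.0


def smith_waterman_score(query, target, score=toy_score,
                         gap_open=10.0, gap_ext=0.5):
    """Best local alignment score with affine gap penalties."""
    neg_inf = float("-inf")
    n, m = len(query), len(target)
    # best[i][j]: best score of an alignment ending at query[i-1], target[j-1]
    best = [[0.0] * (m + 1) for _ in range(n + 1)]
    # gap_t / gap_q: best score ending in a gap that skips target / query residues
    gap_t = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    gap_q = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    top = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            gap_t[i][j] = max(best[i][j - 1] - gap_open,
                              gap_t[i][j - 1] - gap_ext)
            gap_q[i][j] = max(best[i - 1][j] - gap_open,
                              gap_q[i - 1][j] - gap_ext)
            diag = best[i - 1][j - 1] + score(query[i - 1], target[j - 1])
            # Local alignment: a cell never drops below zero.
            best[i][j] = max(0.0, diag, gap_t[i][j], gap_q[i][j])
            top = max(top, best[i][j])
    return top


# Eight matches (5 each) minus one single-residue gap (10).
print(smith_waterman_score("ACDEFGHIK", "ACDEGHIK"))  # 30.0
```

Each pass through the inner loop is one "cell update" in the sense of the throughput figure printed in the header; a production search engine would not store full matrices as this sketch does.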



Result           Query
   No.   Score   Match  Length  DB  ID           Description
     1   764.5     8.8    1260   1  CAML_MOUSE   P11627 mus musculu
     2   761       8.7    1257   1  CAML_HUMAN   P32004 homo sapien
     3   760       8.7    1493   1  NEO1_MOUSE   P97798 mus musculu
     4   757.5     8.7    1443   1  NEO1_CHICK   Q90610 gallus gall
     5   755       8.7    2012   1  DSCA_HUMAN   O60469 homo sapien
     6   746.5     8.6    1377   1  NEO1_RAT     P97603 rattus norv
     7   739.5     8.5    1259   1  CAML_RAT     Q05695 rattus norv
     8   725.5     8.3    1461   1  NEO1_HUMAN   Q92859 homo sapien
     9   701.5     8.0    1284   1  NRCA_CHICK   P35331 gallus gall
    10   679.5     7.8    1447   1  DCC_MOUSE    P70211 mus musculu
    11   667.5     7.7    1447   1  DCC_HUMAN    P43146 homo sapien
    12   653.5     7.5    1040   1  AXO1_HUMAN   Q02246 homo sapien
    13   652       7.5    1040   1  AXO1_RAT     P22063 rattus norv
    14   633.5     7.3    1036   1  AXO1_CHICK   P28685 gallus gall
    15   627       7.2    1239   1  NRG_DROME    P20241 drosophila
    16   596.5     6.8    1018   1  CONT_HUMAN   Q12860 homo sapien
    17   589.5     6.8    1010   1  CONT_CHICK   P14781 gallus gall
    18   584       6.7    1912   1  PTPD_HUMAN   P23468 homo sapien
    19   583       6.7    2029   1  LAR_DROME    P16621 drosophila
    20   573       6.6    1020   1  CONT_MOUSE   P12960 mus musculu
    21   549       6.3    1266   1  NGCA_CHICK   Q03696 gallus gall
    22   549       6.3    1897   1  PTPF_HUMAN   P10586 homo sapien
    23   505       5.8    1070   1  PTK7_HUMAN   Q13308 homo sapien
    24   500.5     5.7    3707   1  PGBM_MOUSE   Q05793 mus musculu
    25   500       5.7    1091   1  NCA1_CHICK   P13590 gallus gall
    26   497.5     5.7    1115   1  NCA1_MOUSE   P13595 mus musculu
    27   495       5.7    4393   1  PGBM_HUMAN   P98160 homo sapien
    28   490       5.6    1051   1  PTK7_CHICK   Q91048 gallus gall
    29   483.5     5.5     853   1  NCA1_BOVIN   P31836 bos taurus
    30   478.5     5.5     848   1  NCA1_HUMAN   P13591 homo sapien
    31   466       5.3     761   1  NCA2_HUMAN   P13592 homo sapien
    32   465       5.3     858   1  NCA1_RAT     P13596 rattus norv
    33   460       5.3    1092   1  NCA2_XENLA   P36335 xenopus lae
    34   450       5.2     837   1  NCM2_MOUSE   O35136 mus musculu
    35   449.5     5.2    1088   1  NCA1_XENLA   P16170 xenopus lae
    36   448.5     5.1     725   1  NCA2_MOUSE   P13594 mus musculu
    37                              NCM2_HUMAN   O15394 homo sapien
    38   422.5     4.8     811   1  FS22_DROME   P34083 drosophila
    39   422.5     4.8     873   1  FS21_DROME   P34082 drosophila
    40   415.5     4.8    1906   1  KMLS_CHICK   P11799 gallus gall
    41   401.5     4.6    2481   1  UN52_CAEEL   Q06561 caenorhabdi
    42   398.5     4.6    1913   1  KMLS_HUMAN   Q15746 homo sapien
    43   390.5     4.5    1274   1  MYPC_HUMAN   Q14896 homo sapien
    44   387.5     4.4    1270   1  MYPC_MOUSE   O70468 mus musculu
    45   379       4.3    1271   1  MYPC_CHICK   Q90688 gallus gall
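
Two figures in this report can be reproduced from values printed in the search header. The sketch below assumes, since the report does not define them, that "Query Match" is the hit score as a percentage of the query's perfect score, and that "cell updates" counts one dynamic-programming cell per (query residue, database residue) pair. Both assumptions agree with the printed numbers; the constant and function names are illustrative only.

```python
PERFECT_SCORE = 8724       # "Perfect score" for US-09-540-245A-18
QUERY_LENGTH = 1651        # residues in the query sequence
DB_RESIDUES = 32294092     # residues searched in SwissProt_39
SEARCH_SECONDS = 162.41    # search time without alignments


def query_match(score, perfect=PERFECT_SCORE):
    """Hit score as a percentage of the perfect (self-match) score."""
    return round(100.0 * score / perfect, 1)


def million_cell_updates_per_sec(query_len, db_residues, seconds):
    """Throughput, assuming one cell per query/database residue pair."""
    return query_len * db_residues / seconds / 1e6


print(query_match(764.5))  # 8.8, as listed for result 1
rate = million_cell_updates_per_sec(QUERY_LENGTH, DB_RESIDUES, SEARCH_SECONDS)
print(round(rate, 2))      # 328.29, matching the header's 328.290
```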



RESULT 1
CAML_MOUSE
ID   CAML_MOUSE     STANDARD;      PRT;  1260 AA.
AC   P11627;
DT   01-OCT-1989 (Rel. 12, Created)
DT   01-OCT-1989 (Rel. 12, Last sequence update)
DT   01-OCT-2000 (Rel. 40, Last annotation update)
DE   NEURAL CELL ADHESION MOLECULE L1 PRECURSOR (N-CAM L1).
GN   L1CAM OR CAML1.
OS   Mus musculus (Mouse).
OC   Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC   Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
RN   [1]
RP   SEQUENCE FROM N.A., AND PARTIAL SEQUENCE.
RC   TISSUE=BRAIN;
RX   MEDLINE=88318924; PubMed=3412448;
RA   Moos M., Tacke R., Scherer H., Teplow D., Frueh K., Schachner M.;
RT   "Neural adhesion molecule L1 as a member of the immunoglobulin
RT   superfamily with binding domains similar to fibronectin.";
RL   Nature 334:701-703(1988).
CC   -!- FUNCTION: CELL ADHESION MOLECULE WITH AN IMPORTANT ROLE IN THE
CC       DEVELOPMENT OF THE NERVOUS SYSTEM. INVOLVED IN NEURON-NEURON
CC       ADHESION, NEURITE FASCICULATION, OUTGROWTH OF NEURITES, ETC. BINDS
CC       TO AXONIN ON NEURONS.
CC   -!- SUBCELLULAR LOCATION: TYPE I MEMBRANE PROTEIN.
CC   -!- ALTERNATIVE PRODUCTS: TWO ISOFORMS IN THE CYTOPLASMIC REGION ARE
CC       PRODUCED BY DIFFERENTIAL SPLICING (BY SIMILARITY).
CC   -!- SIMILARITY: CONTAINS 6 IMMUNOGLOBULIN-LIKE C2-TYPE DOMAINS.
CC   -!- SIMILARITY: CONTAINS 5 FIBRONECTIN TYPE III-LIKE DOMAINS.
CC   This SWISS-PROT entry is copyright. It is produced through a collaboration
CC   between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC   the European Bioinformatics Institute. There are no restrictions on its
CC   use by non-profit institutions as long as its content is in no way
CC   modified and this statement is not removed. Usage by and for commercial
CC   entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC   or send an email to license@isb-sib.ch).
DR   EMBL; X12875; CAA31368.1; -.
DR   PIR; S05479; S05479.
DR   HSSP; P20241; 1CFB.
DR   MGD; MGI:96721; L1CAM.
DR   INTERPRO; IPR001777; -.
DR   INTERPRO; IPR003006; -.
DR   PFAM; PF00041; fn3; 4.
DR   PFAM; PF00047; ig; 6.
DR   PRINTS; PR00014; FNTYPEIII.
KW   Cell adhesion; Glycoprotein; Transmembrane; Repeat; Brain;
KW   Immunoglobulin domain; Signal; Alternative splicing.



FT   SIGNAL
FT   CHAIN               1260       NEURAL CELL ADHESION MOLECULE L1.
FT   DOMAIN        20    1123       EXTRACELLULAR (POTENTIAL).
FT   TRANSMEM    1124    1146       POTENTIAL.
FT   DOMAIN      1147    1260       CYTOPLASMIC (POTENTIAL).
FT   DOMAIN               120       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN       150     215       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN       256     318       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN       346     410       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN       440     503       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN       531     599       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN       827     896       FIBRONECTIN TYPE-III.
FT   DOMAIN       932     994       FIBRONECTIN TYPE-III.
FT   DOMAIN      1032    1094       FIBRONECTIN TYPE-III.
FT   SITE         553     555       CELL ATTACHMENT SITE (POTENTIAL).
FT   SITE         562     564       CELL ATTACHMENT SITE (POTENTIAL).
FT   CARBOHYD     100     100       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD     202     202       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD     246     246       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD     293     293       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD     432     432       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD     478     478       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD     489     489       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD     504     504       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD     587     587       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD     670     670       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD     725     725       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD     776     776       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD     824     824       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD     848     848       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD     875     875       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD     968     968       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD     978     978       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    1022    1022       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    1030    1030       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    1073    1073       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    1107    1107       N-LINKED (GLCNAC...) (POTENTIAL).
FT   VARSPLIC    1180    1183       MISSING (IN SHORT ISOFORM)
FT                                  (BY SIMILARITY).
SQ   SEQUENCE   1260 AA;  140968 MW;  22BE57001CB2A538 CRC64;



Query Match 8.8%; Score 764.5; DB 1; Length 1260; 

Best Local Similarity 23.3%; Pred. No. 5.4e-27;

Matches 307; Conservative 177; Mismatches 510; Indels 325; Gaps 51; 

Qy 56 YTGSRLRQEDFPPRIVEH-PSDLIVSKGEPATLNCKAEGRPTPTIEWYKGGERVETDK" 112 

I I : : I! I I I hi : :| 1:1 III III:: 
Db 26 YKGHHVLE-PPVITEQSPRRLVVFPTDDISLKCEARGRPQVEFRWTKDGIHFKPKEEL 82

Qy 113 DDPRSHRfiLLPSGSLFFLRIVHGRKSRPDEGVYVCVARNYLGEAVSHNASLEVAI 167 

: I I : : I I :|:| I I I II Ml h : 

Db 83 GVWHEAPYSGSFTIEGNNSFAQRF ^ --QGIYRCYASNKLGTAMSH — EIQL 129 

Qy 168 LRDDFRQNPSD — VMVAVGEPAVMECQPJ GHPEPTISWKKDGSPLDD-KDERITI- 220 



II II I: I I 



Db 130 VAEGAPKWPKETVKPVEVEEGESWLPCNPP SAAPPRIYWM- -NSKIFDIKQDERVSMG 187

Qy 221 RGGKLMITYTRKSD-AGKYVCVGTNMVGER-f ESEVAELTVLERPSFVKRPSNLAVTVD 276

: I I II hi : I I : I :| I I : I I : 
Db 188 QNGDLYFANVLTSDNHSDYIC-NAHFPGTRTIIQKEPIDLRVKPTNSMIDRKPRLLFPTN 246



III 1:1 :llh:: 



Qy 277 DSAE FKCEARGDPVPTVRWRKDDGELPKSRYEIRDDH--TLKIRKVTAGD 324 

I: :| I I I I::! =1 I I :l lh: I I 

Db 247 SSSRLVALQGQSLILECIAEGFPTPTIKWLHPSDPMPTDRV-IYQNHNKTLQLLNVGEED 305 

Qy 325 MGSYTCVAENMVGKAEASATLTVQEPPHFWKPRDQWALGRTVTFQCEATGNPQPAIFW 384 

I Mhll :| I : :lh h:: lh : I'l hi III I I 
Db 306 DGEYTCLAENSLGSARHAYYVTVEAAPYWLQKPQSHLYGPGETARLDCQVQGRPQPEITW 365 

Qy 385 RREGSQNLLFSYQPPQSSSRFSVSQTGDLTITNVQRSDVGYYICQTLNVAGSIITKAYLE 444 

II I : :: : I I I ::lll :l h I I :: lh 
Db 366 RING MSMETVNKDQKYRIEQ-GSLILSNVQPTDTMVTQCEARNQHGLLLANAYIY 419 

Qy 445 VTDVIADRPPPVIRQGPVNQT-VAVDG-TFVLSCVATGSPVPTILWRKDGVLVSTQDSRI 502 

I : I : III :lhl Mil 1 : 1 1 1 : : I : III 
Db 420 WQL PARILTKD--NQTYMAVEGSTAYLLCKAFGAPVPSVQWLDEEGTTVLQDERF 473 



Qy 503 KQLENGVLQIRYAKLGDTGRYTCIASTPSGEATWSAYIEVQEFGVPVQPPRPT------- 555

III II : Mill I h II -hi I II 



Db 474 FPYANGTLSIRDLQANDTGRYFCQAANDQNNVTILANLQVKEATQITQGPRSAIEKKGAR 533 
Qy 556 DPNL 559 

Ihl 

Db 534 VTFTCQASFDPSLQASITWRGDGRDLQERGDSDKYFIEDGKLVIQSLDYSDQGNYSCVAS 593 

Qy 560 - IPSAPSKPEVTD— VSRNTVTLSWQPNLNSGATPTSYIIE-AF 599 

; I h = l : :: I III I : : I II 
Db 594 TELDEVESRAQLLWGSPGPVPHLELSDRHLLKQSQVHLSWSPAEDHNSPIEKYDIEFED 653 

Qy 600 SHASGSSWQTVAENVKTETSAIKGLKPNAIYLFLVRAANAYGISDPSQISDPVKTQDVLP 659 

: I :: : :|| II I I I I I 1 1 : 1 1 : h I I : I 
Db 654 KEMAPEKWFSIjGKVPGNQTSTTLKLSPYVHYTFRVTAINKYGPGEPSPVSESWTPEAAP 713 

Qy 660 TSQGVDHKQVQRELGNAVLHLHNPTVLSSSSIEVHWTVDQQSQYIQGYKILYRPSGANHG 719 

II : ; II I: : : 1 ;| ; || h: :ll I 

Db 714 EKNPVDVRGEGNETNNMVI TWKPLRW-MDWNAPQIQ-YRVQWRPQGK-- 758 

Qy 720 ESDWLVFEVRTPAKNSWIPDLRKGVNYEIKARPFFNEFQGADSEIKFAKTLEEAPSAPP 779 

: I I i| :h : I Nil : h :| : :: : h I I 
Db 759 QETWRKQTVSDP- • -FLWSNTSTFVPYEIKVQAVNNQGKGPEPQVTIGYSGEDYPQVSP 815 

Qy 780 ■-QGVTVSKNDGNGTAILVSWQPPPEDTQNGMVQEYKV--WCLGNETRY---HINKT-- 829 

: :|: ■■ I : :|| hi I :: I I I h: :: Ihh 
Db 816 ELEDITIF-T-NSSTVLVRWRPVDLAQVKGHLKGYNVTYWWKGSQRKHSKRHIHKSHIV 871 

Qy 830 VDGSTFSWIPFLVPGIRYSVEVAASTGAGSGVKSE PQFIQLDAHG- 875 

I ill :: I I I III I I II II h : h 

Db 872 VPANTTSAILSGLRPYSSYHVEVQAFNGRGLGPASEWTFST PEGVPGH PEALHLECQSDT 931 

Qy 876 1 NPVSPEDQVSLAQQISDWKQPAFIAGIGAACWIILM 912 

ill I : I :M 

Db 932 SLLLHWQPPLSHNGVLTGYLLSYHPVEGESKEQLFFNLSD 971 



Qy 913 VFSIWLYRHRKKRNGLTSTYAGIRKVPSFTFTPTVTYQRG--GEAV SSGGRPG 963 

: : II: : I I hi llh : hi 

Db 972 PELRTHNLTNLNPDLQ— -YRFQLQATTQQGGPGEAIVREGGTMALFGKPD 1019 

Qy 964 LLNISEPAAQPWLADTW-PNTGNNHNDCSISCCTAGNGNSDSNLTTYSRPADCIANYNN- 1021 

III I : : :| I I h : :: :| -\\ 

Db 1020 FGNISATAGENYSWSWVPRKG — QCNFRFHILFKALPEGKVSPDHQPQPQYVSYNQS 1075 

Qy 1022 QLDNKQTNLMLPESTVYGDVDLSN KINEMKTFNS 1055 

I I.I I : :|: ::: :l I 

Db 1076 SYTQWNLQPDTKYEIHLIKEKVLLHHLDVKTNGTGPVRVSTTGSFASEGWFIAFVSAIIL 1135 

Qy 1056 PNLKDGRF VNPSGQPTPYATTQLIQSNLSNNMNNGSGDSG 1095 

I h: h :l I :l hi I I 

Db 1136 LLLILLILCFIKRSKGGKYSVKDKEDTQVDSEARPMKDETFGEYRSLESDNEEKAFGSSQ 1195 

Qy 1096 EK— HWKPLGQQKQEV APVQYN— -IVEQNKLNKDYRA— NDTVPPTIPYN 1139 

llll Ihl : I h I lh I I I 

Db 1196 PSLNGDIKPLGSDDSLADYGGSVDVQFNEDGSFIGQYSGKKEKEAAGGNDSSGATSPIN 1254 



RESULT 2
CAML_HUMAN
ID   CAML_HUMAN     STANDARD;      PRT;  1257 AA.
AC   P32004;
DT   01-JUL-1993 (Rel. 26, Created)
DT   01-OCT-1996 (Rel. 34, Last sequence update)
DT   01-OCT-2000 (Rel. 40, Last annotation update)
DE   NEURAL CELL ADHESION MOLECULE L1 PRECURSOR (N-CAM L1).
GN   L1CAM OR CAML1 OR MIC5.
OS   Homo sapiens (Human).
OC   Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC   Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
RN   [1]
RP   SEQUENCE FROM N.A.
RX   MEDLINE=92031696; PubMed=1932117;
RA   Kobayashi M., Miura M., Asou H., Uyemura K.;
RT   "Molecular cloning of cell adhesion molecule L1 from human nervous
RT   tissue: a comparison of the primary sequences of L1 molecules of
RT   different origin.";
RL   Biochim. Biophys. Acta 1090:238-240(1991).
RN   [2]
RP   SEQUENCE FROM N.A.
RA   Rosenthal A., Coutelle O., Drescher B.;
RL   Submitted (APR-1994) to the EMBL/GenBank/DDBJ databases.
RN   [3]
RP   SEQUENCE FROM N.A.
RX   MEDLINE=92329299; PubMed=1627459;
RA   Reid R.A., Hemperly J.J.;
RT   "Variants of human L1 cell adhesion molecule arise through alternate
RT   splicing of RNA.";
RL   J. Mol. Neurosci. 3:127-135(1992).
RN   [4]
RP   SEQUENCE OF 353-1176 FROM N.A.
RX   MEDLINE=92020233; PubMed=1923824;
RA   Rosenthal A., Mackinnon R.N., Jones D.S.C.;
RT   "PCR walking from microdissection clone M54 identifies three exons
RT   from the human gene for the neural cell adhesion molecule L1
RT   (CAM-L1).";
RL   Nucleic Acids Res. 19:5395-5401(1991).
RN   [5]
RP   SEQUENCE OF 332-371 FROM N.A.
RX   MEDLINE=90353957; PubMed=2387585;
RA   Djabali M., Mattei M.-G., Nguyen C., Roux D., Demengeot J.,
RA   Denizot F., Moos M., Schachner M., Goridis C., Jordan B.R.;
RT   "The gene encoding L1, a neural adhesion molecule of the
RT   immunoglobulin family, is located on the X chromosome in mouse and
RT   man.";
RL   Genomics 7:587-593(1990).
RN   [6]
RP   SEQUENCE OF 1030-1257 FROM N.A.
RX   MEDLINE=91132183; PubMed=1993895;
RA   Harper J.R., Prince J.T., Healy P.A., Stuart J.K., Nauman S.J.,
RA   Stallcup W.B.;
RT   "Isolation and sequence of partial cDNA clones of human L1: homology
RT   of human and rodent L1 in the cytoplasmic region.";
RL   J. Neurochem. 56:797-804(1991).
RN   [7]
RP   SEQUENCE OF 20-36.
RX   MEDLINE=88298876; PubMed=3136168;
RA   Wolff J.M., Frank R., Mujoo K., Spiro R.C., Reisfeld R.A.,
RA   Rathjen F.G.;
RT   "A human brain glycoprotein related to the mouse cell adhesion
RT   molecule L1.";
RL   J. Biol. Chem. 263:11943-11947(1988).
RN   [8]
RP   VARIANT HSAS TYR-264.
RX   MEDLINE=94004956; PubMed=8401576;
RA   Jouet M., Rosenthal A., Macfarlane J., Kenwrick S., Donnai D.;
RT   "A missense mutation confirms the L1 defect in X-linked hydrocephalus
RT   (HSAS).";
RL   Nat. Genet. 4:331-331(1993).
RN   [9]
RP   VARIANT HSAS/MASA LEU-1194.
RX   MEDLINE=95187172; PubMed=7881431;
RA   Fransen E., Schrander-Stumpel C., Vits L., Coucke P., van Camp G.,
RA   Willems P.J.;
RT   "X-linked hydrocephalus and MASA syndrome present in one family are
RT   due to a single missense mutation in exon 28 of the L1CAM gene.";
RL   Hum. Mol. Genet. 3:2255-2256(1994).
RN   [10]
RP   VARIANTS HSAS GLN-184 AND ARG-452, AND VARIANT MASA GLN-210.
RX   MEDLINE=95004608; PubMed=7920659;
RA   Jouet M., Rosenthal A., Armstrong G., Macfarlane J., Stevenson R.,
RA   Paterson J., Metzenberg A., Ionasescu V., Temple K., Kenwrick S.;
RT   "X-linked spastic paraplegia (SPG1), MASA syndrome and X-linked
RT   hydrocephalus result from mutations in the L1 gene.";
RL   Nat. Genet. 7:402-407(1994).
RN   [11]
RP   VARIANTS MASA GLN-210 AND ASN-598.
RX   MEDLINE=95004609; PubMed=7920660;
RA   Vits L., van Camp G., Coucke P., Fransen E., de Boulle K.,
RA   Reyniers E., Korn B., Poustka A., Wilson G., Schrander-Stumpel C.,
RA   Winter R.M., Schwartz C., Willems P.J.;
RT   "MASA syndrome is due to mutations in the neural cell adhesion gene
RT   L1CAM.";
RL   Nat. Genet. 7:408-413(1994).
RN [12] 

RP VARIANTS HSAS/MASA S-9; S'121; K-309; P-768; L-941 AND C-1070. 

RX MEDLINE-9528277'6; PubMed=7762552; 

RA Jouet M., Moncla A,, Paterson J., McKeown C, Fryer A., Carpenter N, , 

RA Holmberg E., Wa'delius C, Kenwrick S.; 

RT "New domains of neural cell-adhesion molecule Ll implicated in 

RT X-linked hydrocephalus and MASA syndrome,"; 

RL Am. J, Hum. Genet. 56:1304-1314(1995). 

RN [13] 

RP VARIANTS HSAS/MASA Q-184; Q-210; Y-264; R-452; N-598 AND L-1194. 

RX MEDLINE=96153146; PubMed-8556302; 

RA Fransen E., Lemmon V,, van Camp G., Vits L. , Coucke P,, Willems P,J,; 

RT "CRASH syndrome!: clinical spectrum of corpus callosum hypoplasia, 

RT retardation, adducted thumbs, spastic paraparesis and hydrocephalus 

RT due to mutations in one single gene, Ll."; 

RL Eur. J. Hum. Genet. 3:273-284(1995). 

RN [14] 

RP ERRATUM, ' ■ 

RA Fransen E., Leimon V., van Camp G., Vits L., Coucke P., Willems P.J.; 

RL Eur. J. Hum/ Genet. 4:126-126(1996). 

RN [15] 

RP VARIANTS HSAS/MASA/SPG1 SER-179 AND ARG-370. 

RX MEDLINE=96057511; PubMed=7562969;

RA Ruiz J.C., Cuppens H., Legius E., Fryns J.-P., Glover T., Marynen P.,

RA Cassiman J.-J.;

RT "Mutations in L1-CAM in two families with X linked complicated

RT spastic paraplegia, MASA syndrome, and HSAS."; 

RL J. Med. Genet. 32:549-552(1995). 

RN [16] 

RP VARIANTS HSAS CYS-194 AND LEU-240.

RX MEDLINE=97083370; PubMed=8929944;

RA Gu S.-M., Orth U., Veske A., Enders H., Kluender K., Schloesser M.,

RA Engel W., Schwinger E., Gal A.;

RT "Five novel mutations in the L1CAM gene in families with X linked

RT hydrocephalus.";

RL J. Med. Genet. 33:103-106(1996).

RN [17] 

RP VARIANTS HSAS Q-184; V-439-T-443 DEL; C-784 AND L-936-L-948 DEL.

RX MEDLINE=97338664; PubMed=9195224;

RA Macfarlane J.R., Du J.-S., Pepys M.E., Ramsden S., Donnai D.,

RA Charlton R., Garrett C., Tolmie J., Yates J.R.W., Berry C., Goudie D.,

RA Moncla A., Lunt P., Hodgson S., Jouet M., Kenwrick S.;

RT "Nine novel L1 CAM mutations in families with X-linked

RT hydrocephalus.";

RL Hum. Mutat. 9:512-518(1997). 

RN [18]

RP VARIANTS HSAS/MASA ASP-691; ARG-698 AND PRO-935.

RX MEDLINE=98180721; PubMed=9521424;

RA Du Y.-Z., Srivastava A.K., Schwartz C.E.;

RT "Multiple exon screening using restriction endonuclease

RT fingerprinting (REF): detection of six novel mutations in the L1 cell

RT adhesion molecule (L1CAM) gene.";

RL Hum. Mutat. 11:222-230(1998).

RN [19] 

RP VARIANT CRASH PRO-632.

RX MEDLINE=98112489; PubMed=9452110;

RA Vits L., Chitayat D., van Camp G., Holden J.J.A., Fransen E.,

RA Willems P.J.;

RT "Evidence for somatic and germline mosaicism in CRASH syndrome.";

RL Hum. Mutat. Suppl. 1:S284-S287(1998).

RN [20]

RP VARIANTS HSAS/MASA THR-219; ARG-335; CYS-386; CYS-473 AND LEU-1224.

RX MEDLINE=98415726; PubMed=9744477;

RA Saugier-Veber P., Martin C., le Meur N., Lyonnet S., Munnich A.,

RA David A., Henocq A., Heron D., Jonveaux P., Odent S., Manouvrier S.,

RA Moncla A., Morichon N., Philip N., Satge D., Tosi M., Frebourg T.;

RT "Identification of novel L1CAM mutations using fluorescence-assisted

RT mismatch analysis."; 
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RL Hum. Mutat 12:259-266(1998). 

CC -!- FUNCTION: CELL ADHESION MOLECULE WITH AN IMPORTANT ROLE IN THE

CC DEVELOPMENT OF THE NERVOUS SYSTEM. INVOLVED IN NEURON-NEURON 

CC ADHESION, NEURITE FASCICULATION, OUTGROWTH OF NEURITES, ETC. BINDS 

CC TO AXONIN ON NEURONS. 

CC -!- SUBCELLULAR LOCATION: TYPE I MEMBRANE PROTEIN. 

CC -!- ALTERNATIVE PRODUCTS: TWO ISOFORMS IN THE CYTOPLASMIC REGION ARE 

CC PRODUCED BY DIFFERENTIAL SPLICING. 

CC -!- DISEASE: DEFECTS IN L1CAM ARE THE CAUSE OF THREE X-LINKED

CC SYNDROMES. 1: HYDROCEPHALUS OWING TO STENOSIS OF THE AQUEDUCT OF

CC SYLVIUS (HSAS). CHARACTERIZED BY MENTAL RETARDATION AND ENLARGED

CC BRAIN VENTRICLES. 2: MASA SYNDROME WHICH IS CHARACTERIZED BY

CC MENTAL RETARDATION, APHASIA, SHUFFLING GAIT, AND ADDUCTED THUMBS.

CC HAS AN OVERLAPPING PROFILE OF CLINICAL SIGNS WITH HSAS, BUT WITH A

CC MILDER PRESENTATION AND A LONGER LIFE EXPECTANCY. 3: SPASTIC

CC PARAPLEGIA TYPE 1 (SPG1). COLLECTIVELY THESE SYNDROMES ARE ALSO

CC KNOWN AS CRASH SYNDROME, AN ACRONYM WHICH STANDS FOR CORPUS

CC CALLOSUM HYPOPLASIA, PSYCHOMOTOR RETARDATION, ADDUCTED THUMBS,

CC SPASTIC PARAPARESIS, AND HYDROCEPHALUS.

CC -!- DISEASE: DEFECTS IN L1CAM ARE THE CAUSE OF HIRSCHSPRUNG DISEASE
CC (HSCR).
CC -!- SIMILARITY: CONTAINS 6 IMMUNOGLOBULIN-LIKE C2-TYPE DOMAINS.
CC -!- SIMILARITY: CONTAINS 5 FIBRONECTIN TYPE III-LIKE DOMAINS.
CC -!- DATABASE: NAME=L1CAM; NOTE=L1CAM mutation Web Page;
CC WWW="http://hgins.uia.ac.be/dnalab/11".
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).



DR EMBL; X59847; CAA42508.1; -. 

DR EMBL; Z29373; CAA82564.1; -. 

DR EMBL; M74387; AAA59476.1; -.

DR EMBL; X58775; CAA41576.1; -. 

Query Match 8.7%; Score 761; DB 1; Length 1257;

Best Local Similarity 24.9%; Pred. No. 7.7e-27;

Matches 283; Conservative 159; Mismatches 422; Indels 274; Gaps 45; 

Qy 14 LLSLSPNHLFLAQLIPDPEDVERGNDHGTPIPTSDNDDNSLGYTGSRXRQEDFPPRIVEH 73 

IMI II II: I : I: I |: I I |:| 

Db 11 LLLCSP CLLIQIPEEYEGHHVMEPPVIT * -EQSPRRLWF 48 

Qy 74 PSDLIVSKGEPATLNCKAEGRPTPTIEWYKGGERVETDKD -J5PRSHRMLLPSGS 126 

■ 1:1 I :| 1:1 hi I : I : :: |l I : : 
Db 49 PTDDI SLKCEASGKPEVQFRWTRDGVHFKPKEELGVTVYQSPHSGSFTITGNN 101

Qy 127 LFFLRIVHGRKSRPDEGVYVCVARNYLGEAVSHNASLEVAILRDDFRQNPSD---VMVA 182

I :: :hl I I I II hll |: :: : : I : I I 

Db 102 SNF AQRFQGIYRCFASNKLGTAMSH — EIRLMAEGAPKWPKETVKPVEVE 149 

Qy 183 VGEPAVMECQPPRGHPEPTISWKKDGSPLDDKDERITI-RGGKLMITYTRKSD-AGKYVC 240 

II h I II II :IH:|: Ml II hi 

Db 150 EGESWLPCNPPPSAEPLRIYWMNSKILHIKQDERVTMGQNGNLYFANVLTSDNHSDYIC 209 

Qy 241 — VGTNMVGERESEVAELTVLERPSFVKRPSNLAVTVDDSAE FKCEAR 286 

II : ::l =11 hi I : h :| I 

Db 210 HAHFPGTRTIIQKEP--IDLRVKATNSMIDRKPRLLFPTNSSSHLVALQGQPLVLECIAE 267 

Qy 287 GDPVPTVRWRKDDGELPKSRYEIRD-DHTLKIRKVTAGDMGSYTCVAENMVGKAEASATL 345 

I I II:: : 1:1 I :: : ||:: || I |:||| :|| : : 
Db 268 GFPTPTIKWLRPSGPMPADRVTYQNHNKTLQLLKVGEEDDGEYRCLAENSLGSARHAYYV 327 

Qy 346 TVQEPPHFWKPRDQWALGRTVTFQCEATGNPQPAIFWRREGSQNLLFSYQPPQSSSRF 405 

II: I::: lh : I I hi III : III : :: 

Db 328 TVEAAPYWLHKPQSHLYGPGETARLDCQVQGRPQPEVTWRING IPVEELAKDQKY 382 

Qy 406 SVSQTGDLTITNVQRSDVGYYICQTLNVAGSIITKAYLEVTDVIADRPPPVIRQGPVNQT 465 

: I I I ::||| II h I I :: lh I : I :: III 



Db 383 RI ■QRGALILSNVQPSDTMVTQCEARNRHGLLLANAYIYWQL — PAKILTAD- -NQT 435 

Qy 466 * VAVDG -TFVLSCVATGSPVPT ILW - RKDGVLVSTQDSRIKQLENGVLQI RYAKLGDTGR 522 

:|| I I III 1:111:: I :|| I II I II I II : Ml 
Db 436 YMAVQGSTAYLLCKAFGAPVPSVQWLDEDGTTV-LQDERFFPYANGTLGIRDLQANDTGR 494 

Qy 523 YTCIASTPSGEATWSAYIEVQEFGVPVQPPRPT DPNLIPS — 562 

I hi: i I I :M: I || I M 
Db 495 YFCLAANDQNNVTIMANLKVKDATQITQGPRSTIEKKGSRVTFTCQASFDPSLQPSITWR 554 



Qy 563 ■ 



- APSKPEVTD 571 

: I :: =1 : 

Db 555 GDGRDLQELGDSDKYFIEDGRLVIHSLDYSDQGNYSCVASTELDWESRAQLLWGSPGP 614 

Qy 572 rVSRNTVTLSWQPNLNSGATPTSYIIE-AFSHASGSSWQTVAENVKTETS 619 

.':::: MM I : I III : M: : :|| 

Db 615 VPRLVLSDLHLLTQSQVRVSWSPAEDHNAPIEKYDIEFEDKEMAPEKWYSLGKVPGNQTS 674 

Qy 620 AIKGLKPNAIYLFLVRAANAYGISDPSQISDPVKTQDVLPTSQGVDHKQVQRELGNAVLH 679 

II i I I I I II :M :|: I I : I III III: 

Db 675 TTLKLSPYVHYTFRVTAINKYGPGEPSPVSETWTPEAAPEKNPVDVKGEGNETTNMVI- 733 

Qy 680 LHNPTVLSSSSIEVHWTVDQQSQYIQGYKILYRPSGANHGESDWLVFEVRTPAKNSVVIP 739 

:, : I : : :| h: :|| I III :|: 

Db 734 -— TWKPLRW-MDWNAPQVQ-YRVQWRPQGT— RGPWQEQIVSDP--FLWS 776 

Qy 740 DLRKGVNY E I KARPFFNEFQG ADSE I KFAKTLEE^PSAPP - - QGVTVSKNDGNGTAI LVS 797 

: I llli : :: :|: :: MM I I M : I Mil 

Db 777 NTSTFVPYEIKVQAVNSQGKGPEPQVTIGYSGEDYPQAIPELEGIEIL — NSSAVLVK 832 

Qy 798 WQPPPEDTQNGMVQEYKV--WCLGNETRY--HINK---TVDGSTFSWIPFLVPGIRYS 849 

hi M: I I I h: :: M I :| ||:: | | 

Db 833 WRPVDLAQVKGHLRGYNVTYWREGSQRKHSKRHIHKDHVWPANTTSVILSGLRPYSSYH 892 

Qy 850 VEVAASTGAGSGVKSE PQFIQLDAHGN PVS 879 

M I I IN || |: : h I hi 

Db 893 LEVQAFNGRGSGPASEFTFSTPEGVPGHPEALHLECQSNTSLLLRWQPPLSHNGVLTGYV 952 

Qy 880 — PEDQVSLAQQISDVVKQPAFIAGIGAACWIILMVFSIWLYRHRKKRNGLTSTYAGI 935 

I h . hi :M : Ml : 

Db 953 LSYHPLDEGGKG-QLSFNLRDPEL RTHNLTDLSPHL 987 

Qy 936 RKVPSFTFTPTVTYQRG-GEA-VSSGGRPGLL NISEPAAQPWLADTW-PNTG 984 

I M I M III III I III I : : :| I I 
Db 988 R YRFQLQATTKEGPGEAIVREGGTMALSGISDFGNISATAGENYSWSWVPKEG 1041 



STANDARD; PRT; 1493 AA. 
AC P97798;
DT 01-OCT-2000 (Rel. 40, Created)
DT 01-OCT-2000 (Rel. 40, Last sequence update)
DT 01-OCT-2000 (Rel. 40, Last annotation update)
DE NEOGENIN PRECURSOR.
GN NEO1 OR NGN.
OS Mus musculus (Mouse).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
RN [1]
RP SEQUENCE FROM N.A., AND ALTERNATIVE SPLICING.
RC TISSUE=BRAIN;
RX MEDLINE=97407661; PubMed=9264410;
RA Keeling S.L., Gad J.M., Cooper H.M.;
RT "Mouse neogenin, a DCC-like molecule, has four splice variants and is
RT expressed widely in the adult mouse and during embryogenesis.";
RL Oncogene 15:691-700(1997).
CC -!- FUNCTION: MAY BE INVOLVED AS A REGULATORY PROTEIN IN THE
CC TRANSITION OF UNDIFFERENTIATED PROLIFERATING CELLS TO THEIR
CC DIFFERENTIATED STATE. MAY ALSO FUNCTION AS A CELL ADHESION
CC MOLECULE IN A BROAD SPECTRUM OF EMBRYONIC AND ADULT TISSUES.
CC -!- SUBCELLULAR LOCATION: TYPE I INTEGRAL MEMBRANE PROTEIN.
CC -!- ALTERNATIVE PRODUCTS: AT LEAST 5 ISOFORMS; 1 (SHOWN HERE), 2, 3, 4
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CC AND 5; ARE PRODUCED BY ALTERNATIVE SPLICING. THE EXPRESSION OF 
CC ISOFORMS 3, 4 AND 5 ARE DEVELOPMENTALLY REGULATED.

CC -!- TISSUE SPECIFICITY: WIDELY EXPRESSED. 

CC -!- DEVELOPMENTAL STAGE: EXPRESSED UBIQUITOUSLY THROUGHOUT THE MID TO
CC LATE STAGES OF GESTATION AND IN ADULT TISSUES. STRONG EXPRESSION
CC IS OBSERVED IN THE VENTRAL REGION OF THE VENTRICULAR ZONE OF THE
CC E15.5 MOUSE NEURAL TUBE, AS WELL AS IN THE VENTRICULAR ZONES OF
CC THE MESENCEPHALON AND RHOMBENCEPHALON. ISOFORMS 3 AND 4 ARE
CC EXPRESSED AT HIGHER LEVEL COMPARED TO OTHER ISOFORMS BETWEEN E11.5
CC AND E16.5.

CC -!- SIMILARITY: CONTAINS 4 IMMUNOGLOBULIN-LIKE C2-TYPE DOMAINS.

CC -!- SIMILARITY: CONTAINS 6 FIBRONECTIN TYPE III-LIKE DOMAINS.

CC -!- SIMILARITY: BELONGS TO THE IMMUNOGLOBULIN SUPERFAMILY. STRONG, TO 
CC TUMOR SUPPRESSOR PROTEIN DCC. 

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; Y09535; CAA70727.1; -. 

DR HSSP; P02751; 1TTG. 

DR MGD; MGI:1097159; NEO1.

DR INTERPRO; IPR001777; -.

DR INTERPRO; IPR003006; -.

DR PFAM; PF00041; fn3; 6. 

DR PFAM; PF00047; ig; 4. 

DR PRINTS; PR00014; FNTYPEIII. 



KW   Alternative splicing.
FT   SIGNAL        1     36       POTENTIAL.
FT   CHAIN        37   1493       NEOGENIN.
FT   DOMAIN       37   1136       EXTRACELLULAR (POTENTIAL).
FT   TRANSMEM   1137   1157       POTENTIAL.
FT   DOMAIN     1158   1493       CYTOPLASMIC (POTENTIAL).
FT   DOMAIN       78    147       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      177    239       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      274    338       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      366    428       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      467    564       FIBRONECTIN TYPE-III.
FT   DOMAIN      567    660       FIBRONECTIN TYPE-III.
FT   DOMAIN      661    760       FIBRONECTIN TYPE-III.
FT   DOMAIN      766    860       FIBRONECTIN TYPE-III.
FT   DOMAIN      881    981       FIBRONECTIN TYPE-III.
FT   DOMAIN      982   1083       FIBRONECTIN TYPE-III.
FT   DOMAIN     1149   1153       POLY-VAL.
FT   CARBOHYD     84     84       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    221    221       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    337    337       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    501    501       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    520    520       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    670    670       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    746    746       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    940    940       N-LINKED (GLCNAC...) (POTENTIAL).
FT   VARSPLIC    442    461       MISSING (IN ISOFORM 2).
FT   VARSPLIC    863    878       MISSING (IN ISOFORM 3).
FT   VARSPLIC   1086   1096       MISSING (IN ISOFORM 4).
FT   VARSPLIC   1279   1331       MISSING (IN ISOFORM 5).
SQ   SEQUENCE   1493 AA;  163159 MW;  441DE919D5E17C0E CRC64;



Query Match 8.7%; Score 760; DB 1; Length 1493; 

Best Local Similarity 23.2%; Pred. No. 1.1e-26;
Matches 385; Conservative 220; Mismatches 608; Indels 446; Gaps 72; 

Qy 70 IVEHPSDLIVSKGEPATLNCKAEGRPTPTIEWYKGGERVETDKDDPRSHRMLLPSGSLFF 129 

Mill: :l III I |:| III I I : : II I III |||| 
Db 67 LVE-PVDTLSVRGSSVILNCSAYSEPSPNIEWKKDGTFLNLESDD— RRQLLPDGSLFI 122 

Qy 130 LRIVHGRKSRPDEGVYVCVAR-NYLGEAVSHNASLEVAILRDDFRQNPSDVMVAVGEPAV 188



Ml : -hill I 1 1 1 : 1 1 1 1 I I 1 1 I I I I III: 
Db 123 SNWHSKHNKPDEGFYQCVATVDNLGTIVSRTAKLTVAGL-PRFTSQPEPSSVYVGNSAI 181 

Qy 189 MECQPPRGHPEPTISWKKDGSPLDDKDERITIRGGKLMITYTRKSDAGKYVCVGTNMVGE 248 

: I: I : II I : : I |:|: Mill: : 
Db 182 LNCE-VNADLVPFVRWEQNRQPLLLDDRIVKLPSGTLVISNATEGDGGLYRCIVESGGPP 240 

Qy 249 RESEVAELTVLERPS FVKRPSNLAVTVDDSAEFKCEARGDPVPTVRWRKDDGEL 302 

: h III U: I I: III:: II I I I I III |:: I 
Db 241 KFSDEAELKVLQDPEEIVDLVFLMRPSSMMKVTGQSAVLPCWSGLPAPWRWMKNEEVL 300 

Qy 303 — PKSRYEIRDDHTLKIRKVTAGDMGSYTCVAENMVGKAEASATLTVQEPPHFWKPRD 359 

I : 1:1 II I hi 1:1:1 II I till II |: :| : 
Db 301 DTESSGRLVLLAGGCLEISDVTEDDAGTYFCIADNGNKTVEAQAELTVQVPPGFLKQPAN 360 

Qy 360 QWALGRTVTFQCEATGNPQPAIFWRREGSQNLLFSYQPPQSSSRFSVSQTGDLTITNVQ 419 

: I'M I III:!: : | | : : :| : : 

Db 361 IYAHESMDIVFECEVTGKPTPTVKWVKNGDWI PSDNFKIVKEHNLQVLGLV 412 

Qy 420 RSDVGYYICQTLNVAGSIITKAYLEVT--DVIADRPPPVIRQGPVNQTVAVDGTFVLSCV 477 

:H l:| I ' I I: II: II II I I 

Db 413 KSDEGFYQC I AENDVGNAQAGAQLI ILEHDVAI PTLPP TSLTSATTDHLAP 463 

Qy 478 ATGSPVPTILWRKDGVLVSTQDSRIKQLENGVLQIRYAKLGDTGRYTCIASTPSGE-ATW 536 

II 1:1: ' UN I: II : II I |: I: 
Db 464 ATTGPLPSAPRDWASLVST RFIKL TWRTPASDPHGDNLTY 504 

Qy 537 SAYIEVQEFGVPVQPPRPTDPNLIPSAPSKPEVTDVSRNTVTLSWQPNLNSGATPTSYII 596 

I : = If : I I I : :|| 

Db 505 SVFYTKE- -GVDRERVENT SQPGEMQVT 530 

Qy 597 EAFSHASGSSWQTVAENVKTETSAIKGLKPNAIYLFLVRAANAYGISDPSQISDPVKTQD 656 

I: I I :hl I I I :| I I |:|| 
Db 531 - IQNLMPATVY IFKVMAQNKHG - SGESSAPLRVETQP 565 

Qy 657 VLPTSQGVDHKQVQRELGNAVLHLHNPTVLSSSSIEVHW-TVDQQSQYIQGYKILYRPSG 715 

: I: I : :l I II II I : II lh I I 
Db 566 EV -QLPGPAPNIRAYATSPT SITVTWETPLSGNGEIQNYKLYYMEKG 611 

Qy 716 ANHGESDWLVFEVRTPAKNSWIPDLRKGVNYEIKARPFFNEFQGADSEIKFAKTLEEAP 775 

: I I I- : :| I 1:1 | : : | :: :|| : | 
Db 612 TDK-EQDIDV SSHSYTINGLKKYTEYSFRWAYNKHGPGVSTQDVAVRTLSDVP 664

Qy 776 SAPPQGVTVSKNDGNGTAILVSWQPPPEDTQNGMVQEYKVWCLGNETRYHINKT-VDGST 834 

II II ::: ' I :|:: Mil Mil : II: : : :| I |: 

Db 665 SAAPQNLSLEVR--NSKSIVIHWQPPSSTTQNGQITGYKIRYRKASRKSDVTETLVTGTQ 722 

Qy 835 FSWIPFLVPGIRYSVEVAASTGAGSGVKSE PQFIQLDAHGNPV 878 

I :l I I I: III I 1:1 :: I: : I I: 

Db 723 LSQLIEGLDRG TEYNFRVAALTVNGTGPATDWLSAETFESDLDETRVPE - VPSSLHVRPL 781 

Qy 879 SPEDQVSLAQQISDWKQPAFIAGIGAACWIILMVFSIWLYRHRKKRNGLT 929 

11:1 : II: I III: : I |: | 
Db 782 VTSIWSWTPPENQ NIWRGYAIGYGIGSPHAQTIKVD — YKQR 823 

Qy 930 STYAGIRKV-PSFTFTPTV-TYQRGGEAV — SSGGRP GLLNISEP-AAQPW 975 

II : II : I: : II : |: II I: I 

Db 824 -YYTIENLDPSSHYVITLKAFNNVGEGIPLYESAVTRPHTDTSEVDLFVINAPYTPVPD 881 

Qy 976 LADTWPNTG-.— -NNHNDCSISCCTAGNGNSDSNLTTYSRPADCIANYNNQLDNKQTNL 1030 

I I :|: I: :|::| : : I : I :|| 

Db 882 PTPMMPPVGVQASILSHDTIRITW ADNSLPKHQKITD- -SRYYTV- -RWKTN- 929 

Qy 1031 MLPESTVYGDVDLSNKINEMKTFNSPN LKDGRFVNPSGQPTPYATTQLIQSN 1082 

:| :'l I :-: : :: : I II : II : II :|: :: 

Db 930 •IPANTKYKNAN-ATTLSYLVTGLKPNTLYEFSVMVTKGRRSSTWSMTAHGATFELVPTS 987 

Qy 1083 LSNNMNNGSGDSGEK-— HWKPLGQQKQEV APVQYNIVEQ— NKLN 1123 

' :: I :': : :|:| : :: | : ;:| |:| 

Db 988 PPKDVTWSKEGKPRTIIVNWQPPSEANGKITGYIIYYSTDVNAEIHDWVIEPWGNRLT 1047 

Qy 1124 KDYRANDTVPPTIPYNQSYDQNTGG SYNSSDR-GSSTSGSQGHKKG 1168 
:: : M I : :|: I :|||: II I II 
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Db 1048 - -HQIQELTLDTPYYFKIQARNSKGKGPMSEAVQFRTPKADSSDKMPNDQALGSAG- -KG 1103 

Qy 1169 ARTPKVPKQGGMNWADLLPPPPAHPPPHSNSE — ETHI 1204 

:| I : :| II II : : |: 

Db 1104 SRLPDL GSDyKPPMSGSNSPHGSPTSPLDSNMLLVIIVSVGVITIVWWIAV 1156 

Qy 1205 SVDES YDQEMPC ■ PVPPARMYLQQDELEEEEDERGPTP * PVR 1244 

II: I: : I I I ::: : II : :: I I II 
Db 1157 PCTRRTTSHQKKKRAACKSVNGSHKYKGNCKDVKPPDLWIHHERLELKPIDKSPDPNPV- 1215 

Qy 1245 G AAS S PAAVS Y SHQST ATLT PS PQEELQPMLQDC PEETGHMQ HQPD 1290 

III ::: I : : : | : |: : 
Db 1216 MTDTPIPRNSQDITP-VDNSMDSNIHQRRNSYRGHESEDSMSTLAG 1260 

Qy 1291 -RRRQPVSPPP — PPRPISPPHTYGYISGPLVSDMDTDAPEEEEDEADMEVAKMQTRR 1345 

I : I 1 1 : 1 : | |: I I I : : 
Db 1261 RRGMRPKMMMPFDSQPPQPVISAH PIHS — LDNPHHHFHSSSL 1301 



Qy 1346 LLLRGLEQTPA SSVGDLESSVTGSMINGWGSASEEDNISSGRSSVSSSDGSFF 1398

:|| || : :|: |: : | I I : :|| : 
Db 1302 ASPARSHLYHPSSPWPIGTSM--SLSDRANSTESVRNTPSTDTMPASSSQTCC 1352

Qy 1399 TDADFAQAVAAAAEYAGLKVARRQMQDAAGRRHFHASQCPRPTSPVSTDSNMSAAVMQKT 1458
II : I :: I 1 hi I II: I 
Db 1353 TDHQDPEG - ATSSSY -- LASSQEED SGQSLPTAHV 1384



Qy 1459 RPAKKLKHQPGHLRRETYTDDLPPPPVPP---PAIKSPTAQSKTQLE 1502

II: II :: III II II: I |: II 

Db 1385 RPSHPLK SFAVPAIPPPGPPLYDPALPSTPLLSQQALEPSTFHSVKTASIG 1435

Qy 1503 VRPVWPKLPSMDARTDRSSDRKGSSYKGREV 1534

Mill 1:11: III: h 
Db 1436 TLGRSRPPMPVWPSAPEVQ-ETTRMLEDSESSYEPDEL 1473



RESULT 4
NEO1_CHICK
ID NEO1_CHICK STANDARD; PRT; 1443 AA.
AC Q90610; 
DT 01-OCT-2000 (Rel. 40, Created) 
DT 01-OCT-2000 (Rel. 40, Last sequence update) 
DT 01-OCT-2000 (Rel. 40, Last annotation update) 
DE NEOGENIN (FRAGMENT). 
OS Gallus gallus (Chicken).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Archosauria; Aves; Neognathae; Galliformes; Phasianidae; Phasianinae;
OC Gallus.
RN [1]

RP SEQUENCE FROM N.A.

RC STRAIN=WHITE LEGHORN; TISSUE=EMBRYONIC BRAIN;
RX MEDLINE=95105243; PubMed=7806578;
RA Vielmetter J., Roman J.M., Dreyer W.J.;
RT "Neogenin, an avian cell surface protein expressed during terminal
RT neuronal differentiation, is closely related to the human tumor
RT suppressor molecule deleted in colorectal cancer.";
RL J. Cell Biol. 127:2009-2020(1994).

CC -!- FUNCTION: MAY BE INVOLVED AS A REGULATORY PROTEIN IN THE
CC TRANSITION OF UNDIFFERENTIATED PROLIFERATING CELLS TO THEIR
CC DIFFERENTIATED STATE. MAY ALSO FUNCTION AS A CELL ADHESION
CC MOLECULE IN A BROAD SPECTRUM OF EMBRYONIC AND ADULT TISSUES.
CC -!- SUBCELLULAR LOCATION: TYPE I INTEGRAL MEMBRANE PROTEIN.
CC -!- DEVELOPMENTAL STAGE: IN RETINA, EXPRESSED ON GANGLION CELL FIBERS
CC AS SOON AS THEY BEGIN TO EXTEND THEIR AXONS.
CC -!- SIMILARITY: CONTAINS 4 IMMUNOGLOBULIN-LIKE C2-TYPE DOMAINS.
CC -!- SIMILARITY: CONTAINS 6 FIBRONECTIN TYPE III-LIKE DOMAINS.
CC -!- SIMILARITY: BELONGS TO THE IMMUNOGLOBULIN SUPERFAMILY. STRONG, TO
CC TUMOR SUPPRESSOR PROTEIN DCC.



CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
DR EMBL; U07644; AAC59662.1; -.
DR HSSP; P80362; 1WTL.
DR INTERPRO; IPR001777; -.
DR INTERPRO; IPR003006; -.
DR PFAM; PF00041; fn3; 6.
DR PFAM; PF00047; ig; 4.






KW   Transmembrane; Immunoglobulin domain; Glycoprotein.
FT   NON_TER       1      1
FT   DOMAIN       <1   1090       EXTRACELLULAR (POTENTIAL).
FT   TRANSMEM   1091   1111       POTENTIAL.
FT   DOMAIN     1112   1443       CYTOPLASMIC (POTENTIAL).
FT   DOMAIN       33    102       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      132    194       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      229    293       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      321    383       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      422    519       FIBRONECTIN TYPE-III.
FT   DOMAIN      522    615       FIBRONECTIN TYPE-III.
FT   DOMAIN      616    714       FIBRONECTIN TYPE-III.
FT   DOMAIN      720    814       FIBRONECTIN TYPE-III.
FT   DOMAIN      835    935       FIBRONECTIN TYPE-III.
FT   DOMAIN      936   1037       FIBRONECTIN TYPE-III.
FT   DISULFID     40     95       BY SIMILARITY.
FT   DISULFID    139    187       BY SIMILARITY.
FT   DISULFID    236    286       BY SIMILARITY.
FT   DISULFID    328    376       BY SIMILARITY.
FT   CARBOHYD     39     39       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    176    176       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    292    292       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    456    456       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    475    475       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    625    625       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    700    700       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    894    894       N-LINKED (GLCNAC...) (POTENTIAL).
SQ   SEQUENCE   1443 AA;  158050 MW;  558C6795579C0E26 CRC64;



Query Match 8.7%; Score 757.5; DB 1; Length 1443; 

Best Local Similarity 23.2%; Pred. No. 1.3e-25; 

Matches 378; Conservative 207; Mismatches 681; Indels 361; Gaps 64; 

Qy 57 TGSRLRQEDFPP-RIVEHPSDLIVSKGEPATLNCKAEGRPTPTIEWYKGGERVETDKDDP 115 

III :l II : I I:: :| :ll : I III I I : II 
Db 9 TGSWR-TFTPFYFLVEPMDILSVRGASVIMNCSSYCETPPKIEWKKDGTLLNLVSDD- 65 

Qy 116 RSHRMLLPSGSLFFLRIVHGRKSRPDEGVYVCVAR-NYLGEAVSHNASLEVAILRDDFRQ 174 

I III IN :ll : ::llll I III II II I I II I I 

Db 66 - -RRQLLPDGSLLINSWHSKHNKPDEGYYQCVATVESLGSIVSRTAKLTVAGL- PRFTS 122 

Qy 175 NPSDVMVAVGEPAVMECQPPRGHPEPTISWKKDGSPLDDKDERITIRGGKLMITYTRKSD 234 

I I \j I:: I: I : |::| II I : I hi :l 
Db 123 QPELSSVYKGNSAILNCE-VNVDLAPFVRWEQDRQPLSLDDRVFKLPSGALLIGNATDTD 181 

Qy 235 AGKYVCVGTNMVGERESEVAELTVLERPS FVKRPSNLAVTVDDSAEFKCEARGD 288 

I I II :' : II III M I ll::||:l :| I I I I 

Db 182 GGFYRCVIESGGTPKYSEEAELKILPDPEEPQSLVFVRQPSSLTKVTGQNAVFPCVAGGF 241 

Qy 289 PVPTVRWRKDDGEL- - -PKSRYEIRDDHTLKIRKVTAGDMGSYTCVAENMVGKAEASATL 345 

I I III I:'/ II I: I :| I II 1 : 1 : 1 1 1 : 1 : 1 II I I 
Db 242 PTPYVRWTKNGEELITEDSERFALRAGGSLLISDVTEEDVGTYTCIADNENETIEAQAEL 301 

Qy 346 TVQEPPHFVVKPRDQWALGRTVTFQCEATGNPQPAIFWRREGSQNLLFSYQPPQSSSRF 405 

II II I: <:| : : Mil II I I : I : I : I I 

Db 302 AVQVPPEFLKP.PANIYAHESMDIVFECEVTGKPTPTVKWVKNGDWIPSDY F 353 

Qy 406 SVSQTGDLTITNVQRSDVGYYICQTLNVAGSIITKAYLEVTDVIADRP--PPVIRQGPVN 463 

: : :l :' : :|l Mill: I I : |: I II I 
Db 354 KIVKEHNLQVLGLVKSDEGFYQCIAENDVGNAQAGAQLIILDLDVAIPTLPPTSLTSATN 413 

Qy 464 QTVAVDGTFVLSCVATGSPVPTILWRKD'GVLVSTQDSRIKQLENGVLQIRYAKLGDTGRY 523 
:| • , II hll llll I: M 



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■ • PATTGPLPT APRDWATLVST - ■ 



524 TCIASTPSGEATWSAYIEVQEFGVPVQPPRPTDPNLIPSAPSKPEVTDVSRNTVTLSWQP 583 



II 

■-Tra- 



il I: II I I ::! I : :| 
■TPVSDPQ--GDNLTYSIFYTKE--GINRERVENTSRP 479 



Qy 584 NLNSGATPTSYIIEAFSHASGSSWQTVAENVKTETSA-IKGLKPNAIYLPLVRAANAYGI 642 
II I: I I :hl MM 
) G } ETQVMIQNLMPETVYVFRWAQNKHGH 507 



Qy 643 

: I hi 
Db 508 GESSA—-PLK VATQPE r ' 



SDPSQISDPVKTQDVLPTSQGvDHKQVQRELGNAVLHLHNPTVLSSSSIEVHW-TVDQQS 7 01 
I: I : :|| I: I I I : 
■ -QLPGPAPNIRAYAGSPT SVTVTWETPLSGN 552 



:N : III II 
5 STQDVWRTLSDVPSAAPQNL 



Qy 702 QYIQGYKILYRPSGANHGESDftLVFEVRTPAKNSWIPDLRRGVNYEIKARPFFNEFQGA 761 

II II: I *| : II 
Db 553 GEIQNYKLYYMEKGQD-SEQDtDV- 



I I I I hi I : : I 

■AGLSYTITGLKKYTEYSFRWAYNKHGPGV 605 



762 DSEIKFAKTLEEAPSAPPOGVTVSKNDGNGTAILVSWQPPPEDTQNGMVQEYKVWCLGNE 821 



III! I :| : II: 
'LEAR- - NSKS IMLHWQPPPAGTHSGQ ITGYKIRYRKVS 663 



Qy 822 TRYHINKIVDGSTFSWIPFLVPGIRYSVEVAASTGAGSG VKSEPQFIQLD — 872 

: : ::| h :| l] I h :ll I hi I :l II 
Db 664 RKSDVTESVGGTQLFQLIEGLERGTEYNFRIAAMTVNGTGPATDWVSAETFESDLDESRV 723 



Qy 873 
Db 



■AHGNPV- 

I h 

724 PEVPSSLHVRPLVTSIWSHTtPENQ' 



■ 1PEDQVSLAQQISDWKQPAFIAGIGAACWIILMVFSIW 917 
"11:1 : II: I llh : I 

NIWRGYAIGYG IGSPHAQT IKVD - - - 773 



918 LYRHRKKRNGLTSTYAGIRKV 

hi II 
774 -YKQR' 



PSFTFTPTV-TYQRGGEAV— -SSGGRP GL 964 
II : h : II : I: II I 
■YYTIENLDPSSHYVITLKAFNNVGEGIPLYESAVTRPHSDTSEVDL 823 



965 LNISEP-AAQPWLADTWPNTG 

hi I : I 



Db 824 FVINAPYTPVPDPSPMMPPVGVQASILSHDTIRITW- 



Qy 1019 YNNQLDNKQTNLMLPESTVYGfcVDLSN- 

I :|| :| :| T 
Db 875 YYTV--RWKTN- 



IPANTKYKf ANATTLSYLVTGLKPNTLYEFSVMVTKGRR' 



Qy 1071 TP YATT -QLIQSNLSNNMNNG 

I : II :| 

Db 929 



• -NNHNDCSISCCTAGNGNSDSNLTTYSRPADCIAN 1018 

:h h =h:| 



■■ADNSLPKNQKITD--AR 874 



KINEMKTFNSPNLKDGRFVNPSGQP 1070 
I I : I: II:: 

SSTWSM 928 



5GDSGEK-- ■ -HWKPLGQQKQEVAP-VQYNIVEQNKLNK 1124 

:hl : :: : I 



TAHGTTFELVPTSPPKDVTW|KEGKPRTIIVNWQPPSEANGKITGYIIYYSTDVNAEIH s 

•1125 DYRANDTVPPTIPYN-QSYDQjTTGGSYNSSDRGSSTSGSQGHKKGARTPKVPKQGGMNWA 1183 
!: I : : I I : I I I III! 

Db 989 DWVIEPWGNRLTHQIQELTL|)TPYYFRIQARNSKGMGPMSEAVQFRTPKAES S 1042 

Qy 1184 DLLPPP - PAHPPPHSNSEEYNISVDES YDQEMPCP 1217 

I :| M : II I ! : I II 

Db 1043 DRMPNDQASGSAGRGSRPVDVGPDYKPPLSGSNSPHGSPTSPLDSSMLLVIIVSVGVITI 1102 

Qy 1218 --VPPARMYLQQDELEEEEDERGPTPPVRGAASSPAAVSYSHQSTATLTPSPQEELQPML 1275 

I :: : :: :| II h :: : I : 

Db 1103 VI WI VAVFCT RRTT SHQKKKRAAC KS VNG SHKYKGNSKDVKPPDLWI — 1150 

Qy 1276 QDCPEETGHMQHQPDRRRQPVSPPPPPRPI---SPPHTYGYISGPLVSDMDTDAPE— - 1328 

I :h I I II :| h : lh: : 

Db 1151 HHERLELKPIDKSPDPNPIMTDTPIPRNSQDITPVDNSMDSNIHQRRNS 1199 

Qy 1329 EEEDEADMEVAKMQTRRLLLRGLEQTPASSV GDLES" 1364 

III : I :: : I I III 

Db 1200 YRGHESEDSMSTLAGRRGMRPKMMMPFDSQPPQPVISAHPIHSLDNPHHHFHSGSLASPT 1259 

Qy 1365 - S VTG SMINGW ■ -GSASEEDNI SSG RS SVSSSDGSFFTDADFAQAVAAAAE — YAGLKV 1418 

I :: I h: : :: II :: I I :| I : :| : 

Db 1260 RSYLHHQVSPWPVGTSMSHSDRANSTESVRNTPSSDTMPASSSQPCADHQDPDSSSGAYL 1319 

Qy 1419 ARRQMQDAAGRRHFHASQCPRPTSPVSTDSNMSAAVMQKTRPAKKLKHQPGHLRRETYTD 1478 

I :MI :: I :: : I I I 

Db 1320 GSAQEEDAA QSLPTAHVRPSHPLKSFAVPAVPAAGSAYDP 1359 



1479 DLPPPPV* PPPAIKSPTAQSKTQLEVR- ■ ■ PVWPKLPSMDARTDRSSDRKGS 1527 

II h I ::| II I I lllll I : I I : I 

1360 TLPSTPLLTQQAPSHPVHSVK - - TASIGTLGRTRPPMPVWPSAPDVQ - ETTRMLEDSES 1416 

1528 SYKGREV 1534 

lh h ■ 

1417 SYEPDEL 1423 



RESULT 5 

DSCA_HUMAN

ID DSCA_HUMAN STANDARD; PRT; 2012 AA.

AC O60469; O60468;

DT 01-OCT-2000 (Rel. 40, Created)

DT 01-OCT-2000 (Rel. 40, Last sequence update)

DT 01-OCT-2000 (Rel. 40, Last annotation update)

DE DOWN SYNDROME CELL ADHESION MOLECULE PRECURSOR (CHD2).

GN DSCAM.

OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.

RN [1]

RP SEQUENCE FROM N.A., AND ALTERNATIVE SPLICING.

RC TISSUE=BRAIN;

RX MEDLINE=98087574; PubMed=9426258;

RA Yamakawa K., Huot Y.-K., Haendel M.A., Hubert R., Chen X.-N.,

RA Lyons G.E., Korenberg J.R.;

RT "DSCAM: a novel member of the immunoglobulin superfamily maps in a

RT Down syndrome region and is involved in the development of the

RT nervous system.";

RL Hum. Mol. Genet. 7:227-237(1998).

RN [2] 

RP SEQUENCE FROM N.A.

RX MEDLINE=20289799; PubMed=10830953;

RA Hattori M., Fujiyama A., Taylor T.D., Watanabe H., Yada T.,

RA Park H.-S., Toyoda A., Ishii K., Totoki Y., Choi D.-K., Soeda E.,

RA Ohki M., Takagi T., Sakaki Y., Taudien S., Blechschmidt K., Polley A.,

RA Menzel D., Delabar J., Kumpf K., Lehmann R., Patterson D.,

RA Reichwald K., Rump A., Schillhabel M., Schudy A., Zimmermann W.,

RA Rosenthal A., Kudoh J., Shibuya K., Kawasaki K., Asakawa S.,

RA Shintani A., Sasaki T., Nagamine K., Mitsuyama S., Antonarakis S.E.,

RA Minoshima S., Shimizu N., Nordsiek G., Hornischer K., Brandt P.,

RA Scharfe M., Schoen O., Desario A., Reichelt J., Kauer G., Bloecker H.,

RA Ramser J., Beck A., Klages S., Hennig S., Riesselmann L., Dagand E.,

RA Wehrmeyer S., Borzym K., Gardiner K., Nizetic D., Francis F.,

RA Lehrach H., Reinhardt R., Yaspo M.-L.;

RT "The DNA sequence of human chromosome 21.";

RL Nature 405:311-319(2000).

RN [3]

RP FUNCTION.

RA Agarwala K.L., Nakamura S., Tsutsumi Y., Yamakawa K.;

RT "Down syndrome cell adhesion molecule DSCAM mediates homophilic

RT intercellular adhesion.";

RL Brain Res. Mol. Brain Res. 79:118-126(2000).

CC -!- FUNCTION: CELL ADHESION MOLECULE THAT CAN MEDIATE CATION-

CC INDEPENDENT HOMOPHILIC BINDING ACTIVITY. COULD BE INVOLVED IN

CC NERVOUS SYSTEM DEVELOPMENT.

CC -!- SUBCELLULAR LOCATION: TYPE I MEMBRANE PROTEIN (PROBABLE). THE
CC SHORT ISOFORM MAY BE SECRETED.

CC -!- ALTERNATIVE PRODUCTS: 2 ISOFORMS; A LONG FORM/CHD2-52 (SHOWN HERE)
CC AND A SHORT FORM/CHD2-42; ARE PRODUCED BY ALTERNATIVE SPLICING.

CC -!- TISSUE SPECIFICITY: PRIMARILY EXPRESSED IN BRAIN.

CC -!- SIMILARITY: CONTAINS 10 IMMUNOGLOBULIN-LIKE C2-TYPE DOMAINS.

CC -!- SIMILARITY: CONTAINS 6 FIBRONECTIN TYPE III-LIKE DOMAINS.

CC

CC This SWISS-PROT entry is copyright. It is produced through a collaboration

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).
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us-09-540-245a-18.rsp 






DR EMBL; AF023450; AAC17967.1; -.

DR EMBL; AF023449; AAC17966.1; -. 

DR EMBL; AL163283; CAB90464.1; -. 

DR EMBL; AL163282; CAB90436.1; -.

DR EMBL; AL163281; CAB90444.1; -. 

DR MIM; 602523; -. 

DR INTERPRO; IPR001777; -. 

DR INTERPRO; IPR003006; -. 

DR PFAM; PF00041; fn3; 6. 

DR PFAM; PF00047; ig; 9. 

DR PRINTS; PR00014; FNTYPEIII. 

KW Immunoglobulin domain; Glycoprotein; Signal; Cell adhesion; Repeat;

KW Transmembrane; Alternative splicing. 



1 



17 



FT SIGNAL 

FT CHAIN 

FT DOMAIN 

FT TRANSMEM 1596 1616 



18 2012 
18 1595 



FT DOMAIN 

FT DOMAIN 

FT DOMAIN 

R DOMAIN 
DOMAIN 
DOMAIN 

FT DOMAIN 

FT DOMAIN 

FT DOMAIN 

FT DOMAIN 

FT DOMAIN 

FT DOMAIN 

FT DOMAIN 

FT DOMAIN 

FT DOMAIN 

FT DOMAIN 

FT DOMAIN 

FT DISULFID 

FT DISULFID 

FT DISULFID 

FT DISULFID 

FT DISULFID 

FT DISULFID 

FT DISULFID 

FT DISULFID 

FT DISULFID 



1617 2012 



39 
138 
239 
328 
421 



109 
204 
300 
392 
491 



518 582 
610 676 



704 
802 
885 



773 
872 
972 



984 1076 
1088 1177 
1189 1273 
1300 1366 
1380 1463 
1477 1562 
46 

145 

246 

335 

428 

525 

617 

711 

809 



FT DISULFID 1307 1359 
FT CARBOHYD 28 28 



FT CARBOHYD 
FT , CARBOHYD 
FT ' CARBOHYD 
' CARBOHYD 
A CARBOHYD 
CARBOHYD 
FT CARBOHYD 
FT CARBOHYD 
FT CARBOHYD 
FT CARBOHYD 
FT CARBOHYD 



78 
470 
487 
512 
556 
658 
666 
710 
748 
795 
924 



78 
470 
487 
512 
556 
658 
666 
710 
748 
795 
924 



FT CARBOHYD 1142 1142 
FT CARBOHYD 1160 1160 



1250 1250 
1271 1271 



FT CARBOHYD 1341 1341 
FT CARBOHYD 1488 1488 
FT VARSPLIC 1562 1571 
FT 

FT VARSPLIC 1572 2012 
FT CONFLICT 1893 2012 
FT 
FT 
FT 

SO. SEC 



POTENTIAL. 

DOWN SYNDROME CELL ADHESION MOLECULE. 
EXTRACELLULAR (POTENTIAL) . 
POTENTIAL. 

CYTOPLASMIC (POTENTIAL). 
IG-LIRE C2-TYPE DOMAIN. 
IG-LIKE C2-TYPE DOMAIN. 
IG-LIRE C2-TYPE DOMAIN, 
IG-LIKE C2-TYPE DOMAIN. 
IG-LIKE C2-TYPE DOMAIN, 
IG-LIKE C2-TYPE DOMAIN. 
IG-LIKE C2-TYPE DOMAIN. 
IG-LIKE C2-TYPE DOMAIN. 
IG-LIKE C2-TYPE DOMAIN. 
FIBRONECTIN TYPE-III. 
FIBRONECTIN TYPE-III. 
FIBRONECTIN TYPE-III. 
FIBRONECTIN TYPE-III. 
IG-LIKE C2-TYPE DOMAIN , 
FIBRONECTIN TYPE-III. 
FIBRONECTIN TYPE-III. 
BY SIMILARITY, 



BY SIMILARITY. 
BY SIMILARITY. 
BY SIMILARITY. 
BY SIMILARITY. 
BY SIMILARITY, 
BY SIMILARITY. ■ 
BY SIMILARITY. 
BY SIMILARITY. 
BY SIMILARITY. 
N-LINKED (GLCNAC...)
N-LINKED (GLCNAC...)
N-LINKED (GLCNAC...)
N-LINKED (GLCNAC...)
N-LINKED (GLCNAC...)
N-LINKED (GLCNAC...)
N-LINKED (GLCNAC...)
N-LINKED (GLCNAC...)
N-LINKED (GLCNAC...)
N-LINKED (GLCNAC...)
N-LINKED (GLCNAC...)
N-LINKED (GLCNAC...)
N-LINKED (GLCNAC...)
N-LINKED (GLCNAC...)
N-LINKED (GLCNAC...)
N-LINKED (GLCNAC...)
N-LINKED (GLCNAC...)
N-LINKED (GLCNAC...)
NFATLNYDGS -> KEAARCKEFS (IN SHORT
ISOFORM).

MISSING (IN SHORT ISOFORM).
HRPGDLIHLPPYLRMDFLLNRGGPGTSRDLSLGQACLEPQK
SRTLKRPTVLEPIPMEAASSASSTREGQSWQPGAVATLPQR
EGAELGQAAKMSSSQESLLDSRGHLKGNNPYAKSYTLV ->
IGQVTSYICLHTLEWTFC (IN REF. 1).
2012 AA; 222259 MW; 0E33CFB781A08334 CRC64; 



(POTENTIAL). 
(POTENTIAL) . 
(POTENTIAL) . 
(POTENTIAL) . 
(POTENTIAL). 
(POTENTIAL) . 
(POTENTIAL) . 
(POTENTIAL), 
(POTENTIAL) . 
(POTENTIAL). 
(POTENTIAL) . 
(POTENTIAL) , 
(POTENTIAL). 
(POTENTIAL). 
(POTENTIAL) . 
(POTENTIAL). 
(POTENTIAL) . 
(POTENTIAL) . 



Query Match 8.7%; Score 755; DB 1; Length 2012;

Best Local Similarity 22.3%; Pred. No. 2.6e-26;

Matches 400; Conservative 245; Mismatches 686; Indels 464; Gaps 8 1 

Qy 64 EDFPPRIVEHPSDLIVSKGEPATLNCKAEGRPTPTIEWYKGGERVETDKDDP — RSHR 119 

II hi: : :|| II :| I ;| I III I I III III 

Db 403 EDGTPKIISAFSEKVVSPAEPVSLMCNVKGTPLPTITW TLDDDPILKGGSHR 454 

Qy 120 — MLLPSGSLFFLRIVHGRKSRPDEGVYVCVARNYLGEAVSHNASLEVAILRDDFRQNP 176 

I: I:! : : M III I II I I : I : I I I 
Db 455 I SQMIT SEGNWSYLN I SS SQVR - DGGVYRCT ANNSAG - WLYQARI NV - - -RGPASIRP 509 

Qy 177 SDVMVAV-GEPAVMECQPPRGHPEPTISWKKDGSPLDDKDERITI-RGGKLMITYTRKS- 233 

: |: I i : |: hi :| I |: : I :: I I :: :| 
Db 510 MKNITAIAGRDTYIHCR-VIGYPYYSIKWYKNSNLLPFNHRQVAFENNGTLKLSDVQKEV 568 

Qy 234 DAGKYVCVGTNMVGERE---SEVAELTVLERPSFVKRPSNLAVTVDDSAEFKC-EARGDP 289 

I |:| I |:: : : I: :|| : I |:: :: I II 
Db 569 DEGEYTC— NVLVQPQLSTSQSVHVTV-KVPPFIQPFEFPRFSIGQRVFIPCVWSGDL 624 

Qy 290 VPTVRWRKDDGELPKSRYEIRDD — HTLKIRKVTAGDMGSYTCVAENMVGKAEASATL 345 

I: 1:11 ; :| I |: :|:| :: MIM I I : I 
Db 625 PITITWQKDGRPIPGSLGVTIDNIDFTSSLRISNLSLMHNGNYTCIARNEAAAVEHQSQL 684

Qy 346 TVQEPPHFWKPRDQWALGRTVTFQCEATGNPQPAIFWRREGSQNLLFS YQP 398
1 I: II llhllll I: I I I I I I I I: II Ml 

Db 685 IVRVPPKFWQPRDQDGIYGKAVILNCSAEGYPVPTIVWK FSKGAGVPQFQP 736 

Qy 399 PQSSSRFSVSQTGDLTITNVQRSDVGYYICQTLNVAGSIITKA-YLEVTDVIADRPPPVI 457 

: I 'III: I llhl: I |: ::|: III : I =1 
Db 737 IALNGRIQVLSNGSLLIKHWEEDSGYYLCKVSNDVGADVSKSMYLTV KIPAMI 790 

Qy 458 RQGPVNQTVAVDG-TFVLSCVATGSPVPTILWRKDGVLVSTQDSR-IKQLENG V 509 

I I |:|- I :|l I I : | |: ::: : :| : I I 
Db 791 TSYP-NTTLATQGQKKEMSCTAHGEKPIIVRWEKEDRIINPEMARYLVSTKEVGEEVIST 849

Qy 510 LQIRYAKLGDTGRYTCIASTPSGEATWSAYIEVQEFGVPVQPPRPTDPNLIPSAPSKPEV 569 

l!l 1:1 ::l III Mil I I : I: ' 

Db 850 LQILPTVREDSGFFSCHAINSYGEDRGIIQLTVQE PPDPPEIEI 893 

Qy 570 TDVSRNTVTLSWQPNLNSGATPTSYIIEAFSHASGSSWQTVAENVK TETSAIKGL 624 

II 1:111 : : I I II : II : |: I " I : 
Db 894 KDVKARTITLEWTMGFDGNSPITGYDIECKN-KSDSWDS-AQRTKDVSPQLNSATIIDI 950 

Qy 625 KPNAIYLFLVRAANAYGISDPSQISDPVKTQDVLPTSQGVDHKQVQRELGNAVLHLHNPT 684 

I:: I M I I Ml ' '• I : =11 
Db 951 HPSST YS IRMYAKNRIGKSEPSN - ELT IT ADEAAPDGPPQE VHLE-" 994 

Qy 685 VLSSSSIEVHWTVDQ- - -QSQYIQGYKILYRPSGANHGESDWLVFEVRTPAKNSV-VIPD 740 

:|| II I I : I: MM II | : : I | : I : : 
Db 995 PISSQSIRVTVIKAPKKHLQNGIIRGYQIGYREYSTG-GNFQFNIISVDTSGDSEVYTLDN 1053 

Qy 741 LRKGVNYEIKARPFFNEFQGADSEIKFAKTLEEAPSAPPQGVTVSKNDGNGTAILVSWQP 800 

Ml:-: ||: |||: || ||: I : :| :|| 
Db 1054 LNKFTQYGLWQACNRAGTGPSSQEIITTTLEDVPSYPPENVQAIAT--SPESISISWST 1111 

Qy 801 PPEDTQNGMVQEYKV-WCLGNETRYHINKTVDGSTFSWIPFLVPGIRYSVEVAASTGA 858 

:: ||::|: M I : | : : |: : I ||::| I I I 
Db 1112 LSKEALNGILQGFRVIYWANLMDGELGEIKNITTTQPSLELDGLEKYTNYSIQVLAFTRA 1171 

Qy 859 6SG7KSEPQF1QL- -DAHGNPVSPEDQVSIAQQ ISD WKQPAFI AG IGAACWIILMVFS I 916 

I Ihll M III 1 1 : II M 

Db 1172 GDGVRSEQIFTRTKEDVPGPP AGVKAAAASASMVFVS 1208 

Qy 917 WLYRHRKKRNGLTSTYAGIRKVP SFTF- TPTVTYQRGGE AVSS 9:6 

II I II: I I Ih: I :: I M 

Db 1209 WL-PPLKLNGIIRKYTVFCSHPYPTVISEFEASPDSFSYRIPNLSRNRQYSVWWAVTS 1256 

Qy 959 GGRPG r LLNISEPAAQPWLADTWPNTGNNHNDCSISCCTAGNGN- 1001 

II :l II: I M I: : 

Db 1267 AGRGNSSEIITVEPLAKAPARILTFSGTVTTPWM KDIVLPCKAVGDPSP 1315 



Qy 1002 SDSNLTTYSRPAD- 



•CIANYNNQLD 1024 



Mon Jan 22 13:04:42 2001 us-09-540-245a-18.rsp
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III I I MM I I 

Db 1316 AVKWMKDSNGTPSLVTIDGRRSIFSNGSFnRTVKAEDSGYYSCIANNNWGSDEIILNLQ 1375 

Qy 1025 NKQTNLMLPESTVYGD VDLSNKINEM KT 1052 

:| I: : I : II : I :l :: 

Db 1376 VQVPPDQPRLTVSKTTSSSITLSWLPGDNGGSSIRGYILQYSEDNSEQWGSFPISPSERS 1435



Qy 



Db 



Qy 



Oy 1053 FNSPNLKDGRF ■ 

: III I : 
Db 1436 YRLENLKCGTWYKFTLTAQN( 



m?S GQPTPYATTQLIQSNLSNNMNN" 1089 

I |: :: I : :::: 

rGPGRISEIIEAKTLGKEPQFSKEQELFASINTTRVRLN 1495 



Qy 1090 --GSGC 

, 

Db 1496 LIGWNDGGCPITSFTLEYRPI 



EKHWKPLJ5QQKQEVAPVQYNIVEQNKLNKDY 1126 

: |:| I 
■TTVWTTAQRTSLSKSYILYDLQEATWYELQM 1548 



Qy 1127 

Mb 1549 RVCNSAGCAEKQANFATLNYt|STIPPLI 

Wy 1167 

Db 1596 



DTVPPT IPYNQS YDQNTGGS YNSSDRGSSTSGSQGHK ■ 1166 
hll I :| II : I :| ::l I 

:— KSWQB EEGLTT-NEGLKM 1595 



LVT ISCILVGVLLLFVLLLV\ 1RRRREQRLKRLRDAKSLAEML ■ 



■ KG ART PKVPK - QGGMNWADLLPPPPAHPPPHSNSEEYN 1203 

I :: : : : |::| |: : 

■MSKNTRTSD 1647 

■ 1257 



Qy 1204 ISVDESYDQEMPCPVPPARM1 0QDELEEEEDERGP- TPPVRGAASSPAAVSYSH" 

: i I :l I::'::: : I hi III: ::: :| 
Db 1648 TLSKQQQTLRMHIDIPRAQLLIEERDTMETIDDRSTVLLTDADFGEAAKQKSLTVTHTVH 1707 

Qy 1258 -QSTATliTPSPQEELQPMLQDCPEETGHMQHQPDRRRQPVSPPPP PRP 1304 

• II : -1 I:: I I II : I II 

Db 1708 YQSVSQfe GPLVDVSDARPG- - -TNPTTRRNAKAGPTARNRYASQWTLNRPHP 1757 

Qy 1305 I S P PHTf G Y I SGPLVSDMDTDAP EEEEDEADMEVAKMQTRRLLLRGLEQTPASS 1358 

III I : I ::l I : :: I || 
Db 1758 TISAHT^ LTTDWRLPTPRAAGSVDKESDSYSVSPSQDTDR ARSS 1800 



Qy 1359 VGDLE--;SSVTGSMINGWGSASEEDNISSGRSSVSSSDGSFFTDADFAQAVAAAAEYAGL 1416 

: I ill : : | I: : : ::: I :| Mil 
Db 1801 MVSTES&STYEEIAMEHAKMEEQLRHAKFTITE- - "CFISDTSSEQLTAGTNEYIDS 1857 

1417 kvarrqmqdaagrrhfhasqcprptspvsteIskmsaa^qktrpakklkhqpghlrrety 1476 

: .': =1 III II j |: I lfc : II I I Ml : 

1858 ltsstpse - -sgicrftas — ppkpqdggrvmnma^>kahrpg* dlihlppylrmdfl 1910 
1477 tddlppppvpppaiksptaq-sktqlevrpvWpklpsmdartdrssdrkgssyk 1530 

: I : h = II 'I : :| |:| : II hi h: 

1911 lnrggpgtsrdlslgqaclepqksrtlkrptvlepip-meaassasstregqswqpgava 1969 



Wi 1531 — GREVLDGRQWDMRTNPGDPREAQEQQNDGKG- - RGNKAAKRDLPPAKTHLI 1580 
IM I I : :|l I :! :ll I lh: : 
Db 1970 tlpqregaelgqaakmss SQESLLDSRGHLKGNN PYAKSYTL 2011 



RESULT 6 
NEO1_RAT

ID NEO1_RAT STANDARD; PRT; 1377 AA.

AC P97603;

DT 01-OCT-2000 (Rel. 40, Created)

DT 01-OCT-2000 (Rel. 40, Last sequence update)

DT 01-OCT-2000 (Rel. 40, Last annotation update)

DE NEOGENIN PRECURSOR (FRAGMENT).

GN NEO1 OR NGN.

OS Rattus norvegicus (Rat). 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Rattus. 

RN [1] 

RP SEQUENCE FROM N.A. 

RC TISSUE=BRAIN; 

RX MEDLINE-97015074; PubMed-8861902; 

RA Keino-Masu K., Masu M., Hinck L., Leonardo E.D., Chan S.S.-Y.,

RA Culotti J.G., Tessier-Lavigne M.; 

RT "Deleted in Colorectal Cancer (DCC) encodes a netrin receptor."; 

RL Cell 87:175-185(1996).

CC -!- FUNCTION: MAY BE INVOLVED AS A REGULATORY PROTEIN IN THE
CC TRANSITION OF UNDIFFERENTIATED PROLIFERATING CELLS TO THEIR
CC DIFFERENTIATED STATE. MAY ALSO FUNCTION AS A CELL ADHESION
CC MOLECULE IN A BROAD SPECTRUM OF EMBRYONIC AND ADULT TISSUES.

CC -!- SUBCELLULAR LOCATION: TYPE I INTEGRAL MEMBRANE PROTEIN.

CC -!- SIMILARITY: CONTAINS 4 IMMUNOGLOBULIN-LIKE C2-TYPE DOMAINS.

CC -!- SIMILARITY: CONTAINS 6 FIBRONECTIN TYPE III-LIKE DOMAINS.

CC -!- SIMILARITY: BELONGS TO THE IMMUNOGLOBULIN SUPERFAMILY. STRONG, TO
CC TUMOR SUPPRESSOR PROTEIN DCC.

cc 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).

cc - 

DR EMBL; U68726; AAB41100.1; -. 

DR HSSP; P56276; 1TLK. 

DR INTERPRO; IPR001777; -. 

DR INTERPRO; IPR003006; -. 

DR PFAM; PF00041; fn3; 6. 

DR PFAM; PF00047; ig; 4. 

KW Transmembrane; Immunoglobulin domain; Glycoprotein; Signal.



FT NON_TER       1      1
FT SIGNAL       <1      2   POTENTIAL.
FT CHAIN         3   1377   NEOGENIN.
FT DOMAIN        3   1074   EXTRACELLULAR (POTENTIAL).
FT TRANSMEM   1075   1095   POTENTIAL.
FT DOMAIN     1096   1377   CYTOPLASMIC (POTENTIAL).
FT DOMAIN       36    105   IG-LIKE C2-TYPE DOMAIN.
FT DOMAIN      135    197   IG-LIKE C2-TYPE DOMAIN.
FT DOMAIN      232    296   IG-LIKE C2-TYPE DOMAIN.
FT DOMAIN      324    386   IG-LIKE C2-TYPE DOMAIN.
FT DOMAIN      405    502   FIBRONECTIN TYPE-III.
FT DOMAIN      505    598   FIBRONECTIN TYPE-III.
FT DOMAIN      599    698   FIBRONECTIN TYPE-III.
FT DOMAIN      704    798   FIBRONECTIN TYPE-III.
FT DOMAIN      819    919   FIBRONECTIN TYPE-III.
FT DOMAIN      920   1021   FIBRONECTIN TYPE-III.
FT DOMAIN     1087   1090   POLY-VAL.
FT CARBOHYD     42     42   N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD    179    179   N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD    295    295   N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD    439    439   N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD    458    458   N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD    608    608   N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD    684    684   N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD    878    878   N-LINKED (GLCNAC...) (POTENTIAL).
SQ SEQUENCE 1377 AA; 150637 MW; E514ED8ABD1A63A9 CRC64;



Query Match 8.6%; Score 746.5; DB 1; Length 1377;

Best Local Similarity 22.6%; Pred. No. 3.8e-26;

Matches 361; Conservative 213; Mismatches 636; Indels 385; Gaps 

Qy 67 PPRIVEHPSDLIVSKGEPATLNCKAEGRPTPTIEWYKGGERVETDKDDPRSHRMLLPSGS 126 

I : I I' : : Ml I |:| III I I : I I III II 

Db 21 PLYFLVEPVDTLSVRGSSVILNCSAYSEPSPNIEWKKDGTFLNLVSDD- - -RRQLLPDGS 77 

Qy 127 LFFLRIVHGRKSRPDEGVYVCVAR-NYLGEAVSHNASLEVAILRDDFRQNPSDVMVAVGE 185 

II :H : ::IHI I III : II II I I II I I I : II 

Db 78 LFISNWHSKHNKPDEGFYQCVATVDNLGTIVSRTAKLAVAGL-PRFTSQPEPSSIYVGN 136 

Qy 186 PAVMECQPPRGHPEPTISWKKDGSPLDDKDERITIRGGKLMITYTRKSDAGKYVCVGTNM 245 

:: |: • I : |::: II MM hh : I I I h : 
Db 137 SGILNCE-VNADLVPFVRWEQNRQPLLLDDRIVKLPSGTLVISNATEGDEGLYRCIVESG 195 

Qy 246 VGERESEVAELTVLERPS FVKRPSNLAVTVDDSAEFKCEARGDPVPTVRWRKDD 299 

: h ll'-l lh h llh: Ml I I II I :|| h: 

Db 196 GPPKFSDEAELKVLQESEEMLDLVFLMRPSSMIKVIGQSAVLPCVASGLPAPVIRWMKNE 255 

Qy 300 GEL---PKSRYEIRDDHTLKIRKVTAGDMGSYTCVAENMVGKAEASATLTVQEPPHFWK 356 
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I I : :hl II I hi MM II I Mil II h : 
Db 256 DVLDTESSGRLALLAGGSLEISDVTEDDAGTYFCVADNGNKTIEAQAELTVQVPPEFLRQ 315 

Qy 357 PRDQWALGRTVTFQCEATGNPQPAIFWRREGSQNLLFSYQPPQSSSRFSVSQTGDLTIT 416 

I : : Ml II I I : I : I : I I : : :l : 

Db 316 PANIYARESMDIVFECEVTGKPAPTVKWVKNGDWIPSDY-- FRIVKEHNLQVL 367 

Qy 417 NVQRSDVGYYICQTLNVAGSIIIRAYLEVTDVIADRPPPVIRQGPVNQTVAVDGTFVLSC 476 

: :ll 1:1 I I I: II -\ ■ I 
Db 368 GLVRSDEGFYQCIAENDVGNAQAGAQL — IILEHAP 401 

Qy 477 VATGSPVPTILWRRDGVLVSTQDSRIRQLENGVLQIRYARLGDTGRYTCIASTPSGE-AT 535 
> II 1:1: llll I: II : II II: I 

Db 402 -ATTGPLPSAPRDWASLVST RFIRL TWRTPASDPHGDNLT 441 

Qy 536 WSAYIEVQEFGVPVQPPRPTDPNLIPSAPSKPEVTDVSRNTVTLSWQPNLNSGATPTSYI 595 

:| : : II : I I I : :|| 
Db 442 YSVFYTRE-GVARERVENT SQPGEMQVT 468 

Qy 596 IEAFSHASGSSWQTVAENVRTETSA1RGLKPNAIYLFLVRAANAYGISDPSQISDPVKTQ 655 
I: I I :|:l I I I :| I I hi 

« 469 IQNLMPATVYIFRVMAQNRHG ■ SGESSAPLRVETQ 502 
656 DVLPTSQGVDHKQVQRELGNAV1HLHNPTVLSSSSIEVHW-TVDQQSQYIQGYRILYRPS 714 
: |: I : :|| HIM: II II: I 
Db 503 PEV QLPGPAPNIRAYATSPT S ITVTWETPLSGNGE IQNYKLYYMEK 548 

Qy 715 GANHGESDWLVFEVRTPMNSWIPDLRKGVNYEIKARPFFNEFQGADSEIKFAKTLEEA 774 

I : I I I : :| I 1:1 I : : I :: :|| : 
Db 549 GTDK-EQDVDV SSHSYTINGLKKYTEYSFRWAYNKHGPGVSTQDVAVRTLSDV 601 

Qy 775 PSAPPQGVTVSKNDGNGTAILVSWQPPPEDTQNGMVQEYKVWCLGNETRYHINKT-VDGS 833 

III II ::: I :|:: llll llll : l|: : : :| I h 
Db 602 PSAAPQNLSLEVR-NSRSIVIHWQPPSSATQNGQITGYKIRYRKASRKSDVTETWTGT 659 

Qy 834 T FS WI PPLVPG I RYS VEVAASTGAGSGVKSE PQFIQLDAHGNP 877 

I :| I I I: III I hi I: : I I 

Db 660 QLSQLIEGLDRGTEYNFRVAALTVNGTGPATDWLSAETFESDLDESRVPE-VPSSLHVRP 718 

Qy 878 V SPEDQVSLAQQISDWRQPAFIAGIGAACWIILMVFSIWLYRHRKKRNGL 928 

: 11:1 : II: I III: : I hi 
Db 719 LVTSIWSWTPPENQ NIWRGYAIGYGIGSPHAQTIRVD — YRQR 761 

Qy 929 TSTYAGIRKV-PSFTFTPTV-TYQRGGEAV — SSGGRP GLLNISEP-AAQP 974 

I I : II : I: : II : h II I h I I 

Db 762 - - - Y YT I ENLDPSSHYV ITLKAFNNVGEG I PLYESAVT RPHTDT SEVDLFVI NAPYTPVP 818 

Qy 975 WLADTWPNTG NMNDCSISCCTAGNGNSDSNLTTYSRPADCIANYNNQLDNRQTN 1029 

I I :h h :h:| : : I : I :|| 

C819 DPTPMMPPVGVQAS ILSHDT IRITW ADNSLPKHQKITD* "SRYYTV- -RWKTN 867 
1030 LMLPESTVYGDVDLSNRINEMKTFNSPN LRDGRFVNPSGQPTPYATTQLIQS 1081 
:| :| I ; : : :: ; I II : || : II :|: : 

Db 868 - - IPANTKYKNAN-ATTLSYLVTGLKPNTLYEFSVMVTKGRRSSTWSMTAHGATFELVPT 924 

Qy 1082 NLSNNMNNGSGDSGEK — HWKPLGQQKQEVAP-VQYNIVEQMKLMKDYRANDTVPPTI 1136 

: :: I : : :|:| : :: : I : I h I : 
Db 925 SPPKDVTWSKEGKPRTIIVNWQPPSEANGKITGYIIYYSTDVNAEIHDWVIEPWGNRL 984 

Qy 1137 PYN-QSYDQNTGGSYNSSDRGSSTSGSQGHKKGARTPRVPRQGGMNWADLLPPPPA — 1191 

: I :l : II I llll :| M I 

Db 985 THQIQELTLDTPYYFKIOARNSKGMGPMSEAVQFRTPKADS SDKMPNDQALGSA 1038 

Qy 1492 HPPPHSNSEEYNISVDESYDQEMPCPVPPA RMYLQQ 1227 

: II I I : I II:: :: : 

Db 1039 GRGGRLPDLGSlfRPPMSGSNSPHGSPTSPLDSNMLLVIIVSIGVITIVVVVIIAVFCTR 1098

Qy 1228 DELEEEEDERGPTPPVRGAASSPAAVSYSHQSTATLTPSPQEELQPMLQDCPEETGHMQH 1287 

:: :| I I h :: I : I 
Db 1099 RTTSHQKRRRAACRSVNG SHRYKGNCKDVRPPDLWI H 1135 

Qy 1288 QPDRRRQPVSPPPPPRPISPPHTYGYISGPLVSDMDTDAPEEEEDEADME- ■ -VAKMQTR 1344 

:|: III: II I =1 :: : : I 



Db 1136 HERLELKPIDKSPDPNPVM" 



■ -TDTPIPRNSQDITPVDNSMDSNIHQR 1180 



Qy 1345 RLLLRGLEQTPASSVGDLESSVTGSMINGWGSASEEDNISSGRSS- - -VSSSDGSFFTDA 1401 

I If I ;: I : |: : | : :: : |: :|| : || 
Db 1181 RNSYRGHESEDSMSTLAGRRGMRPRMMMPFDSQPPQQSVRNTPSTDTMPASSSQTCCTDH 1240 

Qy 1402 DFAQAVAAAAEYAGLKVARRQMQDAAGRRHFHASQCPRPTSPVSTDSNMSAAVMQKTRPA 1461 

: I :: I :| I :| I lh I lh 
Db 1241 QDPEG - ATSS5Y LASSQEED SGQSLPTAHV RPS 1272 

Qy 1462 KRLRHQPGHLRRETYTDDLPPPPVPP— PAIKSPTAQSKTQLE 1502 

II '. :: III II lh I h I 
Db 1273 HPLK — - -SFAVPAIPPPGPPIYDPALPSTPLLSQQALNHHLHSVRIASIGTLGR 1323 

Qy 1503 —VRPVWPKLPSMDARTDRSSDRRGSSYKGREV 1534 

Mill: 1:1 I : III: h 
Db 1324 SRPPMPWVPSAPEVQEATRMLEDSE ■ SSYEPDEL 1357 



RESULT 7 
CAML_RAT

ID CAML_RAT STANDARD; PRT; 1259 AA. 

AC Q05695; 

DT 01-FEB-1994 (Rel. 28, Created) 

DT 01-OCT-1994 (Rel. 30, Last sequence update) 

DT 01-OCT-2000 (Rel. 40, Last annotation update) 

DE NEURAL CELL ADHESION MOLECULE L1 PRECURSOR (N-CAM L1).

GN L1CAM OR CAML1.

OS Rattus norvegicus (Rat). 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Rattus. 

RN [1] 

RP SEQUENCE FROM N.A. 

RX MEDLINE-91372414; PubMed-1894011; 

RA Miura M., Kobayashi M., Asou H., Uyemura K.;

RT "Molecular cloning of cDNA encoding the rat neural cell adhesion

RT molecule L1. Two L1 isoforms in the cytoplasmic region are produced

RT by differential splicing.";

RL FEBS Lett. 289:91-95(1991). 

CC -!- FUNCTION: CELL ADHESION MOLECULE WITH AN IMPORTANT ROLE IN THE
CC DEVELOPMENT OF THE NERVOUS SYSTEM. INVOLVED IN NEURON-NEURON
CC ADHESION, NEURITE FASCICULATION, OUTGROWTH OF NEURITES, ETC. BINDS
CC TO AXONIN ON NEURONS.

CC -!- SUBCELLULAR LOCATION: TYPE I MEMBRANE PROTEIN.

CC -!- ALTERNATIVE PRODUCTS: TWO ISOFORMS IN THE CYTOPLASMIC REGION ARE

CC PRODUCED BY DIFFERENTIAL SPLICING.

CC -!- TISSUE SPECIFICITY: THE SHORTER ISOFORM IS PREDOMINANTLY FOUND IN
CC THE BRAIN, WHILE THE LONGER ISOFORM IS FOUND IN THE PERIPHERAL
CC NERVOUS SYSTEM.

CC -!- SIMILARITY: CONTAINS 6 IMMUNOGLOBULIN-LIKE C2-TYPE DOMAINS.

CC -!- SIMILARITY: CONTAINS 5 FIBRONECTIN TYPE III-LIKE DOMAINS.

cc : 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; X59149; CAA41860.1; -. 

DR PIR; S17655; S17655. 

DR HSSP; P20241; 1CFB. 

DR INTERPRO; IPR001777; -. 

DR INTERPRO; IPR003006; -. 

DR PFAM; PF00041; fn3; 4. 

DR PFAM; PF00047; ig; 6.

DR PRINTS; PR00014; FNTYPEIII.

KW Cell adhesion; Glycoprotein; Transmembrane; Repeat; Brain;

KW Immunoglobulin domain; Signal; Alternative splicing.

FT SIGNAL 1 19 BY SIMILARITY.

FT CHAIN 20 1259 NEURAL CELL ADHESION MOLECULE L1.

FT DOMAIN 20 1122 EXTRACELLULAR (POTENTIAL).






FT TRANSMEM

FT DOMAIN 

FT DOMAIN 

FT DOMAIN 

FT DOMAIN 

FT DOMAIN 

FT DOMAIN 

FT DOMAIN 

FT DOMAIN 

FT DOMAIN 

FT DOMAIN 

FT SITE 

FT SITE 

FT CARBOHYD 

FT CARBOHYD 

FT CARBOHYD 

FT CARBOHYD 

FT CARBOHYD
FT CARBOHYD
FT CARBOHYD

FT CARBOHYD 

FT CARBOHYD 

FT CARBOHYD 

FT CARBOHYD 

FT CARBOHYD 

FT CARBOHYD 

FT CARBOHYD 

FT CARBOHYD 

FT CARBOHYD 

FT CARBOHYD 

FT CARBOHYD 

FT CARBOHYD 

FT VARSPLIC 

SQ SEQUENCE



1123 1145 
1146 1259 



50 
150 
256 
346 
440 
531 
827 
932 



120 
215 
318 
410 
503 
599 
896 
994 



1032 1093 
553 555 



562 
100 
202 
246 
293 
432 
489 
504 
670 
725 
776 
824 



564 
100 
202 
246 
293 
432 
489 
504 
670 
725 
776 
824 



POTENTIAL. 

CYTOPLASMIC (POTENTIAL). 
IG-LIKE C2 -TYPE DOMAIN. 
IG-LIKE C2-TYPE DOMAIN. 
IG-LIKE C2-TYPE DOMAIN. 
IG-LIKE C2-TYPE DOMAIN. 
IG-LIKE C2-TYPE DOMAIN. 
IG-LIKE C2-TYPE DOMAIN. 
FIBRONECTIN TYPE-III. 
FIBRONECTIN TYPE-III. 
FIBRONECTIN TYPE-III. 
CELL ATTACHMENT SITE (POTENTIAL) . 
CELL ATTACHMENT SITE (POTENTIAL), 



875 875 

968 968 

978 978 

1021 1021 i 

1029 1029 

1072 1072 i 

1106 1106 v 

1179 1182 

1259 AA; 14093 ■ 



Query Match 8.5%;
Best Local Similarity 23.7%;
Matches 273; Conservative 



N-LINKED (GLCNAC. 

N-LINKED (GLCNAC. 

N-LINKED (GLCNAC. 

N-LINKED (GLCNAC. 

N-LINKED (GLCNAC. 

N-LINKED (GLCNAC. 

N-LINKED (GLCNAC, 

N-LINKED (GLCNAC, 

N-LINKED (GLCNAC, 

N-LINKED (GLCNAC. 

N-LINKED (GLCNAC. 

N-LINKED (GLCNAC, 

N-LINKED (GLCNAC. 

N-LINKED (GLCNAC. 

N-LINKED (GLCNAC, 

N-LINKED (GLCNAC. 

N-LINKED (GLCNAC. 

N-LINKED (GLCNAC. 

N-LINKED (GLCNAC. 
MISSING (IN SHORT ISOFORM) 
MW; 19681B022D8F24AB CRC64 



(POTENTIAL). 
(POTENTIAL). 
(POTENTIAL). 
(POTENTIAL). 
(POTENTIAL). 
(POTENTIAL). 
(POTENTIAL), 
(POTENTIAL). 
(POTENTIAL). 
(POTENTIAL). 
(POTENTIAL), 
(POTENTIAL). 
(POTENTIAL). 
(POTENTIAL). 
(POTENTIAL). 
(POTENTIAL), 
(POTENTIAL). 
(POTENTIAL), 
(POTENTIAL). 



Score 739.5; DB 1; Length 1259; 
Pred. No. 7e-26; 
.59; Mismatches 435; Indels 287; Gaps 42; 



Qy 1 MKWKHVPFLVMISLLSLSPNHLFLAQLIPDPEDVERGNDHGTPIPTSDNDDNSLGYTGSR 60 

II :| |: I : III II 
Db ' .4 MLWYVLPLLLCSPCLLIQ IPDE YKGHH 30 

Qy 61 LRQEDFPPRIVEH-PSDLIVSKGEPATLNCKAEGRPTPTIEWYKGGERVETDK 112 

* : II I I I 1:1 : :| |:| III III;: 

■) 31 VLE- - -PPVITEQSPRRLWFPTDDISLKCEARGRPQVEFRWTKDGIHFKPKEELGVWH 87 

Qy 113 DDPRSHRMLLPSGSLFFLRIVHGRKSRPDEGVYVCVARNYLGEAVSHNASLEVAILRDDF 172 

: I I : : I I :|:| I I I II Ml I: :: : 

Db 88 EAPYSGSFTIEGNNSFAQRF QGIYRCYASNNLGTAMSH- - * -EIQLVAEGA 134 

Qy 173 RQNPSD — VMVAVGEPAVMECQPPRGHPEPTISWKKDGSPLDDKDERITI-RGGKLMI 227 

: I : I I II I: I II II :lll::: : I I 

Db 135 PKWPKETVKPVEVEEGESVVLPCNPPPSAAPLRIYWMNSKILHIKQDERVSMGQNGDLYF 194 

Qy 228 TYTRKSD-AGKYVCVGTNMVGER---ESEVAELTVLERPSFVKRPSNLAVTVDDSAE— 280 

II 1:1 : I I : I :[ I ; I : |: 

Db 195 ANVLTSDNHSDYIC-NAHFPGTRTIIQKEPIDLRVKPTNSMIDRKPRLLFPTNSSSHLVA 253 

Qy 281 FKCEARGDPVPTVRWRKDDGELPKSRYEIRDDH-TLKIRKVTAGDMGSYTCV 331 

:| I I I I:: :| I I :| II:: I I I III: 
Db 254 LQGQSLILECIAEGFPTPTIKWLHPSDPMPTDRV-IYQNHNKTLQLLNVGEEDDGEYTCL 312 

Qy 332 AENMVGKAEASATLTVQEPPHFWKPRDQWALGRTVTFQCEATGNPQPAIFWRREGSQN 391 

' III :l I : :ll: I::: II: : II |: I Ml : II I 
Db 313 AENSI/3SARHAYYVTVEMPYWIiQKPQSHLYGPGETARU)CQVQGRPQPEVTWRING- - - 369 

Qy 392 LLFSYQPPQSSSRFSVSQTGDLTITNVQRSDVGYYICQTLNVAGSIITKAYLEVTDVIAD 451 

I : :: : II I -III II |: I I :: lh I : 
Db 370 -MSIEKVNKDQKYRIEQ-GSLILSNVQPSDTMVTQCEARNQHGLLLANAYIYWQL— 423 

Qy 452 RPPPVIRQGPVNQT-VAVDG-TFVLSCVATGSPVPTILWRKDGVLVSTQDSRIKQLENGV 509 



I :: : III : M = I I I I I hllh: I : III II 
424 -PARILTKD- - NQTYMAVEGSTAYLLCKAFGAPVPSVQWLDEEGTTVLQDERFFPYANGH 4 8 0 

510 LQIRYAKLGDTGRYTCIASTPSGEATWSAYIEVQEFGVPVQPPRPT 555 

I II : Hill II: II ::|:| I II I 
481 LGIRDLQANDTGRYFCQAANDQNNVTILANLQVKEATQITQGPRSTIEKKGARVTFTCQA 540 



556 --DPNL 559 

11:1 .'. 

541 SFDPSLQASITWRGDGRDLQERGDSDKYFIEDGQLVIKSLDYSDQGDYSCVASTELDEVE 600 

560 IPSAPSKPEVTD- - •VSRNTVTLSWQPNLNSGATPTSYIIE-AFSHASGSS 606 

J'. I::| : :: I III I : : III : 
601 SRAQLLWGSPGPVPHLELSDRHLLKQSQVHLSWSPAEDHNSPIEKYDIEFEDKEMAPEK 660 

607 WQTVAENVKTETSAIKGLKPNAIYLFLVRAANAYGISDPSQISDPVKTQDVLPTSQGVDH 666 

I :: : :|| II I I I I I 1 1 ; 1 1 : 1 : I I : I II 
661 WFSLGKVPGNQTSTTLKLSPYVHYTFRVTAINKYGPGEPSPVSETWTPEAAPEKNPVDV 720 

667 KQVQRELGNAVLHLHNPTVLSSSSIEVHWTVDQQSQYIQGYKILYRPSGANHGESDWLVF 726 

: III: : : I :| : II I:: :M I : I 

721 RGEGNETNNMVI TWKPLRW-MDWNAPQIQ- YRVQWRPLGK- - -QETW- - - 762 

727 EVRTPAKNSWIPDLRKGVNYEIKARPFFNEFQGADSEIKFAKTLEEAPSAPP-QGVTV 784 

: :l : :]: : I Mil : |: :| : :: : |: I I : :|: 
763 KEQTVSDPFLWSNTSTFVPYEIKVQAVNNQGKGPEPQVTIGYSGEDYPQVSPELEDITI 822 

785 SKNDGNGTAILVSWQPPPEDTQNGMVQEYKV-WCLGNETRY---HINKT--VDGSTFS 836 

I : :|,l |:| I " I I I I" :: |::|: I :| I 

823 F NSSTVLVRWRPVDLAQVKGHLRGYNVTYWWKGSQRKHSKRHVHKSHMWPANTTS 878 



837 WIPFLVPGIRYSVEVAASTGAGSGVKSE PQFIQLDAHG 

:: I I 'I III I I I I II I: : h 

879 AILSGLRPYSSYHVEVQAFNGRGLGPASEWTFSTPEGVPGHPEALHLECQSDTSLLLHWQ 



875 



Qy 876 - NPVSPEDQVSLAQQISDWKQPAFIAGIGAACWIILMVFSIWLY 919 

:|: I : I :ll 

Db 939 PPLSHNGVLTGYLLSYHPLDGESKEQLFFNLSD 971 

Qy 920 RHRKKRNGLTSTYAGIRKVPSFTF-TPTVTYQRGGEAV SSGGRPGLLNISEPA 971 

:.: II: :: : I |:| III: : |:| III I 
Db 972 -PELRTHNLTNLNPDLQ— -YRFQLQATTHQGPGEAIVREGGTMALFGKPDFGNISVTA 1026 

Qy 972 AQPWLADTW-PNTG 984 

: : :| I I 
Db 1027 GENYSWSWVPREG 1040 



RESULT 8
NEO1_HUMAN

NEO1_HUMAN STANDARD; PRT; 1461 AA.
Q92859; O00340;
01-OCT-2000 (Rel. 40, Created)
01-OCT-2000 (Rel. 40, Last sequence update)
01-OCT-2000 (Rel. 40, Last annotation update)



NEO1 OR NGN.

Homo sapiens (Human). 

Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 
Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
[1] 

SEQUENCE FROM N.A. (ISOFORMS 1 AND 2). 

TISSUE=FETAL BRAIN;

MEDLINE-97236653; PubMed-9121761; 

Meyerhardt J.A., Look A.T., Bigner S.H., Fearon E.R.;

"Identification and characterization of neogenin, a DCC-related

gene.";

Oncogene 14:1129-1136(1997). 
[2] 

SEQUENCE FROM N.A. (ISOFORMS 1 AND 2). 
TISSUE=FETAL BRAIN;
MEDLINE-97312699; PubMed-9169140; 

Vielmetter J., Chen X.-N., Miskevich F., Lane R.P., Yamakawa K.,
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Korenberg J.R., Dreyer W.J.; 

"Molecular characterization of human neogenin, a DCC-related protein,
and the mapping of its gene (NEO1) to chromosomal position 15q22.3-
q23.";
Genomics 41:414-421(1997).

-!- FUNCTION: MAY BE INVOLVED AS A REGULATORY PROTEIN IN THE
TRANSITION OF UNDIFFERENTIATED PROLIFERATING CELLS TO THEIR
DIFFERENTIATED STATE. MAY ALSO FUNCTION AS A CELL ADHESION
MOLECULE IN A BROAD SPECTRUM OF EMBRYONIC AND ADULT TISSUES.

-!- SUBCELLULAR LOCATION: TYPE I INTEGRAL MEMBRANE PROTEIN.

-!- ALTERNATIVE PRODUCTS: AT LEAST 2 ISOFORMS; 1 (SHOWN HERE) AND 2;
ARE PRODUCED BY ALTERNATIVE SPLICING.

-!- TISSUE SPECIFICITY: WIDELY EXPRESSED AND ALSO IN CANCER CELL
LINES.

-!- SIMILARITY: CONTAINS 4 IMMUNOGLOBULIN-LIKE C2-TYPE DOMAINS.
-!- SIMILARITY: CONTAINS 6 FIBRONECTIN TYPE III-LIKE DOMAINS.
-!- SIMILARITY: BELONGS TO THE IMMUNOGLOBULIN SUPERFAMILY. STRONG, TO
TUMOR SUPPRESSOR PROTEIN DCC.



This SWISS-PROT entry is copyright. It is produced through a collaboration
between the Swiss Institute of Bioinformatics and the EMBL outstation -
the European Bioinformatics Institute. There are no restrictions on its
use by non-profit institutions as long as its content is in no way
modified and this statement is not removed. Usage by and for commercial
entities requires a license agreement (See http://www.isb-sib.ch/announce/
or send an email to license@isb-sib.ch).



EMBL; U61262; AAB17263.1, 
EMBL; 572391; AAC51287.1, 
MIM; 601907; -. 
HSSP; P02751; 1TTG. 
INTERPRO; IPR001777; -.
INTERPRO; IPR003006; -.
PFAM; PF00041; fn3; 6.
PFAM; PF00047; ig; 4.
PRINTS; PR00014; FNTYPEIII.

Transmembrane; Immunoglobulin domain; Glycoprotein; Signal;
Alternative splicing.
SIGNAL        1     33   POTENTIAL.
CHAIN        34   1461   NEOGENIN.
DOMAIN       34   1105   EXTRACELLULAR (POTENTIAL).
TRANSMEM   1106   1126   POTENTIAL.
DOMAIN     1127   1461   CYTOPLASMIC (POTENTIAL).
DOMAIN       67    136   IG-LIKE C2-TYPE DOMAIN.
DOMAIN      166    228   IG-LIKE C2-TYPE DOMAIN.
DOMAIN      263    327   IG-LIKE C2-TYPE DOMAIN.
DOMAIN      355    417   IG-LIKE C2-TYPE DOMAIN.
DOMAIN      436    533   FIBRONECTIN TYPE-III.
DOMAIN      536    629   FIBRONECTIN TYPE-III.
DOMAIN      630    729   FIBRONECTIN TYPE-III.
DOMAIN      735    829   FIBRONECTIN TYPE-III.
DOMAIN      850    950   FIBRONECTIN TYPE-III.
DOMAIN      951   1052   FIBRONECTIN TYPE-III.
DOMAIN     1118   1121   POLY-VAL.
DISULFID     74    129   BY SIMILARITY.
DISULFID    173    221   BY SIMILARITY.
DISULFID    270    320   BY SIMILARITY.
DISULFID    362    410   BY SIMILARITY.
CARBOHYD     73     73   N-LINKED (GLCNAC...) (POTENTIAL).
CARBOHYD    210    210   N-LINKED (GLCNAC...) (POTENTIAL).
CARBOHYD    326    326   N-LINKED (GLCNAC...) (POTENTIAL).
CARBOHYD    470    470   N-LINKED (GLCNAC...) (POTENTIAL).
CARBOHYD    489    489   N-LINKED (GLCNAC...) (POTENTIAL).
CARBOHYD    639    639   N-LINKED (GLCNAC...) (POTENTIAL).
CARBOHYD    715    715   N-LINKED (GLCNAC...) (POTENTIAL).
CARBOHYD    909    909   N-LINKED (GLCNAC...) (POTENTIAL).
VARSPLIC   1248   1300   MISSING (IN ISOFORM 2).
CONFLICT    168    168   G -> N (IN REF. 2).
1461 AA; 159958 MW; 7AAE897E69635A21 CRC64;



Query Match 8.3%; Score 725.5; DB 1; Length 1461;

Best Local Similarity 22.4%; Pred. No. 3.6e-25;



Matches 370; Conservative 



Mismatches 619; Indels 457; Gaps 



Qy 70 IVEHPSDLIVSKGEPATLNCKAEGRPTPTIEWYKGGERVETDKDDPRSHRMLLPSGSLFF 129

:|| I I : :l III I 1:1 Mill: II I III Nil 
Db 56 LVE-PVDTLSVRGSSVILNCSAYSEPSPKIEWKKDGTFLNLVSDD- • -RRQLLPDGSLFI 111 

Qy 130 LRIVHGRKSRPDEGVYVCVAR-NYLGEAVSHNASLEVAILRDDFRQNPSDVMVAVGEPAV 188 

=11 : ::llll I III II :| I I II I I I III: 
Db 112 SNWHSKHNKPDEGYYQCVATVESLGTIISRTAKLIVAGL-PRFTSQPEPSSVYAGNGAI 170 

Qy 189 MECQPPRGHPEPTISWKKDGSPLDDKDERITIRGGKLMITYTRKSDAGKYVCVGTNMVGE 248 

: I: I : I::: II I I : I hi: : I II II : 
Db 171 LNCE-VNADLVPFVRWEQNRQPLLLDDRVIKLPSGMLVISNATEGDGGLYRCWESGGPP 229 

Qy 249 RESEVAELTVLERPS FVKRPSNLAVTVDDSAEFKCEARGDPVPTVRWRKDDGEL 302 

: I: II II I hhll I : I I I I ll::l I:: I 
Db 230 KYSDEVELKVLPDPEVISDLVFLKQPSPLVRVIGQDWLPCVASGLPTPTIKWMKNEEAL 289 

Qy 303 — PKSRYEIRDDHTLKIRKVTAGDMGSYTCVAENMVGKAEASATLTVQEPPHFWKPRD 359 

I : :|:| II I |:| |:|:| II I INI I |: :| : 
Db 290 DTESSERLVLLAGGSLEISDVTEDDAGTYFCIADNGNETIEAQAELTVQAQPEFLKQPTN 349 

Qy 360 QWALGRTVTFQCEATGNPQPAIFWRREGSQNLLFSYQPPQSSSRFSVSQTGDLTITNVQ 419 

: hll III I : I : I : I I : : :| : : 

Db 350 IYAHESMDIVFECEVTGKPTPTVKWVKNGDMVIPSDY FKIVKEHNLQVLGLV 401 

Qy 420 RSDVGYYICQTLNVAGSIITKAYLEVTDVIADRPPPVIRQGPVNQTVAVDGTFVLSCVAT 479 

:|l hi I I I: I I :| : I II 
Db 402 KSDEGFYQCIAENDVGNAQAGAQL IILEHAP AT 434 

Qy 480 GSPVPTILWRKDGVLVSTQDSRIKQLENGVLQIRYAKLGDTGRYTCIASTPSGE'ATWSA 538 

:|: Mil I: II : II I |: |:| 
Db 435 TGPLPSAPRDWASLVST RFIKL TWRTPASDPHGDNLTYSV 475 

Qy 539 YIEVQEFGVPVQPPRPTDPNLIPSAPSKPEVTDVSRNTVTLSWQPNLNSGATPTSYIIEA 598 

: : I: : I I I : :|| 
Db 476 FYTKE--GIARERVENT SHPGEMQVT 499 

Qy 599 FSHASGSSWQTVAENVKTETSAIKGLKPNAIYLFLVRAANAYGISDPSQISDPVKTQDVL 658 

I: II :hl I I I :| I I Ml : 
Db 500 - IQNLMPATVYIFRVMAQNKHG - SGESSAPLRVETQPEV 536 

Qy 659 PTSQGVDHKQVQRELGNAVLHLHNPTVLSSSSIEVHW-TVDQQSQYIQGYKILYRPSGAN 717 

I: I : :|| II I I I : II II: I I : 
Db 537 QLPGPAPNLRAYAASPT SITVTWETPVSGNGEIQNYKLYYMEKGTD 582 

Qy 718 HGESDWLVFEVRTPAKNSWIPDLRKGVNYEIKARPFFNEFQGADSEIKFAKTLEEAPSA 777 

ll' 'I • : :| I hi ,1 : : I : :|| : III 
Db 583 K-EQDVDV-----SSHSYTINGLKKYTEYSFRVVAYNKHGPGVSTPDVAVRTLSDVPSA 635 

Qy 778 PPQGVTVSKNDGNGTAILVSWQPPPEDTQNGMVQEYKVWCLGNETRYHINKT-VDGSTFS 836 

II ::: •' I :|:: llll Mil : lh : : :l I h I 
Db 636 APQNLSLEVR--NSKSIMIHWQPPAPATQNGQITGYKIRYRKASRKSDVTETLVSGTQLS 693 

Qy 837 WIPFLVPGIRYSVEVAASTGAGSGVKSE PQFIQLDAHGNPV-- 878 

:| I h h III I hi :: h : I h 

Db 694 QLIEGLDRGTEYNFRVAALTINGTGPATDWLSAETFESDLDETRVPE-VPSSLHVRPLVT 752 

Qy 879 SPEDQVSLAQQISDWKQPAFIAGIGAACWIILMVFSIWLYRHRKKRNGLTST 931 

Ih'l : lh I III: : I hi 
Db 753 SIWSWTPPENQ NIWRGYAIGYGIGSPHAQTIKVD YKQR 792 

Qy 932 YAGIRKV-PSFTFTPTV-TYQRGGEAV--SSGGRP GLLNISEP-AAQPWLA 977 

I I : II : I: : II : h II I h I I 

Db 793 YYTIENLDPSSHYVITLKAFNNVGEGIPLYESAVTRPHTDTSEVDLFVINAPYTPVPDPT 852 

Qy 978 DTWPNTG- - - - -NNHNDCSISCCTAGNGNSDSNLTTYSRPADCIANYNNQLDNKQTNLML 1032 

II . :h • h :|::| : : I : I :|| : 

Db 853 PMMPPVGVQASILSHDTIRITW ADNSLPKHQKITD--SRYYTV-RWKTN--I 899 

Qy 1033 PESTVYGDVDLSN KINEMKTFNSPNLKDGRFVNPSGQPTPYATT-QLIQSNL 1083 

1:11::: I I : h I I : : I : II :h :: 

Db 900 PANTKYKNANATTLSYLVTGLKPNTLYEFSVMVTKGRR-SSTWSMTAHGTTFELVPTSP 957 






Qy 1084 SNNMNNGSGDSGEK HWKPLGQQKQEV APVQYNIVEQ—NRLNK 1124 

:: I : I :|:j : :: I : ::| |:| 

[Db 958 PKDVTVVSKEGRPKTIIVNWQfrSEMGKITGYIIYYSTDVNAEIHDWVIEPVVGNRLI- 1016 

Qy 1125 DYRANDTVPPTIPYNQSYDQn|gG SYNSSDR- "GSSTSGSQGHKKGA 1169 

: I I : :l| I :|||: III I ||: 

Db 1017 -HQIQELTLDTPYYFKIQARNSKGMGPMSEAVQFRTPKADSSDKMPNDQASGSGG--RGS 1073 

Qy 1170 rtpkvpkqggmnwadllppppAhppphsnseeynisvdesydqempcpvppa 1221 

II: :| II II I .11:: 

Db 1074 RLPDL GSDYKPPMSGSNSPHG SPTSPLDSNMLLVIIVSVGVITIW 1119 



Qy 1222 — RMYLQQDElEEEEDERGfTPPVRGAASSPAAVSYSHQSTATLTPSPQEELQPMLQD 1277 
II I: :: : I 
1120 WIIAVFCTRRTTSHQKKKRAj iCKSVNG SHKYKGNSRDVRPPDLWI 1165 

«1278 CPEETGHMQHQPDRRRQPVS1 
1166 HHERLELKPID1 



'PPPRPISPPHTYGYISGPLVSDMDTDAPEEEEDEADME 1337 
Mill II: 
>K$PDPNPIM TDTPIPRNSQD 1196 



Qy 1338 VAKMQTRRLLLRGLEQTPASS fGDLESSVTGSMINGWGSASEEDNIS- -SGRSSVSSSDG 1395 
: " : l:| :| I ll::| =11 : 

•-ESEDSMSTLAGRRGMRPKMM 1238 



1197 ■ 



■-ITPVDNSMDSH EQRRNSYRGH-- 



SFFTDADFAQAVAAAAEYAGI [VARRQMQDAAGRRHFHASQCPRPTS ■ 



1396 

II: I I :| 
1239 MPF-DSQPPQPVISAHPIHSll)' 



1446 TDSNMS-i 

I "I I : I 

1289 



AAVMQKTRPAKKLjHQPGHLRRET YTD ■ ■ 

I : II 



1503 

Mill : I 
1409 RPPMPVWPSAPEVQ - ETTRMjiEDSESSYEPDEL 1441 



RESULT 9 
NRCA_CHICK 
ID 



STANDARD; 



-PVS 1445 
II: | |: 
NPHHHFHSSSLASPARSHLYHPGSPWPIG 1288 



1481 P PPfVPP- - -PAIKSPTAQSKTQLE-- 

II II: I I: I 
1349 PTAHVRPSHPLKSFAVPAIPE 'GPPIYDPALPSTPLLSQQALNHHIHSVKTASIGTLGRS 



DL 1480 

I 

1348 
1502 
1408 



VRPVWPRLPSMDARTDRSpDRKGSSYRGREV 1534 

Ml: I: 



PRT; 1284 AA. 

NRCA_CHICK 

AC P35331; 

DT 01-FEB-1994 (Rel. 28, Created) 
DT 01-FEB-1994 (Rel. 28, Last sequence update) 

DT 15-JUL-1999 (Rel. 38, Last annotation update) 

DE NG-CAM RELATED CELL ADHESION MOLECULE PRECURSOR (NR-CAM) (BRAVO). 

OS Gallus gallus (Chicken) . 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Archosauria; Aves; Neognathae; Galliformes; Phasianidae; Phasianinae; 

OC Gallus. 

RN [1] 

RP SEQUENCE FROM N.A., AND SEQUENCE OF 25-52; 178-184 AND 581-594. 

RC STRAIN=WHITE LEGHORN; TISSUE=EMBRYONIC BRAIN; 

RX MEDLINE=91258407; PubMed=2045418; 

RA Grumet M., Mauro V., Burgoon M.P., Edelman G.M., Cunningham B.A.; 

RT "Structure of a new nervous system glycoprotein, Nr-CAM, and its 

RT relationship to subgroups of neural cell adhesion molecules."; 

RL J. Cell Biol. 113:1399-1412(1991). 

RN [2] 

RP SEQUENCE OF 25-1284 FROM N.A., AND PARTIAL SEQUENCE. 

RC TISSUE=EMBRYONIC BRAIN, AND RETINA; 

RX MEDLINE=92381110; PubMed=1512296; 

RA Kayyem J.F., Roman J.M., de la Rosa E.J., Schwarz U., Dreyer W.J.; 

RT "Bravo/Nr-CAM is closely related to the cell adhesion molecules L1 

RT and Ng-CAM and has a similar heterodimer structure."; 

RL J. Cell Biol. 118:1259-1270(1992). 

CC -!- FUNCTION: THIS PROTEIN IS A CELL ADHESION MOLECULE INVOLVED IN 
CC NEURON-NEURON ADHESION, NEURITE FASCICULATION, OUTGROWTH OF 
CC NEURITES, ETC. SPECIFICALLY INVOLVED IN THE DEVELOPMENT OF OPTIC 
CC FIBRES IN THE RETINA. 
CC -!- SUBUNIT: HETERODIMER, COMPOSED OF AN ALPHA AND A BETA CHAIN. 
CC -!- SUBCELLULAR LOCATION: TYPE I MEMBRANE PROTEIN. 
CC -!- ALTERNATIVE PRODUCTS: AT LEAST 5 ISOFORMS ARE PRODUCED BY 
CC ALTERNATIVE SPLICING. 
CC -!- TISSUE SPECIFICITY: RETINA AND DEVELOPING BRAIN. 
CC -!- DEVELOPMENTAL STAGE: EXPRESSED IN DEVELOPING NEURAL RETINA AND 
CC EMBRYONIC BRAIN TISSUE. 
CC -!- SIMILARITY: CONTAINS 6 IMMUNOGLOBULIN-LIKE C2-TYPE DOMAINS. 
CC -!- SIMILARITY: CONTAINS 5 FIBRONECTIN TYPE III-LIKE DOMAINS. 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 
CC between the Swiss Institute of Bioinformatics and the EMBL outstation - 
CC the European Bioinformatics Institute. There are no restrictions on its 
CC use by non-profit institutions as long as its content is in no way 
CC modified and this statement is not removed. Usage by and for commercial 
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 
CC or send an email to license@isb-sib.ch). 
DR EMBL; X58482; CAA41391.1; -. 
DR EMBL; L08960; AAA48632.1; -. 
DR HSSP; P20241; 1CFB. 
DR INTERPRO; IPR001777; -. 
DR INTERPRO; IPR003006; -. 
DR PFAM; PF00041; fn3; 5. 
DR PFAM; PF00047; ig; 6. 
DR PRINTS; PR00014; FNTYPEIII. 
KW Immunoglobulin domain; Glycoprotein; Signal; Cell adhesion; Repeat; 
KW Transmembrane; Alternative splicing. 


FT SIGNAL 1 24 
FT CHAIN 25 1284 NG-CAM RELATED CELL ADHESION MOLECULE. 
FT DOMAIN 25 1143 EXTRACELLULAR (POTENTIAL). 
FT TRANSMEM 1144 1166 POTENTIAL. 
FT DOMAIN 1167 1284 CYTOPLASMIC (POTENTIAL). 
FT DOMAIN 56 125 IG-LIKE C2-TYPE DOMAIN. 
FT DOMAIN 155 220 IG-LIKE C2-TYPE DOMAIN. 
FT DOMAIN 261 323 IG-LIKE C2-TYPE DOMAIN. 
FT DOMAIN 351 415 IG-LIKE C2-TYPE DOMAIN. 
FT DOMAIN 445 508 IG-LIKE C2-TYPE DOMAIN. 
FT DOMAIN 536 599 IG-LIKE C2-TYPE DOMAIN. 
FT DOMAIN 638 699 FIBRONECTIN TYPE-III. 
FT DOMAIN 738 799 FIBRONECTIN TYPE-III. 
FT DOMAIN 837 906 FIBRONECTIN TYPE-III. 
FT DOMAIN 943 1006 FIBRONECTIN TYPE-III. 
FT DOMAIN 1057 1114 FIBRONECTIN TYPE-III. 
FT DISULFID 63 118 POTENTIAL. 
FT DISULFID 162 213 POTENTIAL. 
FT DISULFID 268 316 POTENTIAL. 
FT DISULFID 358 408 POTENTIAL. 
FT DISULFID 452 501 POTENTIAL. 
FT DISULFID 543 592 POTENTIAL. 
FT CARBOHYD 78 78 N-LINKED (GLCNAC...) (POTENTIAL). 
FT CARBOHYD 218 218 N-LINKED (GLCNAC...) (POTENTIAL). 
FT CARBOHYD 290 290 N-LINKED (GLCNAC...) (POTENTIAL). 
FT CARBOHYD 409 409 N-LINKED (GLCNAC...) (POTENTIAL). 
FT CARBOHYD 483 483 N-LINKED (GLCNAC...) (POTENTIAL). 
FT CARBOHYD 576 576 N-LINKED (GLCNAC...) (POTENTIAL). 
FT CARBOHYD 581 581 N-LINKED (GLCNAC...) (POTENTIAL). 
FT CARBOHYD 595 595 N-LINKED (GLCNAC...) (POTENTIAL). 
FT CARBOHYD 692 692 N-LINKED (GLCNAC...) (POTENTIAL). 
FT CARBOHYD 778 778 N-LINKED (GLCNAC...) (POTENTIAL). 
FT CARBOHYD 834 834 N-LINKED (GLCNAC...) (POTENTIAL). 
FT CARBOHYD 885 885 N-LINKED (GLCNAC...) (POTENTIAL). 
FT CARBOHYD 969 969 N-LINKED (GLCNAC...) (POTENTIAL). 
FT CARBOHYD 985 985 N-LINKED (GLCNAC...) (POTENTIAL). 
FT CARBOHYD 995 995 N-LINKED (GLCNAC...) (POTENTIAL). 
FT CARBOHYD 1048 1048 N-LINKED (GLCNAC...) (POTENTIAL). 
FT CARBOHYD 1059 1059 N-LINKED (GLCNAC...) (POTENTIAL). 
FT CARBOHYD 1091 1091 N-LINKED (GLCNAC...) (POTENTIAL). 
FT VARSPLIC 612 621 MISSING (IN ISOFORM AS10). 
FT VARSPLIC 1027 1038 MISSING (IN ISOFORM AS12). 
FT VARSPLIC 1039 1131 MISSING (IN ISOFORM AS93). 
FT VARSPLIC 1202 1205 MISSING (IN ISOFORM AS-CYT2). 



FT CONFLICT 209 209 V -> E (IN REF. 2). 
FT CONFLICT 680 680 H -> Q (IN REF. 2). 

SQ SEQUENCE 1284 AA; 141851 MW; A3570BF9C3D47A0F CRC64; 



Query Match 8.0%; Score 701.5; DB 1; Length 1284; 

Best Local Similarity 23.2%; Pred. No. 3.5e-24; 

Matches 302; Conservative 188; Mismatches 513; Indels 299; Gaps 51; 
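
The summary counts above can be cross-checked with a little arithmetic. The sketch below assumes, as an inference from the printed numbers rather than from any documentation of the search program, that Best Local Similarity is the number of exact matches divided by the total number of alignment columns (matches + conservative + mismatches + indels). The function name is hypothetical.

```python
def best_local_similarity(matches, conservative, mismatches, indels):
    """Return (alignment columns, percent identity over those columns).

    Assumption: the percentage is matches / total columns, where the
    total is the sum of the four counts printed in the summary line.
    """
    columns = matches + conservative + mismatches + indels
    return columns, 100.0 * matches / columns


# Counts printed for RESULT 9 (NRCA_CHICK).
columns, pct = best_local_similarity(302, 188, 513, 299)
print(columns, round(pct, 1))
```

Under that assumption the RESULT 9 counts give 1302 columns and a value that rounds to the printed 23.2%. The same check can be applied to the other hits once their counts are read off the summary lines.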

Qy 59 SRLRQE-DFPPRIVEH-PSDLIVSRGEPATLNCRAEGRPTPTIEWYKGGERVETDKDDPR 116 

1:1 :| I : |:|:|:| h I : I : III 

Db 31 SRLLEELSQPPTITQQSPRDY VDPRENIVIQCEARGKPPPSFSWTRNGTHFDIDKD— 87 



fib 



Db 



SHRMLLPSGSLFFLRIVHGRKSRPDEGVTVCVARNYLGEAVSHNASLEVAILRDDFRQNP 



Qy 117 

Db 88 AQVTMKPNSGTLWNIMNGVKiEAyEGVYQCTARNERGAAISNNIVIRPSRSPLWTKEKL 



|::| I: 



Qy 177 SDVMVAVGEPAVMECQPPRGH EPTISWKKDGSPLDDKDERITIRG--GKLMITYTRRSD 234 

I : : II:: :| I I : : I 
Db 148 EPNHVREGDSLVLNCRPPVGIi PPIIFWMDNAFQRLPQSERVS-QGLNGDLYFSNVQPED 206 



235 AG - KYVCVG TNMVGERE 

1:1 I : 
207 TRVDYICYARFNHTQT IQQKQ 



Qy 282 KCEARGDPVPTVRWRKDDGEL RSRYEIRD-DHTLRIRRVTAGDMGSYTCVAENMVGRAE 340 

:| I M I ill h III :l : Mil h I hi M I M 
Db 267 ECIAAGLPTPVIRWIREGGEL1 ANRTFFENFRRTLRIIDVSEADSGNYRCTARNTLGSTH 326 



Qy 341 A 
Db 327 \ 



::lh I:::- 11= I* 



Qy 401 SSSRFSVSQTGDLTI-' 

II III: II 
Db 383 DPSR---1 



Qy 460 GPVNQ- 

I I: I I 

Db 435 -PANKLYQVIADSPALIDCAYlGSPKPEIEWFR' 



Qy 517 LGDTGRYTCIASTPSGEATWS YIEVQEFGVPVQPPR' 



II HIM I 
493 KDSTGTYTCVARNKLGKTQN 



^ 561 P-- 

I 

Db 553 E 



llll I III I 1:1:1 



L TVLERPSFVKRP SNLAVTVDDSAEF 281 

: Ml : I II : 
ISVKVFSTKPVTERPPVLLTPMGSTSNKVELRGNVLLL 266 



ASATLTVQEPPHFWRPRDQV ALGRTVTFQCEATGNPQPAIFWRREGSQNLLFSYQPPQ 



llll MM: | | 



HVI SVT VKAAPYWITAPRNLV SPGEDGTLICRANGNPRPSISWLTNGVPIAI- ■ 



400 

I: 

APE 382 



■TNVQR DVGYYICQTLNVAGSIITKAYLEVTDVIADRPPPVIRQ 459 
I I :: I:: :|:|: II :: 

NVLAE-PPRILT- 434 



KVDGDTIIFSAVQE SSAVYQCNASNEYGYLLANAFV* - 
TVAVDGTFVLSCVA GSPVPTILWRRDGVLVS- 



'TQDSRIKQLENGVLQIRYAK 516 
II I I I : II I : : Ml IM I: 

: - GVKGS ILRGNEYVFHDNGTLE I PVAQ 492 



■ PTDPNLI 560 

lh: : :: |: II II 

ILEVKDPTMIIKQPQYKVIQRSAQASFECVIKHDPTLI 552 



SA 563 

:| 

612 



PTVIWLKDNNELPDDERFLVG! DNLTIMNVTDRDDGTYTCIVNTTLDSVSASAVLTWAA 

564 PSRP EVTDVSRNTVTLSWQPNLNSGATPTSYIIEAFS'-HASGSSWQ 608 

I I IM :: llll : : MMI I I I 

613 PPTPAHYARPNPPLDLELTGQLERSIELSWVPGEENNSPITNFVIEYEDGLHEPG-VWH 671 

609 TVAENVRTETSAIRGLRPNAIYLFLVRAANAYGISDPSQISDPVRTQDVLPTSQGVDHRQ 668 

1:1: I I I I I I II Ml: I: I: I : : 
672 YQTEVPGSHTTVQLKLSPYVNYSFRVIAVNEIGRSQPSEPSEQYLTKSANPDENPSNVQG 731 

669 VQRELGNAVLHLHNPTVLSSSSIEVHWTVDQQSQYIQGYKILYRPSGANHGESDWLVFEV 728 

: I I I: : I: : 11= =1 : =1 
732 IGSEPDNLVITWESLRGFQSNGPGLQ YRVSWRQRDV— DDEW 771 

729 RTPAKNSWIPDLRK GVNYEIRARPFFNEFQGADSEIRFARTLEEAPSAPPQ 780 

III: :: I I llll : : : MM I 

772 TSWVANVSKYIVSGTPTFVPYEIKVQALNDLGYAPEPSEVIGHSGEDLPMVAPG 826 

781 GVTVSKNDGNGTAILVSWQPPPEDTQNGMVQEYKVW CLGNETRYHINK — TVDG 832 

M : II I I I I : I :| III: I :: |: I II 
827 NVQV--HVINSTLAKVHWDPVPLKSVRGHLQGYKVYYWKVQSLSRRSKRHVEKKILTFRG 884 



Qy 833 STFSWIPFLVPGIRYSVEVAASTGAG SGVKSEPQFIQ LDA- 873 

: ::| I I I M I I II I I M II: 

Db ,885 NKTFGMLPGLEPYSSYRLNVRVVNGKGEGPASPDKVFKTPEGVPSPPSFLKITNPTLDSL 944 



Qy 874 • • -HGNPVSPE — DQVSLAQQISDV VRQPAFI AG IGAACWI ILMVFS IWL 918 

1:1 l ; : I I:: :: II : Ml : : 

Db 945 TLEWGSPTHPNGVLTSYILKFQPINNTHELGPLVEIRIPANESS LILRNLN-YS 997 

Qy 919 YRHRKRRNGLTSTYAGIRRVPSFTFTPTVTYQRG--GEAVSSGG— -RPGLLNISEPAA 972 

I:: I II M : MM II M I : |:: II 

Db 998 TRYKFYFNAQTSVGSGSQITEE- • ■AVTIMDEAGILRPAVGAGRVQPLYPRIRNVTTAAA 1054 

Qy 973 QPWLADTWPNTGNNHNDCSISCCTAG NGNSDSNLTTYSRP A 1013 

: : :| M :| : : II II: : I I 

Db 1055 ETYANISWEYEGPDHANFYVEYGVAGSREDWRKEIVNGSRSFFVLKGLTPGTAYRVRVGA 1114 

Qy 1014 DCIANYNNQLDMQTNLMLPESTVYGDVDLSNKINEMRTFNSPNLKDGRFVNPSGQPTPY 1073 

: :: : : i M : II:: : ' I |: I 

Db 1115 EGLSGFRSSEDLFETGPAMASR — QVDIATQ GWFI---GLMCAV 1153 

Qy 1074 ATTQLIQSNLSNNMMGSGDSGERHWKPLGQQKQEVAPVQYNIVEQNRLNRDYRANDTVP 1133 

I II m II MM: :| |: : 

Db 1154 ALLILILLIVCFIRRNKGG RYPVREK— -EDAHADPEIQ 1189 

Qy 1134 PTIPYNQSYDQNTGGSYNSSDRGSSTSGSQGH— KRGARTP 1172 

I IM II I : MM IIIMII 
Db 1190 P MKEDDGTFGEYRSLE SDAEDHKPLKKGSRTP 1221 



RESULT 10 
DCC_MOUSE 
ID DCC_MOUSE STANDARD; PRT; 1447 AA. 
AC P70211; 
DT 01-NOV-1997 (Rel. 35, Created) 
DT 01-NOV-1997 (Rel. 35, Last sequence update) 
DT 30-MAY-2000 (Rel. 39, Last annotation update) 
DE TUMOR SUPPRESSOR PROTEIN DCC PRECURSOR. 
GN DCC. 
OS Mus musculus (Mouse). 
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus. 
RN [1] 
RP SEQUENCE FROM N.A. 
RC STRAIN=BALB/C; TISSUE=BRAIN; 
RX MEDLINE=96112625; PubMed=8570174; 
RA Cooper H.M., Armes P., Britto J., Gad J., Wilks A.F.; 
RT "Cloning of the mouse homologue of the deleted in colorectal cancer 
RT gene (mDCC) and its expression in the developing mouse embryo."; 
RL Oncogene 11:2243-2254(1995). 
RN [2] 
RP REVISIONS. 
RC STRAIN=BALB/C; TISSUE=BRAIN; 
RA Cooper H.M.; 
RL Submitted (JUN-1996) to the EMBL/GenBank/DDBJ databases. 
CC -!- FUNCTION: IMPLICATED AS A TUMOR SUPPRESSOR GENE. 
CC -!- SUBCELLULAR LOCATION: TYPE I MEMBRANE PROTEIN. 
CC -!- ALTERNATIVE PRODUCTS: TWO FORMS OF THE PROTEIN ARE PRODUCED FROM 
CC THE SAME GENE BY THE USE OF ALTERNATIVE INITIATION SITES. A THIRD 
CC FORM WHICH IS EXPRESSED ONLY IN THE EMBRYO IS PRODUCED BY 
CC ALTERNATIVE SPLICING. 
CC -!- TISSUE SPECIFICITY: IN THE EMBRYO, EXPRESSED AT HIGH LEVELS IN THE 
CC DEVELOPING BRAIN AND NEURAL TUBE. IN ADULT, HIGHLY EXPRESSED IN 
CC BRAIN WITH VERY LOW LEVELS FOUND IN TESTIS, HEART AND THYMUS. 
CC -!- DEVELOPMENTAL STAGE: LOW LEVELS IN EARLY GESTATION. HIGHEST LEVELS 
CC EXPRESSED DURING MID GESTATION. LEVELS DECREASE IN LATE GESTATION 
CC AND REMAIN AT THIS LEVEL IN THE ADULT. 
CC -!- SIMILARITY: CONTAINS 4 IMMUNOGLOBULIN-LIKE C2-TYPE DOMAINS. 
CC -!- SIMILARITY: CONTAINS 6 FIBRONECTIN TYPE III-LIKE DOMAINS. 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 
CC between the Swiss Institute of Bioinformatics and the EMBL outstation - 
CC the European Bioinformatics Institute. There are no restrictions on its 
CC use by non-profit institutions as long as its content is in no way 
CC modified and this statement is not removed. Usage by and for commercial 
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 
CC or send an email to license@isb-sib.ch). 
DR EMBL; X85788; CAA59786.1; -. 
DR HSSP; P56276; 1TLK. 
DR MGD; MGI:94869; DCC. 
DR INTERPRO; IPR001777; -. 
DR INTERPRO; IPR003006; -. 
DR PFAM; PF00041; fn3; 6. 
DR PFAM; PF00047; ig; 4. 
DR PRINTS; PR00014; FNTYPEIII. 
KW Glycoprotein; Immunoglobulin domain; Transmembrane; Signal; 
KW Anti-oncogene; Alternative initiation; Alternative splicing. 
FT SIGNAL 1 25 POTENTIAL. 
FT CHAIN 26 1447 TUMOR SUPPRESSOR PROTEIN DCC, LONG 
FT ISOFORM. 
FT CHAIN 85 1447 TUMOR SUPPRESSOR PROTEIN DCC, SHORT 
FT ISOFORM. 
FT INIT_MET 85 85 FOR SHORT ISOFORM. 
FT DOMAIN 26 1097 EXTRACELLULAR (POTENTIAL). 
FT TRANSMEM 1098 1122 POTENTIAL. 
FT DOMAIN 1123 1447 CYTOPLASMIC (POTENTIAL). 
FT DOMAIN 54 124 IG-LIKE C2-TYPE DOMAIN. 
FT DOMAIN 154 219 IG-LIKE C2-TYPE DOMAIN. 
FT DOMAIN 254 317 IG-LIKE C2-TYPE DOMAIN. 
FT DOMAIN 345 407 IG-LIKE C2-TYPE DOMAIN. 
FT DOMAIN 426 522 FIBRONECTIN TYPE-III. 
FT DOMAIN 525 618 FIBRONECTIN TYPE-III. 
FT DOMAIN 619 716 FIBRONECTIN TYPE-III. 
FT DOMAIN 722 816 FIBRONECTIN TYPE-III. 
FT DOMAIN 840 940 FIBRONECTIN TYPE-III. 
FT DOMAIN 941 1042 FIBRONECTIN TYPE-III. 
FT DISULFID 61 117 BY SIMILARITY. 
FT DISULFID 161 212 BY SIMILARITY. 
FT DISULFID 261 310 BY SIMILARITY. 
FT DISULFID 352 400 BY SIMILARITY. 
FT CARBOHYD 60 60 N-LINKED (GLCNAC...) (POTENTIAL). 
FT CARBOHYD 94 94 N-LINKED (GLCNAC...) (POTENTIAL). 
FT CARBOHYD 299 299 N-LINKED (GLCNAC...) (POTENTIAL). 
FT CARBOHYD 318 318 N-LINKED (GLCNAC...) (POTENTIAL). 
FT CARBOHYD 478 478 N-LINKED (GLCNAC...) (POTENTIAL). 
FT CARBOHYD 628 628 N-LINKED (GLCNAC...) (POTENTIAL). 
FT CARBOHYD 702 702 N-LINKED (GLCNAC...) (POTENTIAL). 
FT VARSPLIC 819 838 MISSING (IN EMBRYONIC ISOFORM). 
SQ SEQUENCE 1447 AA; 158298 MW; 0D1F1097C22D5B9F CRC64; 

Query Match 7.8%; Score 679.5; DB 1; Length 1447; 
Best Local Similarity 20.4%; Pred. No. 3.9e-23; 
Matches 346; Conservative 209; Mismatches 592; Indels 545; Gaps 63; 

Qy 71 VEHPSDLIVSKGEPATLNCKAEG-RPTPTIEWYKGGERVETDKDDPRSHRMLLPSGSLFF 129 
 1 1 1 : I II! II 1 I hi 1 I : II : Ihlll 
Db 43 VSEPSDAVTMRGGNVLLNCSAESDRGVPVIKWKKDGLILALGMDD-- RKQQLPNGSLLI 99 

Qy 130 LRIVHGRKSRPDEGVYVCVAR-NYLGEAVSHNASLEVA-ILRDDFRQNPSDVMVAVGEPA 187 
 1: 1 :MM:i | | 1 :| 1 : II II 1 : :|: 
Db 100 QNILHSRHHKPDEGLYQCEASLADSGSIISRTAKVTVAGPLR--FLSQTESITAFMGDTV 157 

Qy 188 VMECQPPRGHPEPTISWKK---DGSPLDDKDERITIRGGKLMITYTRKSDAGKYVCVGTN 244 
 1 III hi Ml : : | 1 !: : |:l | I 
Db 158 LLKCE-VIGEPMPTIHWQKNQQDLNPLPGDSRWVLPSGALQISRLQPGDSGVYRCSARN 216 

Qy 245 MVGERESEVAELTVLERPS----FVKRPSNLAVTVDDSAEFKCEARGDPVPTVRWRKD 298 
 1 : :| 1 |::IM|: 1 :| 1 1 |: 1 : 
Db 217 PASIRTGNEAEVRILSDPGLHRQLYFLQRPSNVIAIEGKDAVLECCVSGYPPPSFTWLRG 276 

Qy 299 DG--ELPRSRYEIRDDHTLKIRKVTAGDMGSYTCVAENMVGKAEASATLTVQEPPHFWK 356 
 1 : : 1 1 II 1 Mill III III II 1: 
Db 277 EEVIQLRSKKYSLLGGSNLLISNVTDDDSGTYTCWTYKNENISASAELTVLVPPWFLNH 336 

Qy 357 PRDQWALGRTVTFQCEATGNPQPAIFWRREGSQNLLFSYQPPQSSSRFSVSQTGDLTIT 416 
 : 1:1 :| I 1 : 1 : 1 : 1 : :| 1 
Db 337 PSNLYAYESMDIEFECAVSGKPVPTVNWMKNGDWIPSDY FQIVGGSNLRIL 388 

Qy 417 NVQRSDVGYYICQTLNVAGSIITRAYLEVTDVIADRPPPVIRQGPVNQTVAVDGTFVLSC 476 
 1 :ll 1: * 
Db 389 GWKSDEGF 397 

Qy 477 VATGSPVPTILWRKDGVLVSTQDSRIKQLENGVLQIRYAKLGDTGRYTCIASTPSGEATW 536 
 ; | | . | . | | 
Db 

Qy 537 SAYIEVQEFGVPVQPPRPTDP--NLIPSAPSKPEVTDVSRNTVTLSWQPNLNSGATPTSY 594 
 1 1 ' 1 1 ' 1 1 • " 1 1 1 1 II 1 1 1 1 • 1 
Db 

Qy 595 IIEAFSHASGSSWQTVAENVKTET--SAIKGLKPNAIYLFLVRAANAYGISDPSQISDPV 652 
 • 1 1 • • ■ ■ • 1 1 1 1 • 1 1 1 1 1 • 1 I'll' 
Db 463 TV--FFSREGDNRERALNTTQPGSLQLTVGNLKPEAMYTFRWAYNEWG---PGESSQPI 517 

Qy 653 KTQDVLPTSQGVDHKQVQRELGNAVLHLHNPTVLSSSSIEVHWTVDQQSQY IQGYK 708 
Db 518 KVA '7 -- TQPELQVPGPVENLHAVST -SPTSILITW -- EPPAYANGPVQGYR 562 

Qy 709 ILYRPSGANHGESDWLVFEVRTPAKNSWIPDLRKGVNYEIKARPFFNEF 758 
 : |||:::: | : : | : : : | | : 
Db 563 L FCTEVSTGKEQNIEV -- DGLSYKLEGLKKFTEYTLRFLAYNRY 604 

Qy 759 -QGADSEIKFAKTLEEAPSAPPQGVTVSKNDGNGTAILVSWQPPPEDTQNGMVQEYKVWC 817 
 l • ■ ' 1 1 • 1 1 1 1 1 1 • • • i • i 1 1 1 1 1 1 1 1 1 1 < ii- 
Db 605 GPGVSTDDITWTLSDVPSAPPQNISLEV--VNSRSIKVSWLPPPSGTQNGFITGYKI-R 661 

Qy 818 LGNETRYHINKTVDGSTFSWIPFLVPGIRYSVEVAASTGAGSGVKSEPQFIQLDAHGNP 877 
 II :h: : : 11=11 :hl I hi 1 : 1 
Db 662 HRKTTRRGEMETLEPNNLWYLFTGLEKGSQYSFQVSAMTVNGTGPPSNWYTAE TP 716 

Qy 878 VSPEDQVSLAQQISDWKQP AFIAGIGAACWIILMVFSIW 917 
 : |: : -| | : :| :| | | :: 
Db 717 ENDLDESQVPDQPSSLHVRPQTNCIIMSWTPPLNPNIWRGYIIGYGVG SPYAET 771 

Qy 918 LYRHRKKRNGLTSTYAGIRKVPSFT--FTPTVTYQRGGEAV---SSGGRPGLLNISEPAA 972 
 : 1:1 , 1 1 :: 1 : : III 1 : : :;l 
Db 772 VRVDSKQR YYSIERLESSSHYVISLKAFNNAGEGVPLYESATTRS ITDPTDPVD 825 

Qy 973 QPWLADTWPNTG 984 
 | | :| :| 
Db 826 YYPLLDDFPTSGPDVSTPMLPPVGVQAVALTHEAVRVSWADNSVPKNQKTSDVRLYTVRW 885 

Qy 985 NNHNDCSISCCTAG NGNSD SNLTTY 1009 
 : : | : | | | | : : 1 1 1 
Db 886 RTSFSASAKYKSEDTTSLSYTATGLKPNTMYEFSVMVTKNRRSSTWSMTAHATTYEAAPT 945 

Qy 1010 SRPADCIANYNNQLD NKQT NLMLPE---STVYGD-- 1040 
 • : 1 1 : : 1 : 1 : : : 1 : 1 1 
Db 946 SAPKDLTVITREGKPRAVIVSWQPPLEANGKITAYILFYTLDKNIPIDDWIMETISGDRL 1005 

Qy 1041 ----VDLSNKINEMKTF--NSPNLKDGRFVNPSGQPTPYATTQLIQSNLSNNMNNGSGDS 1094 
 :lll 1 1 : 1:1 1 i 1 • 1 • ! II 
Db 1006 THQIMDLS--LDTMYYFRIQARNVKG---VGPLSDPILFRTLKVEHPDKMANDQGRHGDG 1060 

Qy 1095 GEKHWKPLGQQKQEVAPVQYNIVEQNKLNKDYRANDTVPPTIPYNQSYDQNTGGSYNSSD 1154 
 | : | : || | ||: 1 1 : : II 
Db 1061 G--YW PVDTNLIDRSTLNEP PIGQMHPPH--GSVTPQK 1094 

Qy 1155 RG SSTSGSQGHKKGARTPKVPKQGGMNWADLLPPP 1189 
 ; | :| || | ::| || || 
Db 1095 NSNLLVITWTVGVLTVLVWIVAVICTRRSSAQQRKKRATHSGSKRKGSQK--DLRPPD 1152 

Qy 1190 PAHPPPHSNSEEYNISVDESYDQEMPCPVPPARMYLQQDELEEEEDERGPTPPVRGAASS 1249 
 ::: :|:| : |: II : 
Db 1153 LWIHHEEMEMKNIEK-PT GTD 1172 

Qy 1250 PAAVSYSHQSTATLTPSPQEELQPMLQDCPEETGHMQHQPDRRRQPVSPPPPPRPISPPH 1309 
 II II III 1 1 : : 1 1 
Db 1173 PAGRDSPIQSCQDLTP VSHSQSETQMGSKSAS H 1205 

Qy 1310 TYGYISGPLVSDMDTDAPEEEEDEADMEVAKMQTRRLLLRGLEQ TPASSVGDL 1362 
 II ; : II 1: II |: :| III 
Qy 


1363 


Db 


1257 


0y 


1407 


Db 


1314 


Qy 


1461 


Db 


1363 


Qy 


1519 


pb 


1410 



- -SGQDTEDAGSSMSTLERSLA- ■ - ■ ARRATRAKLMIPMEAQSSNPAWSAI PVPTL 1256 



- -WGSASEEDNISSGRSSVSSSDGSFFTDADFAQA 1406 

: : I : :|h I 1:1 I 



SQCPRPTSPVSTDSN MSAAVMQKTRP 1460 

I III I: : :| III :: I 
- -CVRPTHPLRSPANPLLPPPMSA- -IEPKVP 1362 



I III 



I :| :|: : I : : II II I : : 
■ -PTLPKTHVKTASLGLAGKARSPLLPVSVPTAPEVSEES 1409 



RESULT 11 
DCC_HUMAN 
ID DCC_HUMAN STANDARD; PRT; 1447 AA. 
AC P43146; 
DT 01-NOV-1995 (Rel. 32, Created) 
DT 01-NOV-1995 (Rel. 32, Last sequence update) 
DT 15-JUL-1999 (Rel. 38, Last annotation update) 

DE TUMOR SUPPRESSOR PROTEIN DCC PRECURSOR (COLORECTAL CANCER SUPPRESSOR). 

GN DCC. 

OS Homo sapiens (Human). 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

RN [1] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=95011532; PubMed=7926722; 

RA Hedrick L., Cho K.R., Fearon E.R., Wu T.-C., Kinzler K.W., 

RA Vogelstein B.; 

RT "The DCC gene product in cellular differentiation and colorectal 

RT tumorigenesis."; 

RL Genes Dev. 8:1174-1183(1994). 

RN [2] 

RP SEQUENCE OF 1-750 FROM N.A. 

RX MEDLINE=90100559; PubMed=2294591; 

RA Fearon E.R., Cho K.R., Nigro J.M., Kern S.E., Simons J.W., 

RA Ruppert J.M., Hamilton S.R., Preisinger A.C., Thomas G., Kinzler K.W., 

RA Vogelstein B.; 

RT "Identification of a chromosome 18q gene that is altered in 
RT colorectal cancers."; 
RL Science 247:49-56(1990). 
RN [3] 

RP SEQUENCE OF 107-472 FROM N.A. (SCRAMBLED EXONS). 
RX MEDLINE=91121517; PubMed=1991322; 
RA Nigro J.M., Cho K.R., Fearon E.R., Kern S.E., Ruppert J.M., 
RA Oliner J.D., Kinzler K.W., Vogelstein B.; 
RT "Scrambled exons."; 
RL Cell 64:607-613(1991). 
RN [4] 

RP GENE STRUCTURE, AND VARIANTS CARCINOMA HIS-1375. 

RX MEDLINE=94245241; PubMed=8188295; 

RA Cho K.R., Oliner J.D., Simons J.W., Hedrick L., Fearon E.R., 

RA Preisinger A.C., Hedge P., Silverman G.A., Vogelstein B.; 

RT "The DCC gene: structural analysis and mutations in colorectal 

RT carcinomas."; 

RL Genomics 19:525-531(1994). 

RN [5] 

RP VARIANT CARCINOMA THR-168, AND VARIANT GLY-201. 

RX MEDLINE=94243823; PubMed=8187090; 

RA Miyake S., Nagai K., Yoshino K., Oto M., Endo M., Yuasa Y.; 

RT "Point mutations and allelic deletion of tumor suppressor gene DCC in 

RT human esophageal squamous cell carcinomas and their relation to 

RT metastasis."; 

RL Cancer Res. 54:3007-3010(1994). 

CC -!- FUNCTION: IMPLICATED AS A TUMOR SUPPRESSOR GENE. 

CC -!- SUBCELLULAR LOCATION: TYPE I MEMBRANE PROTEIN. 



CC -!- TISSUE SPECIFICITY: FOUND IN AXONS OF THE CENTRAL AND PERIPHERAL 
CC NERVOUS SYSTEM AND IN DIFFERENTIATED CELL TYPES OF THE INTESTINE. 

CC -!- DISEASE: COLORECTAL TUMORS THAT LOST THEIR CAPACITY TO 
CC DIFFERENTIATE INTO MUCUS PRODUCING CELLS UNIFORMLY LACK DCC 
CC EXPRESSION. INACTIVATION OF DCC DUE TO ALLELIC DELETION AND/OR 
CC POINT MUTATIONS MAY CAUSE BOTH LYMPHATIC AND HEMATOGENOUS 
CC METASTASIS OF OESOPHAGEAL SQUAMOUS CELL CARCINOMAS. 

CC -!- SIMILARITY: CONTAINS 4 IMMUNOGLOBULIN-LIKE C2-TYPE DOMAINS. 

CC -!- SIMILARITY: CONTAINS 6 FIBRONECTIN TYPE III-LIKE DOMAINS. 

CC -------------------------------------------------------------------------- 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation - 

CC the European Bioinformatics Institute. There are no restrictions on its 

CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 

CC or send an email to license@isb-sib.ch). 

CC -------------------------------------------------------------------------- 

DR EMBL; X76132; CAA53735.1; -. 

DR EMBL; M32292; AAA35751.1; -. 

DR EMBL; M32286; AAA52174.1; -. 

DR EMBL; M32288; AAA52175.1; ALT_SEQ. 

DR EMBL; M32290; AAA52176.1; -. 

DR EMBL; M63696; AAA52177.1; -. 

DR EMBL; M63700; AAA52178.1; -. 

DR EMBL; M63702; AAA52179.1; -. 

DR EMBL; M63718; AAA52180.1; -. 

DR EMBL; M63698; AAA52181.1; -. 

DR PIR; A54100; A54100. 

DR PIR; A40098; A40098. 

DR PIR; A38442; A38442. 

DR HSSP; P56276; 1TLK. 

DR MIM; 120470; -. 

DR INTERPRO; IPR001777; -. 

DR INTERPRO; IPR003006; -. 

DR PFAM; PF00041; fn3; 6. 

DR PFAM; PF00047; ig; 4. 

DR PRINTS; PR00014; FNTYPEIII. 

KW Glycoprotein; Immunoglobulin domain; Transmembrane; Signal; 

KW Anti-oncogene; Disease mutation; Polymorphism. 



FT SIGNAL 1 25 POTENTIAL. 
FT CHAIN 26 1447 TUMOR SUPPRESSOR PROTEIN DCC. 
FT DOMAIN 26 1097 EXTRACELLULAR (POTENTIAL). 
FT TRANSMEM 1098 1122 POTENTIAL. 
FT DOMAIN 1123 1447 CYTOPLASMIC (POTENTIAL). 
FT DOMAIN 54 124 IG-LIKE C2-TYPE DOMAIN. 
FT DOMAIN 154 219 IG-LIKE C2-TYPE DOMAIN. 
FT DOMAIN 254 317 IG-LIKE C2-TYPE DOMAIN. 
FT DOMAIN 345 407 IG-LIKE C2-TYPE DOMAIN. 
FT DOMAIN 426 522 FIBRONECTIN TYPE-III. 
FT DOMAIN 525 618 FIBRONECTIN TYPE-III. 
FT DOMAIN 619 716 FIBRONECTIN TYPE-III. 
FT DOMAIN 722 816 FIBRONECTIN TYPE-III. 
FT DOMAIN 840 940 FIBRONECTIN TYPE-III. 
FT DOMAIN 941 1042 FIBRONECTIN TYPE-III. 
FT DISULFID 61 117 BY SIMILARITY. 
FT DISULFID 161 212 BY SIMILARITY. 
FT DISULFID 261 310 BY SIMILARITY. 
FT DISULFID 352 400 BY SIMILARITY. 
FT CARBOHYD 94 94 N-LINKED (GLCNAC...) (POTENTIAL). 
FT CARBOHYD 299 299 N-LINKED (GLCNAC...) (POTENTIAL). 
FT CARBOHYD 318 318 N-LINKED (GLCNAC...) (POTENTIAL). 
FT CARBOHYD 478 478 N-LINKED (GLCNAC...) (POTENTIAL). 
FT CARBOHYD 628 628 N-LINKED (GLCNAC...) (POTENTIAL). 
FT CARBOHYD 702 702 N-LINKED (GLCNAC...) (POTENTIAL). 
FT VARIANT 168 168 M -> T (IN OESOPHAGEAL CARCINOMA). 
FT /FTId=VAR_003909. 
FT VARIANT 201 201 R -> G. 
FT /FTId=VAR_003910. 
FT VARIANT 1375 1375 P -> H (IN A COLORECTAL CARCINOMA). 
FT /FTId=VAR_003911. 
FT CONFLICT 138 138 MISSING (IN REF. 3). 
FT CONFLICT 233 329 MISSING (IN REF. 3). 






FT CONFLICT 421 421 MISSING (IN REF. 3). 

SQ SEQUENCE 1447 AA; 158456 MW; 4A8612766ED0471F CRC64; 



Query Match 7.7%; Score 667.5; DB 1; Length 1447; 

Best Local Similarity 22.8%; Pred. No. 1.3e-22; 

Matches 351; Conservative 185; Mismatches 570; Indels 431; Gaps 

Qy 53 SLGYTGSRLRQE DFPPRIVEHPSDLIVSKGEPATLNCKAEGRPTPTIEWYKGGER 107 

III HI : : I I ■ : : |: I |: II III I I : 

Db 120 SLGDSGSIISRTAKVAVAGPLRFLSQTESVT'AFMGDTVLLKCEVIGEPMPTIHWQKNQQD 179 

Qy 108 VETDKDDPRSHRMLLPSGSLFFLRIVHGRKSRPDEGVYVCVARNYLGEAVSHNASLEVAI 167 

: I I ::|MI:| I: I I hi I III : I II I 
Db 180 LTPIPGDSRV- -WLPSGALQISRLQPG DIGIYRCSARNPASSRTGNEA- -EVRI 230 

Qy 168 LRDD FRQNPSDVMVAVGEPAVMECQPPRGHPEPTISWKKDGSPLDDKDERITI 220 

»l I II ll:|: ! |: l|:|| |:| |: :| : : : :: :: 

231 LSDPGLHRQLYFLQRPSNWAIEGKDAVLEC -CVSGYPPPSFTWLRGEEVIQLRSKKYSL 289 

Qy 221 RGG-KLMITYTRKSDAGKyVCyGTNMVGERESEVAELTVLERPSFVKRPSNLAVTVDDSA 279 

II hh hi I ll| I II lllllh I h Nil 
Db 290 LGGSNLLISNVTDDDSGMyTCWT-YKNENISASAELTVLVPPWFLNHPSNLYAYESMDI 348 



Db 

Qy 

Db 

Qy 
Db 

j Qy 



280 EFKCEARGDPVPTVRWRKD-DGELPKSRYEIRDDHTLKIRKVTAGDMGSYTCVAENMVGK 338 

Ihl I Mill i h I > :| ::| h! I II I lllll I 
349 EFECTVSGKPVPTVNWMKNGDWIPSDYFQIVGGSNLRILGWKSDEGFYQCVAENEAGN 408 

339 AEASATLTVQEP • ■ PHFW " - -ftPRDQWAL - -GRTVIFQ — CEATGNPQP • AIFWRR 386 

h II I I :| I I | III I I I I IMI I :h I 

409 AQTSAQLIVPKPAIPSSSVLPSAPRDWPVLVSSRFVRLSWRPPAEAKGNIQTFTVFFSR 468 

387 EGSQNLLFSYQPPQSSSRFSVSQTG - -DLTITNVQRSDVGYYICQTLNVAG S 436 

II : ;!| I ||: |:: : : | | 

469 EGDNR ERALNTTOPGSLQLTVGNLKPEAMYTFRVVAYNEWGPGESSQPIK 518 

437 IITRAYLEV TDVIADRPPPVIRQGPV NQTVAV 468 

: I: IH j I II III I : I 

519 VATQPELQVPGPVENLQAVSTSPTSILITWEPPAYANGPVQGYRLFCTBVSIGKEQNIEV 578 

469 DGTFVLSCVATGSPVPT ! - ILWRKDGVLVSTQDSRIKQL 505 

II II I I i r : : I III I : I 
579 DG- - -LSYKLEGLKKFTEYSLRFLAYNRYGPGVSTDDITWTLSDVPSAPPQNVSLEWN 635 

506 ENGVL— QIRYAKLGDTGRYTCIASTP SGEATWSAY — 539 

:M : :ll: I I : 'I II 

636 SRSIKVSWLPPPSGTQNGFITGYKIRHRKTTRRGEMETLEPNNLWYLFTGLEKGSQYSFQ 695 

540 — IEVQEFGVPVQ PPRPTDPNLIPSAPSKPEVTDVSRNTVTLSWQPNLNSGAT 590 

: I II I ; I : :l II I I : :|| I II 

696 VSAMTVNGTGPPSNWYTAETPENDLDESQVPDQPSSLHVRP-QTNCIIMSWTPPLNPNIV 754 

591 PTSYIIEAFSHASGSSW-QTVAENVKTETSAIRGLKPNAIYLFLVRAANAYGISDPSQIS 649 
III : II : :|| : I :|: h :: h ::| I I I : 

755 VRGYII— GYGVGSPYAETVRVDSKQRYYSIERLESSSHYVISLKAFNKAGEGVP-LY 809 

650 DPVKTQDVLPTSQGVDHKQVQRELGNAVLHLHNP TVLSSSSIEVHWT 696 

: I: : : II: : : :| I I |: :: I I 

810 ESATTRSITDPTDPVDYYPLLDDFPTSVPDLSTPMLPPVGVQAVALTHDAVRVSWADNSV 869 

697 -VDQQSQYIQGYKILYRPSGANHGESDWLVFEVRTPAKNSWIPDLRKGVNYEIKARPFF 755 

:h: :: | : :| | : : :: | |: || 
870 PKNQKTSEVRLYTVRWRTSFSASAK Y KSEDTT SLS YT ATGLKPNTMYEFS VMVTK 924 

756 NEFQGADSEIKFAKTLEEAPSAPPQGVTVSKNDGNGTAILVSWQPPPEDTQNGMVQEYKV 815 
I I I I I lh: I: II : |::|||||| I II : I : 

925 NRRSSTWSMTAHATTYEAAPTSAPKDFTVITREGKPRAVIVSWQPPLE- -ANGKITAYIL 982 

816 WCLGNETRYHINK TVDGSTFSWIPFLVPGIRYSVEVAASIGAGSGVKSE 865 

: I ::l h I : I I I : I I I h 

983 F YTLDKNIPIDDWIMETISGDRLTHQIMDLNLDTMYYFRIQARNSKGVGPLSD 1035 

866 PQFIQL DAHGN — PVSPE- -DQVSLAQQISDWKQP 897 



I : lh II h :| : : I 

Db 1036 PILFRTLKVEHPDKMANDQGRHGDGGYWPVDTNLIDRSTLNEPPIGQMHPPHGSVTPQKN 1095 

Qy 898 AFIAGIGAACWIILMVFSIWLYR HRKKRNGLTSTYAGIRKVPSFTFTPTV 947 

: il :::" :: I llll : II II I 
Db 1096 SNLLVIIWTVGVITVLWVIVAVICTRRSSAQQRKKR— ATHSAGKRKGSQKDLRPPD 1152 

Qy 948 TYQRGGEAVSSGGRPGLLNISEPAAQPWLADTWPNTGNNHNDCSISCCTAGNGNSDSNLT 1007 

: I : II :h I III :|| 

Db 1153 LWIHHEEM------EMKNIEKPS GTDPAGRDSPIQSC QDLT 1187 

Qy 1008 TYSRPADCIANYNNQLDNKQTNLMLPESTVYGDVDLSNKINEMKTFNSPNLK DGR 1062 

I :■* II :| |: :: I :| : :| | | : 
Db 1188 PVSH SQSETQLGSKSTSHSGQDTEEAGS-SMSTLERSLAARRAPRAKLMIPMDAQ 1241 

Qy 1063 FVNP— SGQPTPYATTQLIQSNLSNNMNNG—SGDSGEKHWKPLGQQKQEVAPVQYNI 1116 

II! | : III : : || : 

Db 1242 SNNPAWSAIPVP TLESAQYPGILPSPTCGYPH PQFTLRPVPF-- 1284 

Qy 1117 VEQNKLNKDYRANDTVPPTIPYNQSYDQNTGGSYNSSDRGSSTSGSQGHKKGARTPKVPK 1176 

lh ' .1 111 II :| I : I 
Db 1285 PTL SVDRGFGAGRSQSVSEGPTTQQPP - 1311 

Qy 1177 QGGMNWADLLPPPPAHPPPHSNSEEYNISVDESYDQEMPCP— VPPARMYLQQDELEEE 1233 

:HI I Ml | :| | : 
Db 1312 MLPP— SQPEHSSSEE APSRTIPTACV 1336 

Qy 1234 EDERGPTPPVRGAASSPAAVSYSHQSTATLTPSPQEELQPMLQDCPEETGHMQHQPDRRR 1293 

II hi h I I I ::| : I : II 
Db 1337 RPTHPLRSFAN PLLPPPMSAIEPKVPYTP LLSQPG — 1371 

Qy 1294 QPVSPP : PPPRPISPPHTYGYISGPLVSDMDTDAPEE EED 1332 

II ' I hi I : I lh i I hi 

Db 1372 ■ PTLPKTHVKTASLGLAGKARSPLIPVSVP TAPEVSE - ESHKPT EDS ANVYEQD 1423 

Qy 1333 EADMEVAKMQTRRLLLRGLEQTPASSVGDLESSVTGS 1369 

: ::| ::■ |:: I :::||| 
Db 1424 DLSEQMASLEG— LMKQL NAITGS 1445 



RESULT 12 
AXO1_HUMAN 

ID AXO1_HUMAN STANDARD; PRT; 1040 AA. 

AC Q02246; 

DT 01-JUL-1993 (Rel. 26, Created) 

DT 01-JUL-1993 (Rel. 26, Last sequence update) 

DT 15-JUL-1999 (Rel. 38, Last annotation update) 

DE AXONIN-1 PRECURSOR (AXONAL GLYCOPROTEIN TAG-1) (TRANSIENT AXONAL 

DE GLYCOPROTEIN 1). 

GN TAX1 OR TAG1. 

OS Homo sapiens (Human). 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

RN [1] 

RP SEQUENCE FROM N.A. 

RC TISSUE=BRAIN; 

RX MEDLINE=93145965; PubMed=8425542; 

RA Hasler T.H., Rader C., Stoeckli E.T., Zuellig R.A., Sonderegger P.; 

RT "cDNA cloning, structural features, and eucaryotic expression of 

RT human TAG-1/axonin-1."; 

RL Eur. J. Biochem. 211:329-339(1993). 

RN [2] 

RP SEQUENCE FROM N.A. 

RC TISSUE=BRAIN; 

RX MEDLINE=94140354; PubMed=8307567; 

RA Tsiotra C.P., Karagogeos D., Theodorakis K., Michaelidis M.T., 

RA Modi W.S., Furley J.A., Jessel M.T., Papamatheakis J.; 

RT "Isolation of the cDNA and chromosomal localization of the gene 

RT (TAX1) encoding the human axonal glycoprotein TAG-1."; 

RL Genomics 18:562-567(1993). 

CC -!- FUNCTION; MAY PLAY A ROLE IN THE INITIAL GROWTH AND GUIDANCE OF 

CC AXONS. MAY BE INVOLVED IN CELL ADHESION. 

CC -!• SUBCELLULAR. LOCATION: ATTACHED TO THE NEURONAL MEMBRANE BY A 
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GPI -ANCHOR AND IS ALSO RELEASED FROM NEURONS. 
-I- SIMILARITY: CONTAINS 6 IMMUNOGLOBULIN-LIKE C2-TYPE DOMAINS, 
-I- SIMILARITY: CONTAINS 4 FIBRONECTIN TYPE III-LIKE DOMAINS. 

This SWISS-PROT entry is copyright. It is produced through a collaboration 
between the Swiss Institute of Bioinformatics and the EMBL outstation - 
the European Bioinformatics Institute. There are no restrictions on its 
use by non-profit institutions as long as its content is in no way
modified and this statement is not removed. Usage by and for commercial
entities requires a license agreement (See http://www.isb-sib.ch/announce/
or send an email to license@isb-sib.ch).

EMBL; X68274; CAA48335.1; -. 
EMBL; X67734; CAA47963.1; -. 
PIR; S28830; S28830. 
MIM; 190197; -. 
INTERPRO; IPR001777; -. 
INTERPRO; IPR003006; -. 
PFAM; PF00041; fn3; 4. 
PFAM; PF00047; ig; 6. 

Immunoglobulin domain; Glycoprotein; Signal; GPI-anchor;
Cell adhesion; Repeat. 



FT   SIGNAL        1     28
FT   CHAIN        29   1012       AXONIN-1.
FT   PROPEP     1013   1040       REMOVED IN MATURE FORM (POTENTIAL).
FT   DOMAIN       54    118       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      148    216       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      254    313       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      341    402       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      433    495       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      523    594       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      606    612       GLY/PRO-RICH.
FT   DOMAIN      611    706       FIBRONECTIN TYPE-III.
FT   DOMAIN      714    809       FIBRONECTIN TYPE-III.
FT   DOMAIN      816    908       FIBRONECTIN TYPE-III.
FT   DOMAIN      917   1003       FIBRONECTIN TYPE-III.
FT   SITE        794    796       CELL ATTACHMENT SITE (BY SIMILARITY).
FT   CARBOHYD     76     76       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    198    198       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    204    204       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    461    461       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    477    477       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    498    498       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    525    525       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    830    830       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    918    918       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    940    940       N-LINKED (GLCNAC...) (POTENTIAL).
FT   LIPID      1012   1012       GPI-ANCHOR (POTENTIAL).



SEQUENCE 1040 AA; 113393 MW; 254E78DD3C28EFB6 CRC64; 



Query Match 7.5%; Score 653.5; DB 1; Length 1040;

Best Local Similarity 24.4%; Pred. No. 3.7e-22;
Matches 227; Conservative 115; Mismatches 400; Indels 187; Gaps 26;

Qy 63 QEDFPPRIVEHPSDLIV- - -SKGEPATLNCKAEGRPTPTIEWYKGGERVETDKDDPRSHR 119 

Ml:!:: I I I I :| I I I I : I :| I 
Db 32 QTTFGPVFEDQPLSVLFPEESTEEQVLLACRARASPPATYRWKMNGTEM-- 



Qy 120 MLLPSGSLFFLRIVHGRKSRPDEGVYVCVARNYLGEAVSHNASLEVAILRDDFRQNPSDV 179 

I: |:| :: I III hi I :| II II |:: :: I 

Db 89 QLV -GGNL — VIMNPT KAQDAGVYQCLASNPVGT WSREAILRFGFLQEFSKEERDPV 143 

Qy 180 MVAVGEPAVMECQPPRGHPEPTISWKKDGSP- • -LDDKDERITIRGGKLMITYTRKSDAG 236 

! :: I II :| : I : I I II! I II! 

Db 144 KAHEGWGVMLPCNPPAHYPGLSYRWLLNEFPNFIPTDGRHFVSQTTGNLYIARTNASDLG 203 

Qy 237 KYVCVGTNMVGERESEV AELTVLERPSFVKR-PSNLAVTVDDSAEFKCEA 285 

II: I: : I II I III I h I HI 

Db 204 NYSCLATSHMDFSTKSVFSKFAQLNLAAEDTRLFAPSIKARFPAETYALVGQQVTLECFA 263 

Qy 286 RGDPVPTVRWRKDDGELPKSRYEIRDDHTLKIRKVTAGDMGSYTCVAENMVGKAEASATL 345 
1:111 ::IM II I I : Ihl h I hi I III h : 



Db 264 FGNPVPRIKWRKVDGSL- - SPQWTTAEPTLQIPSVSFEDEGTYECEAENSKGRDTVQGRI 321 

Qy 346 TVQEPPHFWKPRDQWALGRTVTFQCEATGNPQPAIFWRREGSQNLLFSYQPPQSSSRF 405 

II I :: , I :| : : I I I hi : I I I :l hi 

Db 322 IVQAQPEWLKVISDTEADIGSNLRWGCAAAGKPRPTVRWLRNG EPLASQNRV 373 

Qy 406 SVSQTGDLTISNVQRSDVGYYICQTLNVAGSIITKAYLEVTDVIADRPPPVIRQGPVNQT 465 

I III : : II I I 11:1 I I I : I III: 
Db 374 EV-LAGDLRFSKLSLEDSGMYQCVAENKHGTIYASAELAVQALAPD FRLNPVRRL 427 

Qy 466 V-AVDGTTOSCVATGSPVPTILWRKDGVLVSTQDSRIKQLENGVLQIRYAKLGDTGRY 523 

: I I :: I :| Mill : II: :| I II I hi 
Db 428 IPAARGGEILIPCQPRAAPKAWLWSK-GTEILVNSSRVTVTPDGTLIIRNISRSDEGKY 486 

Qy 524 TCIASTPSGEATWSAYIEVQEFGVPVQPPRPTDPNL 559 

II I hi : : h: I I II 

Db 487 TCFAENFMGKANSTGILSVRDATKITLAPSSADINLGDNLTLQCHASHDPTMDLTFTWTL 546 

Qy 560 r 559 

Db 547 DDFPIDFDKPGGHYRRTNVKETIGDLTILNAQLRHGGKYTCMAQTWDSASKEATVLVRG 606 

Qy 560 IPSAPSKPEVTDVSRNTVTLSWQPNLNSGATPTSYIIEAFSHASGSSWQTVAENV 614 

I I. Ph h III :: : I ::l : :l h I I 
Db 607 PPGPPGGVWRDIGDTTIQLSWSRGFDNHSPIAKYTLQARTPPAG-KWKQVRTNPANIEG 665 

Qy 615 KTETSAIKGLKPNAIYLFLVRAANAYGISDPSQISDPVKTQDVLPTSQGVDHKQVQRELG 674 

lh : 11:1 I I I hi I :H I ::h: h 
Db 666 NAETAQVLGLTPWMDYEFRVIASNILGTGEPSGPSSKIRTREAAPSVA 713 

Qy 675 NAVLHLHNPTVLSS SSIEYHWTVDQQSQYIQ GYKILYRPSGANHGESDWL 724 

I: II : 1 : 1 1 hi II : :| h I :: 

Db 714 PSGLSGGGGAPGELIVNWT-PMSREYQNGDGFGYLLSFRRQGSTHWQT— 760 

Qy 725 VFEVRTPAKNSWI — PDLRKGVNYEIKARPFFNEFQGADSEIKFAKTLEEAPSAPPQ 780 

II:: :| '-hi I : I :l Mill 

Db 761 - - - ARVPGADAQYFVYSNESVRPYTPFEVKIRSYNRRGDGPESLTALVYSAEEEPRVAPT 817 

Qy 781 GV- ■ -TVSKNDGNGTAILVSWQPPPEDTQNGMVQEYKV- -WCLGNETRYHINKTVDGSTF 835 

I II : ! : I hhl :| lh: h= I h= I 
Db 818 KVWAKGVSSSEMN VTWEPVQQD-MNGILLGYEIRYWRAGDKEAAADRVRTAGLDT 871 

Qy 836 SWIPFLVPGIRYSVEVAASTGAGSGVKS 864 

I : II :l I 1 I Ihl I 
Db 872 SARVSGLHPNTKYHVTVRAYNRAGTGPAS 900 
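
The percentages printed with each result appear to be simple ratios, and can be checked by hand. The sketch below rests on two assumptions that the listing itself does not state: that Query Match is the hit score divided by the perfect score of the query (8724 for US-09-540-245A-18), and that Best Local Similarity is the number of identical matches divided by the total number of alignment columns (matches + conservative + mismatches + indels). The function names are hypothetical and are not part of GenCore.

```python
def query_match_pct(score: float, perfect_score: float) -> float:
    """Hit score as a percentage of the query's perfect (self) score.

    Assumption: this is how the 'Query Match' column is derived.
    """
    return round(100.0 * score / perfect_score, 1)


def best_local_similarity_pct(matches: int, conservative: int,
                              mismatches: int, indels: int) -> float:
    """Identical positions as a percentage of all alignment columns.

    Assumption: the denominator is the sum of the four printed counts.
    """
    columns = matches + conservative + mismatches + indels
    return round(100.0 * matches / columns, 1)


# RESULT 12 (human axonin-1), using the figures printed above.
print(query_match_pct(653.5, 8724))                   # 7.5
print(best_local_similarity_pct(227, 115, 400, 187))  # 24.4
```

Under the same assumptions the figures for RESULT 13 (score 652; 234, 117, 428 and 176) give 7.5 and 24.5, in agreement with the values printed for that result.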



RESULT 13 
AXO1_RAT

ID AXO1_RAT STANDARD; PRT; 1040 AA.

AC P22063; 

DT 01-AUG-1991 (Rel. 19, Created) 

DT 01-AUG-1991 (Rel. 19, Last sequence update) 

DT 15-JUL-1999 (Rel. 38, Last annotation update) 

DE AXONIN-1 PRECURSOR (AXONAL GLYCOPROTEIN TAG-1). 

GN TAX1.

OS Rattus norvegicus (Rat).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Rattus. 

RN [1] 

RP SEQUENCE FROM N.A., AND SEQUENCE OF 31-41.

RC TISSUE=SPINAL CORD;

RX MEDLINE=90199890; PubMed=2317872;

RA Furley A.J., Morton S.B., Manalo D., Karagogeos D., Dodd J.,

RA Jessell T.M.;

RT "The axonal glycoprotein TAG-1 is an immunoglobulin superfamily 

RT member with neurite outgrowth-promoting activity."; 

RL Cell 61:157-170(1990). 

CC -!- FUNCTION: MAY PLAY A ROLE IN THE INITIAL GROWTH AND GUIDANCE OF

CC AXONS. MAY BE INVOLVED IN CELL ADHESION.

CC -!- SUBCELLULAR LOCATION: ATTACHED TO THE NEURONAL MEMBRANE BY A

CC GPI-ANCHOR AND IS ALSO RELEASED FROM NEURONS.

CC -!- TISSUE SPECIFICITY: IN NEURAL TISSUES IN EMBRYOS, AND IN ADULT
CC BRAIN, SPINAL CORD AND CEREBELLUM.
CC -!- DEVELOPMENTAL STAGE: TRANSIENTLY EXPRESSED ON A SUBSET OF AXONS
CC IN THE DEVELOPING RAT NERVOUS SYSTEM.
CC -!- SIMILARITY: CONTAINS 6 IMMUNOGLOBULIN-LIKE C2-TYPE DOMAINS.
CC -!- SIMILARITY: CONTAINS 4 FIBRONECTIN TYPE III-LIKE DOMAINS.

This SWISS-PROT entry is copyright. It is produced through a collaboration 
between the Swiss Institute of Bioinformatics and the EMBL outstation - 
the European Bioinformatics Institute. There are no restrictions on its 
use by non-profit institutions as long as its content is in no way 
modified and this statement is not removed. Usage by and for commercial 
entities requires a license agreement (See http://www.isb-sib.ch/announce/ 
or send an email to license@isb-sib.ch). 

EMBL; M31725; AAA42201.1; -. 
PIR; A34695; A34695. 
INTERPRO; IPR001777; -. 
INTERPRO; IPR003006; -. 
PFAM; PF00041; fn3; 4.
PFAM; PF00047; ig; 6. 

Immunoglobulin domain; Glycoprotein; Signal; GPI-anchor;
Cell adhesion; Repeat.
SEQUENCE 1040 AA; 113042 MW; 6E707EF6614CB4FB CRC64;



Query Match 7.5%; Score 652; DB 1; Length 1040;

Best Local Similarity 24.5%; Pred. No. 4.3e-22;

Matches 234; Conservative 117; Mismatches 428; Indels 176; Gaps 27; 

Qy 52 NSLGYTGSRLRQEDFPPRIVEHPSDLIV---SKGEPATLNCKAEGRPTPTIEWYKGGERV 108

:| I" :: Mill: I : II hi II I I 
Db 23 SSPGWSFAQGTPATFGPIFEEQPIGLLFPEESAEDQVTLACRARASPPATYRWKMNG--- 79

Qy 109 ETDKD-DPRSHRMLLPSGSLFFLRIVHGRKSRPDEGVYVCVARNYLGEAVSHNASLEVAI 167

II : :| I I: hi I III hi I :| II II 

Db 80 -TDMNLEPGSRHQLM-GGNL— -VIMSPTKTQDAGVYQCLASNPVGTWSKEAVLRFGF 133

Qy 168 LRDDFRQNPSDVMVAVGEPAVMECQPPRGHPEPTISWKKDGSP— LDDKDERITIRGGK 224

h: :: I , I : : I 1 1 : 1 : I : I | :: | 
Db 134 LQEFSKEERDPVKTHEGWGVMLPCNPPAHYPGLSYRWLLNEFPNFIPTDGRHFVSQTTGN 193

Qy 225 LMITYTRKSDAGKYVCVGTNMVGERESEV AELTVLERPSFVKR - PSNLAV 273

I I I II I I h h : I II I II I I 

Db 194 LYIARTNASDLGNYSCLATSHMDFSTKSVFSKFAQLNLAAEDPRLFAPSIKARFPPETYA 253 

Qy 274 TVDDSAEFKCEARGDPVPTVRWRKDDGELPKSRYEIRDDHTLKIRKVTAGDMGSYTCVAE 333

I :! I hill ::IM II II : Ihl h I hi I II 



Db 254 LVGQQVTLECFAFGNPVPRIKWRKVDGSL--SPQWATAEPTLQIPSVSFEDEGTYECEAE 311 

Qy 334 NMVGKAEASATLTVQEPPHFWKPRDQWALGRTVTFQCEATGNPQPAIFWRREGSQNLL 393 

I h II I :: I :| : : I I I hi : I I I 

Db 312 NSKGRDTVQGRIIVQAQPEWLKVISDTEADIGSNLRWGCAAAGKPRPMVRWLRNG 366 

Qy 394 FSYQPPQSSSRFSVSQTGDLTITNVQRSDVGYYICQTLNVAGSIITKAYLEVTDVIADRP 453 

:| I :.l I III : : I I I I I hi I I I : I 
Db 367 ---EPLASQNRVEV-LAGDLRFSKLSLEDSGMYQCVAENKHGTIYASAELAVQALAPD-- 42C 

Qy 454 PPVIRQGPVNQTV-AVDGTFVLSCVATGSPVPTILWRKDGVLVSTQDSRIKQLENGVLQ 511 

II II V I I : I :| llll II: =h :| I 
Db 421 — FRQNPVRRLIPAARGGEISILCQPRAAPKATILWSK-GTEILGNSTRVTVTSDGTLI 476 

Qy 512 IRYAKLGDTGRYTCIASTPSGEATWSAYIEVQEFGVPVQPPRPTDPNL 559 

II I hill I hi : : h: I I h 

Db 477 IRNISRSDEGKYTCFAENFMGKANSTGILSVRDATKITLAPSSADINVGDNLTLQCHASH 536 



Qy 560 ■ 



■ 559 



Db 537 DPTMDLTFTWTLDDFPIDFDKPGGHYRRASAKETIGDLTILNAHVRHGGKYTCMAQTWD 596 

Qy 560 IPSAPSKPEVTDVSRNTVTLSWQPNLNSGATPTSYIIEAFSHASGSSW 607 

'II I h II III :: : I ::| : II I 
Db 597 GTSKEATVLVRGPPGPPGGWVRDIGDTTVQLSWSRGFDNHSPIAKYTLQARTPPSG-KW 655 

Qy 608 QTVAENV KTETSAIKGLKPNAIYLFLVRAANAYGISDPSQISDPVKTQDVLPTSQ 662 

: I I : lh : II I I I I hi I :|| I ::|:: :|: 
Db 656 KQVRTNPVNIEGNAETAQVLGLMPWMDYEFRVSASNILGTGEPSGPSSKIRTKEAVPSVA 715 

Qy 663 GVDHKQVQRELGNAVLHLHNPTVLSS SSIEVHWT "VDQQSQYIQGYKILYRPSGA 716 

h II : ::|| I :: I h II 

Db 716 - PSGLSGGGGAPGELIINWTPVSREYQNGDGFG-YLLSFR 753 

Qy 717 NHGESDWLVFEVRTPAKNSWI — PDLRKGVNYEIKARPFFNEFQGADSEIKFAKTLE 772 

I I I < I I :: :: :|:| I : I :|' : I 

Db 754 RQGSSSWQT-ARVPGADAQYFVYGNDSIQPYTPFEVKIRSYNRRGDGPESLTALVYSAE 811 

Qy 773 EAPSAPPQGVTVSKNDGNGTAILVSWQPPPEDTQNGMVQEYKV-WCLGNETRYHINKTV 830 

III I : : : llhl :| ||:: |:: I |: 
Db 812 EEPRVAP--AKVWAKGSSSSEMNVSWEPVLQD-MNGILLGYEIRYWKAGDNEAAADRVRT 868 

Qy 831 DGSTFSWIPFLVPGIRYSVEVAASTGAGSGVKSEPQFIQLDAHGNPVSPEDQVS 885 

I I : I I :| I I I Ihl II: I I :| 
Db 869 AGLDTSARVTGLNPNTKYHVTVRAYNRAGTGPAS-PSADAMTVKPPPRRPPGNIS 922 



RESULT 14 
AXO1_CHICK

ID AXO1_CHICK STANDARD; PRT; 1036 AA.

AC P28685;

DT 01-DEC-1992 (Rel. 24, Created)

DT 01-DEC-1992 (Rel. 24, Last sequence update)

DT 15-JUL-1999 (Rel. 38, Last annotation update)

DE AXONIN-1 PRECURSOR.

OS Gallus gallus (Chicken).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Archosauria; Aves; Neognathae; Galliformes; Phasianidae; Phasianinae;

OC Gallus.

RN [1]

RP SEQUENCE FROM N.A., AND PARTIAL SEQUENCE.

RC TISSUE=BRAIN;

RX MEDLINE=92174893; PubMed=1311675;

RA Zuellig R.A., Rader C., Schroeder A., Kalousek M.B.,

RA von Bohlen Und Halbach F., Osterwalder T., Inan C., Stoeckli E.T.,

RA Affolter H.-U., Fritz A., Hafen E., Sonderegger P.;

RT "The axonally secreted cell adhesion molecule, axonin-1. Primary

RT structure, immunoglobulin-like and fibronectin-type-III-like domains

RT and glycosyl-phosphatidylinositol anchorage.";

RL Eur. J. Biochem. 204:453-463(1992).

CC -!- FUNCTION: AXON-ASSOCIATED CELL ADHESION MOLECULE (AXCAM) WHICH
CC PROMOTES NEURITE OUTGROWTH BY INTERACTION WITH THE AXCAM L1 (G4)
CC OF NEURITIC MEMBRANE.
CC -!- SUBCELLULAR LOCATION: ATTACHED TO THE NEURONAL MEMBRANE BY A
CC GPI-ANCHOR.

CC -!- SIMILARITY: CONTAINS 6 IMMUNOGLOBULIN-LIKE C2-TYPE DOMAINS.
CC -!- SIMILARITY: CONTAINS 4 FIBRONECTIN TYPE III-LIKE DOMAINS.

This SWISS-PROT entry is copyright. It is produced through a collaboration 
between the Swiss Institute of Bioinformatics and the EMBL outstation - 
the European Bioinformatics Institute. There are no restrictions on its
use by non-profit institutions as long as its content is in no way
modified and this statement is not removed. Usage by and for commercial
entities requires a license agreement (See http://www.isb-sib.ch/announce/
or send an email to license@isb-sib.ch).



DR EMBL; X63101; CAA44815.1; -.
DR PIR; S22128; S22128.
DR PIR; S22383; S22383.
DR HSSP; P56276; 1TLK.
DR INTERPRO; IPR001777; -.
DR INTERPRO; IPR003006; -.
DR PFAM; PF00041; fn3; 3.
DR PFAM; PF00047; ig; 6.

KW Immunoglobulin domain; Glycoprotein; Signal; GPI-anchor;
KW Cell adhesion; Repeat.



FT   SIGNAL        1     23       OR 25 (POTENTIAL).
FT   CHAIN        24   1036       AXONIN-1.
FT   PROPEP        ?   1036       REMOVED IN MATURE FORM.
FT   DOMAIN       49    113       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      143    211       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      249    308       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      336    397       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      428    490       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      518    589       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      599    608       HINGE (POTENTIAL).
FT   DOMAIN      601    607       GLY/PRO-RICH.
N-LINKED (GLCNAC...) (POTENTIAL).
SQ SEQUENCE 1036 AA; 113301 MW; 08B80143BE779794 CRC64;



Query Match 7.3%; Score 633.5; DB 1; Length 1036;

Best Local Similarity 23.2%; Pred. No. 2.8e-21;
Matches 220; Conservative 124; Mismatches 402; Indels 201; Gaps 25;

Qy 66 FPPRIVEHPSDL1VSKG— EPATLNCKAEGRPTPTIEWYKGGERVETDKDDPRSHRMLL 122 

I I I llhl I I I I II 

Db 30 YGPVFEEQPAHTLFPEGSAEEKVTLTCRARANPPAT YRWKMNGTELKMGPDS — RYRL 85 

Qy 123 PSGSLFFLRIVHGRKSRPDEGVYVCVARNYLGEAVSHNASLEVAILRDDFRQNPSDVMVA 182 

:| I I I I III I I II III |:: : I : 

Db 86 VAGDL — VISNPVKAKDAGSYQCVATNARGTWSREASLRFGFLQEFSAEERDPVKIT 141 

Qy 183 VGEPAVMECQPPRGHPEPTISWKKDGSP- - -LDDKDERITIRGGKLMITYTRKSDAGKYV 239 

I : I || :| : I : I I :: I I I III I I 
Db 142 EGWGVMFTCSPPPHYPALSYRWLLNEFPNFIPADGRRFVSQTTGNLYIAKTEASDLGNYS 201 

Qy 240 CVGTNMVG ERESEVAELTVLERPSF-VKRPSNLAVTVDDSAEFKCEARGD 288 

II:: : I II III:: :|| |: 

Db 202 CFATSHIDFITKSVFSKFSQLSLAAEDARQYAPSIKAKFPADTYALTGQMVTLECFAFGN 261 

Qy 289 PVPTVRWRKDDGELPKSRYEIRDDHTLKIRKVTAGDMGSYTCVAENMVGKAEASATLTVQ 348 
III ::||| II :: : : I |: I I |:| .1 III: h : : 



Db 262 PVPQIKWRKLDGS--QTSKWLSSEPLLHIQNVDFEDEGTYECEAENIKGRDTYQGRIIIH 319 

Qy 349 EPPHFWKPRDQWALGRTVTFQCEATGNPQPAIFWRREGSQNLLFSYQPPQSSSRFSVS 408 

I :: \ :| : : I hi |:||: I hi II I :l II 

Db 320 AQPDWLDVITDTEADIGSDLRWSCVASGKPRPAVRWLRDG QPLASQNRIEVS 371 

Qy 409 QTGDLT ITNVQRSDVGY Y ICQTLNVAGS I IT KAY LEVTDVI ADBPP PVI RQG PVNQTV - - 466 

hi : ::' I I I I I h: I I I : I I II : : 
Db 372 -GGELRFSKLVLEDSGMYQCVAENKHGTVYASAELTVQALAPD FRLNPVKRLIPA 425 

Qy 467 AVDGTFVLSCVATGSPVPTILWRKDGVLVSTQDSRIKQLENGVLQIRYAKLGDTGRYTCI 526 

:| hll I I : I II: :| I :: I hill 

Db 426 ARSGKVIIPCQPRAAPKATVLWTK-GTELLTNSSRVTITADGTLILQNISKSDEGKYTCF 484 

Qy 527 ASTPSGEATWSAYIEVQEFGVPVQPPRPTDPNL 559 

I hi : : h: II h 
Db 485 AENFMGKANSTGILSVRDATKITLAPSSADINVGENLTLQCHASHDPTMDLTFTWSLDDF 544 

Qy 560 IPS 562 

f I 
Db 545 PIDLDKSEGHYRRASVKEAVGDLAIVNAQLKHSGRYTCTAQTWDSTSESATLTVRGPPG 604 

Qy 563 APSKPEVTDVSRNTVTLSWQPNLNSGATPTSYIIEAFSHASGSSWQTVAENV KTE 617 

I I |:{ II III :: : I III : I : h : I I 
Db 605 PPGGVWRDIGDTTVQLSWSRGFDNHSPIARYSIEARTLLS-NKWKQMRTNPVNIEGNAE 663 

Qy 618 TSAIKGLKPNAIYLFLVRAANAYGISDPSQISDPVKTQDVLPTSQGVDHKQVQRELGNAV 677 

I: : I I M I I hi h :|| I ::h: II 
Db 664 TAQWNLIPWMDYEFRVLASNILGVGEPSLPSSK IRT KEAAPTVA 708 

Qy 678 LHLHNPTVL SSSSIEVHTV— DQQSQYIQGYKILYRPSGANHGESDWLVFEVR 729 

|: I x : : : ::ll I h II : :| I II I 
Db 709 PSGLGGGGGAPNELIINWTPTLRDYQNGDGFGYILSFRKKGT— -QGWLT-AR 757 

Qy 730 TPAKNSWIPDLRKGVN-— YEIKARPFFNEFQGADSEIKFAKTLEEAPSAPPQGVTVS 785 

I I: : : :|:| : : : :| :| : II I I II 
Db 758 VPHAESLHYVYRNESIGPYTPFEVKIKAYNRKGEGPESLTAIVYSAEEEPKVAPFRVTAK 817 

Qy 786 KNDGNGTAIL-— VSWQPPPEDTQNGMVQEYKV-WCLGNETRYHINKTVDGSTFSW 838 

1:1 llhl : h: h: I h: II 
Db 818 AVLSSEMDVSWEPVEQGDMTGVLLGYEIRYWKDGDKEEAADRVRTAGLVTSAH 870 

Qy 839 IPFLVPGIRYSVEVAASTGAGSGVKSEPQFIQLDAHGNPVSPEDQVS 885 

: I I :l I I I II: I II :: 

Db 871 VTGLNPNTKYEVSVRAYNRAGAG PPSPSTNIT 902 



STANDARD; PRT; 1239 AA.
P20241; Q24414;
01-FEB-1991 (Rel. 17, Created)
01-FEB-1991 (Rel. 17, Last sequence update)
01-OCT-2000 (Rel. 40, Last annotation update)

NEUROGLIAN PRECURSOR.
NRG.

Drosophila melanogaster (Fruit fly).

Eukaryota; Metazoa; Arthropoda; Tracheata; Hexapoda; Insecta;

Pterygota; Neoptera; Endopterygota; Diptera; Brachycera; Muscomorpha;

Ephydroidea; Drosophilidae; Drosophila.

[1]

SEQUENCE FROM N.A., AND SEQUENCE OF 24-41 AND 737-751.

MEDLINE=90030418; PubMed=2805067;

Bieber A.J., Snow P.M., Hortsch M., Patel N.H., Jacobs J.R.,

Traquina Z.R., Schilling J., Goodman C.S.;

"Drosophila neuroglian: a member of the immunoglobulin superfamily

with extensive homology to the vertebrate neural adhesion molecule

L1.";

Cell 59:447-460(1989).

[2]

SEQUENCE OF 1182-1239 FROM N.A.

MEDLINE=90262720; PubMed=1693086;

Hortsch M., Bieber A.J., Patel N.H., Goodman C.S.;
"Differential splicing generates a nervous system-specific form of
Drosophila neuroglian.";
Neuron 4:697-709(1990).

[3]

X-RAY CRYSTALLOGRAPHY (2.0 ANGSTROMS) OF 610-814.

MEDLINE=94213741; PubMed=7512815;

Huber A.H., Wang Y.-M.E., Bieber A.J., Bjorkman P.J.;

"Crystal structure of tandem type III fibronectin domains from

Drosophila neuroglian at 2.0 A.";

Neuron 12:717-731(1994).

FUNCTION: THIS PROTEIN MAY PLAY A ROLE IN NEURAL AND GLIAL CELL
ADHESION IN THE DEVELOPING DROSOPHILA EMBRYO.
SUBCELLULAR LOCATION: TYPE I MEMBRANE PROTEIN.
ALTERNATIVE PRODUCTS: 2 ISOFORMS; A LONG FORM (SHOWN HERE) AND A
SHORT FORM; ARE PRODUCED BY ALTERNATIVE SPLICING.
TISSUE SPECIFICITY: NEURONS AND GLIA IN THE DEVELOPING NERVOUS
SYSTEM AND ON SOME OTHER NONNEURONAL TISSUES.
SIMILARITY: CONTAINS 6 IMMUNOGLOBULIN-LIKE C2-TYPE DOMAINS.
SIMILARITY: CONTAINS 5 FIBRONECTIN TYPE III-LIKE DOMAINS.

This SWISS-PROT entry is copyright. It is produced through a collaboration
between the Swiss Institute of Bioinformatics and the EMBL outstation -
the European Bioinformatics Institute. There are no restrictions on its
use by non-profit institutions as long as its content is in no way
modified and this statement is not removed. Usage by and for commercial
entities requires a license agreement (See http://www.isb-sib.ch/announce/
or send an email to license@isb-sib.ch).

EMBL; M28231; AAA28728.1; ALT_SEQ.
EMBL; X76243; CAA53822.1; -.
PIR; A32579; A32579.
PDB; 1CFB; 30-NOV-94.
FLYBASE; FBgn0002968; Nrg.
INTERPRO; IPR001777; -.
INTERPRO; IPR003006; -.
PFAM; PF00041; fn3; 5.
PFAM; PF00047; ig; 6.

Cell adhesion; Glycoprotein; Transmembrane; Repeat; 3D-structure;
Immunoglobulin domain; Signal; Embryo; Alternative splicing.

FT   DISULFID     59    111       POTENTIAL.
FT   DISULFID    625    706
FT   CARBOHYD    182    182       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    821    821       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD   1125   1125       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CONFLICT   1234   1234       T -> Y (IN REF. 2).
FT   CONFLICT   1237   1237       L -> K (IN REF. 2).
SQ SEQUENCE 1239 AA; 138284 MW; 49E12692D0DD027D CRC64;



Query Match 7.2%; Score 627; DB 1; Length 1239; 

Best Local Similarity 22.0%; Pred. No. 7e-21; 

Matches 281; Conservative 148; Mismatches 427; Indels 420; Gaps 49; 



Qy 



58 GSRLRQEDFPPRIVEHPS ■ -DLIVSKGE PATLNCKAEGRPTPTIEWYKGGERV 108 

II : : Nil : I: :|: : I : |:|:|:| I III:: 

19 GSAESKGNRPPRITKQPAPGELLFKVAQQNKESDNPFIIECEADGQPEPEYSWIKNGKKF 78 

109 ETDKDDPRSHRMLLPSGSLFFLRIVHGRKSRPDEGVYVCVARNYLGEAVSHNASLEVAIL 168 

: 111 I : I I I I I I I I I I:: : I I 

79 DW0AYDNRMLRQ- -PGRGTLVITI — PKDEDRG HYQCFASNEFGT AT SNSVYVRKAEL 132 

169 RDDFRQNPSDVMVAV -GEPAVMECQPPRGHPEPT I SW 204 

: I: : ■' : II III :::| I I I (I : : I 

133 -NAFKDEAAKTLEAVEGEPFMLKCAAPDGFPSPTVNWMIQESIDGSIKSINNSRMTLDPE 191 



Qy 205 KKD 207 



Db 192 GNLWFSNVTREDASSDFYYACSATSVFRSEYKIGNKVLLDVKQMGVSASQNKHPPVRQYV 251 

Qy 208 - GSPLD-— DKD ERIT- -IRGGKLMITYTRKSD 234 

1 : 1 1 II :lll I 1:1 I I 

Db 252 SRRQSLALRGRRMELFCIYGGTPLPQTWSKDGQRIQWSDRITQGHYGKSLVIRQTNFDD 311 

Qy 235 AGKYVCVGTNMVGERESEVAELTVLERPSFVKRPSNLAVTVDDSAEFKCEARGDPVPTVR 294 

111:111: I I I I | | I: |:| I I I I : 

Db 312 AGTYTCDVSNGVGNAQSFSIILNVNSVPYFTKEPEIATAAEDEEWFECRAAGVPEPKIS 371 

Qy 295 WRKDDGELPKSRYEIR--DDHTLKIRKVTAGDMGSYTCVAENMVGKAEASATLTVQ-EP 350 

I : : :| I |:|::| : || |:| I I I :| I || I 

Db 372 WIHNGKPIEQSTPNPRRTVTDNTIRIINLVKGDTGNYGCNATNSLGYVYKDVYLNVQAEP 431 

Qy 351 PHFVVKPRDQVVALGRTVTFQCEATGNPQPAIFWRREGSQNLLFSYQPPQSSSRFSVSQT 410 

I I ' MM: |:|:| : I I : I I : |::| 

Db 432 PIISEAPAAVSTVDGRNVTIKCRVNGSPKPLVKWLR-ASNWL TGGRYNVQAN 482 

Qy 411 GDLTITNVQRSDVGYYICQTLN VAGSIITKAYLEVTDVIADRPPPVIRQGPVNQ 464 

III I :l II I I I I II:: I : :| III 

Db 483 GDLEIQDVTFSDAGKYTCYAQNKFGEIQADGSLWKEHTRIT QEPQNY 530 

Qy 465 TVAVDGTFVLSCVATGSPVPTIL- -WRKDGVLV- -STQDSRIKQLENGVLQIRYAKLGDT 520 

11:1 I I III : I :| :| : : :| |: 

Db 531 EVAAGQSATFRCNEAHDDTLEIEIDWWKDGQSIDFEAQPRFVKTNDNSLTIAKTMEL-DS 589 

Qy 521 GRYTCIASTP5GEATWSAYIEVQEFGVPVQPPRPTDPNLIPSAPSKPEVTDVSRNTVTLS 580 

I Ml: | ;' III | : ||: :|:|| :| : : 

Db 590 GEYTCVARTRLDEAIARANLIVQD VPNAPKLTGIT-CQADKAEIH 633 

Qy 581 WQPNLNSGATPTSYIIEAFSHASGSSWQTVAENV-KTETSAIKGLKPNAIYLFLVRAANA 639 

I: :: : I |: : : :|| I i |::| : : I I I I I I I 

Db 634 WEQQGDNRSPILHYTIQFNTSFTPASWDAAYEKVPNTDSSFWQMSPWANYTFRVIAFNK 693 

Qy 640 YGISDPSQISDPVKTQDVLP TSQGVD HKQV" 669 

I I II II II :| II : II 

Db 694 IGASPPSAHSDSCTTQPDVPFKNPDNWGQGTEPNNLVISWTPMPEIEHNAPNFHYYVSW 753 

Qy 670 QRELGNAVLHLHN 682 

:|:: I /:| 

Db 754 KRDIPAAAWENNNIFDWRQNNIVIADQPTFVKYLIKWAINDRGESNVAAEEWGYSGED 813 



683 PT- ■ -- - -VLSSSSIEVHWT-VDQQS- -QYIQGYKILYRPSGANHGESDWLVFEV 728 

II : : I: : || I ::| : :|||| : III I 

814 RPLDAPTNFTMRQITSSTSGYMAWTPVSEESVRGHFKGYKI - -QTWTENEGEEGLREIHV 871 

729 RTPAKNSWI - - -PDLRKGVNYEIKARPFFNEFQGADSEIKFAKTLEEAPSAPPQGVTVS 785 

: |::|' II : II : : I I I : Mill lh 

872 KGDTHNALVTQFKPDSK- - - NY - ARI LAYNGRFNG PPSAVIDFDT PEGVPS - PVQGL- - - 923 



786 KNDG- • -NGTAILVSWQPP- -PEDTQNGMVQEYKVW CLGNETRY- -HINKTVD 831 

I ;l :: I: II II : II:: :| Ml i 

924 -DAYPLGSSAFMLHWKKPLYP-— NGKLTGYKIYYEEVKESYVGERREYDPHI— TD 974 

832 GSTFSWIPFLVPGIRYSVEVAASTGAGSG 861 

: : >l I :l : : 1:1 II 
Db 975 PRVTRMKMAGLKPNSKYRISITATTKMGEGSEHYIEKTTLKDAVNVAPATPSFSWEQLPS 1034

Qy 862 VKSEPQFIQ 870

: I |:|: 

Db 1035 DNGLAKFRINWLPSTEGHPGTHFFTMHRIRGETQWIRENEEKNSDYQEVGGLDPETAYEF 1094 

Qy 871 — LDAHGNPVSPEDQVSLAQQISDWKQPAFIAG - - IGAACWI I LMVFSI 916

:| I I I :: :: |: | :| : | | | ::: | 

Db 1095 RWSVDGHFNTESATQEID TNTVEGPIMVANETVANAGWFIGMMLALAFII ILFI 1149 

Qy 917 WLYRHRKKRNGLTSTY AGIRKVPS FTFTPTVTYQRGGEAVSSGGRPGLLN 966

: I: I I : II! I I ::lll :ll: : 

Db 1150 IICHRRNRGGKYDVHDREIMGRRDYPEEGGFHEYSQPLDNKSAGRQSVSSANKPGVES 1209 



Qy 967 ISEPAAQPWLADTWPN 982

:: I: II I 
Db 1210 DTDSMAEYGDGDTGMN 1225 



Search completed: January 22, 2001, 12:29:42 
Job time: 1283 sec 
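
Each search header in this listing reports throughput as "Million cell updates/sec". The listing does not define the term; the sketch below assumes that one cell update is one cell of the Smith-Waterman matrix, that is, one query residue compared against one database residue. Under that assumption it reproduces the figure reported for the SPTREMBL run of US-09-540-245A-18, a 1651-residue query searched against 117207915 database residues in 559.88 seconds. The function name is hypothetical.

```python
def million_cell_updates_per_sec(query_len: int, db_residues: int,
                                 seconds: float) -> float:
    """Dynamic-programming cells filled per second, in millions.

    Assumption: cells = query length x total database residues.
    """
    return query_len * db_residues / seconds / 1e6


rate = million_cell_updates_per_sec(1651, 117207915, 559.88)
print(f"{rate:.3f} Million cell updates/sec")  # 345.628 Million cell updates/sec
```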

GenCore version 4.5
Copyright (c) 1993 - 2000 Compugen Ltd.

OM protein - protein search, using sw model

Run on: January 22, 2001, 12:53:03 ; Search time 559.88 Seconds
(without alignments)
345.628 Million cell updates/sec

Title: US-09-540-245A-18
Perfect score: 8724
Sequence: 1 MKWKHVPFLVMISLLSLSPN VLGGYERGEDNNEELEETES 1651

Scoring table: BLOSUM62
Gapop 10.0 , Gapext 0.5

Searched: 374700 seqs, 117207915 residues

Total number of hits satisfying chosen parameters: 374700

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
Maximum Match 100%
Listing first 45 summaries

Database : SPTREMBL_15:*
1: sp_archea:*
2: sp_bacteria:*
3: sp_fungi:*
4: sp_human:*
5: sp_invertebrate:*
6: sp_mammal:*
7: sp_mhc:*
8: sp_organelle:*
9: sp_phage:*
10: sp_plant:*
11: sp_rodent:*
12: sp_virus:*
13: sp_vertebrate:*
14: sp_unclassified:*

Pred. No. is the number of results predicted by chance to have a
score greater than or equal to the score of the result being printed,
and is derived by analysis of the total score distribution.

Result            Query
   No.    Score   Match  Length  DB  ID       Description
     1     8724   100.0    1651   4  Q9Y6N7   Q9y6n7 homo sapien
     2     8315    95.3    1651  11  O55005   O55005 rattus norv
     3     8120    93.1    1612  11  O89026   O89026 mus musculu
     4     3104    35.6    1060  11  Q9QZI3   Q9qzi3 rattus norv
     5   2607.5    29.9    1344  11  Q9Z2I4   Q9z2i4 mus musculu
     6     1594    18.3    1395   5  Q9W213   Q9w213 drosophila
     7     1592    18.2    1395   5  O44924   O44924 drosophila
     8     1515    17.4    1273   5  O44928   O44928 caenorhabdi
     9   1307.5    15.0     859   5  Q9VPZ6   Q9vpz6 drosophila
    10     1212    13.9     823   5  Q9VQ10   Q9vq10 drosophila
    11      873    10.0     285   4  O43608   O43608 homo sapien
    12      852     9.8     423   5  P91572   P91572 caenorhabdi
    13      760     8.7    1493  11  P97798   P97798 mus musculu
    14      758     8.7    1259  11  Q9QY38   Q9qy38 mus musculu
    15    757.5     8.7    1443  13  Q90610   Q90610 gallus gall
    16      756     8.7    1822   4  Q9ULT7   Q9ult7 homo sapien
    17      748     8.6    1248   6  Q9XT41   Q9xt41 cercopithec
    18    746.5     8.6    1377  11  P97603   P97603 rattus norv
    19    726.5     8.3    1427  13  Q91562   Q91562 xenopus lae
    20    725.5     8.3    1461   4  Q92859   Q92859 homo sapien
    21    725.5     8.3    1461   4  O00340   O00340 homo sapien
    22    718.5     8.2    1100   4  O94779   O94779 homo sapien
    23      713     8.2    1026   4  O94780   O94780 homo sapien
    24    707.5     8.1    1308   4  Q9UHI4   Q9uhi4 homo sapien
    25    706.5     8.1    1026  11  Q62845   Q62845 rattus norv
    26        ?       ?       ?   ?  O15051   O15051 homo sapien
    27    702.5     8.1    1028  11  Q62682   Q62682 rattus norv
    28      702     8.0    1236   4  Q9UHI3   Q9uhi3 homo sapien
    29    700.5     8.0    1299   4  Q92823   Q92823 homo sapien
    30    696.5     8.0    2221   5  Q9U1M1   Q9u1m1 drosophila
    31        ?     8.0    1028  11  Q07409   Q07409 mus musculu
    32    689.5     7.9    1099  11  P97527   P97527 rattus norv
    33    689.5     7.9    1166  11  Q9QVN4   Q9qvn4 rattus sp.
    34        ?       ?       ?   ?  Q9NBA1   Q9nba1 drosophila
    35    684.5     7.8     874   5  O01632   O01632 caenorhabdi
    36    684.5     7.8    2016   5  Q9V4J9   Q9v4j9 drosophila
    37        ?       ?       ?   ?  Q63155   Q63155 rattus norv
    38      681     7.8    1299   4  O15179   O15179 homo sapien
    39    679.5     7.8    1215  11  P97686   P97686 rattus norv
    40    678.5     7.8    2222   5  O97394   O97394 drosophila
    41      674     7.7    1154  11  Q9QVN3   Q9qvn3 rattus sp.
    42      666     7.6    1028  11  P97528   P97528 rattus norv
    43    664.5     7.6    1028   4  Q9UQ52   Q9uq52 homo sapien
    44    657.5     7.5    1224   4  O00533   O00533 homo sapien
    45      653     7.5    1028  11  Q9JMB8   Q9jmb8 mus musculu



RESULT 1
Q9Y6N7

ID Q9Y6N7 PRELIMINARY; PRT; 1651 AA.

AC Q9Y6N7;

DT 01-NOV-1999 (TrEMBLrel. 12, Created)

DT 01-NOV-1999 (TrEMBLrel. 12, Last sequence update)

DT 01-OCT-2000 (TrEMBLrel. 15, Last annotation update)

DE ROUNDABOUT 1.

GN ROBO1.

OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.

OX NCBI_TaxID=9606;



RN [1] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=98117249; PubMed=9458045;

RA Kidd T., Brose K., Mitchell K.J., Fetter R.D., Tessier-Lavigne M.,

RA Goodman C.S., Tear G.;

RT "Roundabout controls axon crossing of the CNS midline and defines a

RT novel subfamily of evolutionarily conserved guidance receptors.";

RL Cell 92:205-215(1998).

DR EMBL; AF040990; AAC39575.1; -.

DR HSSP; P56276; 1TLK.

DR INTERPRO; IPR001777; -.

DR INTERPRO; IPR003006; -.

DR PFAM; PF00041; fn3; 3.

DR PFAM; PF00047; ig; 5.

SQ SEQUENCE 1651 AA; 180928 MW; 9D98CD7CAB73074D CRC64;



Query Match 100.0%; Score 8724; DB 4; Length 1651;

Best Local Similarity 100.0%; Pred. No. 0;

Matches 1651; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy 1 MKWKHVPFLVMISLLSISPNHLFLAQLIPDPEDVERGNDHGTPIPTSDNDDNSLGYTGSR 60 

IIIIMMIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIINIII 
Db 1 MKWKHVPFLVMISLLSLSPNHLFLAQLIPDPEDVERGNDHGTPIPTSDNDDNSLGYTGSR 60 

Qy 61 LRQEDFPPRIVEHPSDLIVSKGEPATLNCKAEGRPTPTIEWYKGGERVETDKDDPRSHRM 120 

Db 61 LRQEDFPPRIVEHPSDLIVSKGEPATLNCKAEGRPTPTIEWYKGGERVETDKDDPRSHRM 120 
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Qy 

Db 

Qy 

Db 

Qy 
Db 
Qy 

Db 

Qy 
Db 

I 



121 LLPSGSLFFLRIVHGRKSRPDEGVYVCVARNYLGEAVSHNASLEVAILRDDFRQNPSDVM 180 
121 LLPSGSLFFLRIVHGRKSRPDEGVYVCVARNYLGEAVSHNASLEVAILRDDFRQNPSDVM 180 
181 VAVGEPAVMECQPPRGHPEPTISWKKDGSPLDDKDERITIRGGRLMITYTRKSDAGKYVC 240 
181 VAVGEPAVMECQPPRGHPEPTISWKKDGSPLDDRDERITIRGGRLMITYTRRSDAGKYVC 240 
241 VGTNMVGERESEVAELTVLERPSFVKRPSNLAVTVDDSAEFKCEARGDPVPTVRWRKDDG 300 
241 VGTNMVGERESEVAELTVLERPSFVKRPSNLAVTVDDSAEFRCEARGDPVPTVRWRKDDG 300 
301 EL^KSRYEIRDDHTLKIRKVTAGDMGSYTCVAENMVGKAEASATLTVQEPPHFWKPRDQ 360 



301 ELjKSRYEIRDDHTLKIRKVTAGDMGSYTCVAENMVGRAEASATLTVQEPPHFVVKPRDQ 360 

361 W|LGRTVTFQCEATGNPQPAIFWRREGSQNLLFSYQPPQSSSRFSVSQTGDLTITNVQR 420 

361 WA>GRTVTFQCEATGt}PQPAIFWRREGSQNLLFSYQPPQSSSRFSVSQTGDLTITNVQR 420 

421 SDVGYYICQTLNVAGSIITKAYLEVTDVIADRPPPVIRQGPVNQTVAVDGTFVLSCVMG 480 



421 SDVGYYICQTLNVAGSIITKAYLEVTDVIADRPPPVIRQGPVNQTVAVDGTFVLSCVATG 480 
. t 

481 SPyPTILWRKDGVLVSTQDSRIKOLENGVLQIRYAKLGDTGRYTCIASTPSGEATWSAYI 540 



481 SPyPTILWRKDGVLVSTQDSRIKQLENGVLQIRYAKLGDTGRYTCIASTPSGEATWSAYI 540 

! 

541 EVQEFGVPVQPPRPTDPNLIPSAPSKPEVTDVSRNTVTLSWQPNLNSGATPTSYIIEAFS 600 

541 EybEFGVPVQPPRPTDPNLIPSAPSKPEVTDVSRNTVTLSWQPNLNSGATPTSYIIEAFS 600 
i 

601 HASGSSWQTVAENVKTETSAIKGLKPNAIYLFLVRAANAYGISDPSQISDPVKTQDVLPT 660 

601 HASG^WQ^VAENTO 660 
i 

661 SQGVDHKQVQRELGNAVLHLHNPTVLSSSSIEVHWTVDQQSQYIQGYKILYRPSGANHGE 720 

661 SQpVDHKQVQRELGNAVLHLHNPTVLSSSSIEVHWTVDQQSQYIQGYKILYRPSGANHGE 720 

721 SDWLVFEVRTPAKNSWIPDLRKGVNYEIKARPFFNEFQGADSEIRFARILEEAPSAPPQ 780 

721 SDWLVFEVRTPAKNSVVIPDLRRGVNYEIRARPFFNEFQGADSEIRFARTLEEAPSAPPQ 780 

781 GVTVSKNDGNGTAILVSWQPPPEDTQNGMVQEYKVWCLGNETRYHINKTVDGSTFSWIP 840 

841 FLVPGIRYSVEVAASTGAGSGVKSEPOFIOLDAHGNPVSPEDQVSLAQQISDWKQPAFI 900 
841 FLVPGIRYSVEVAASTGAGSGVKSEPQFIQLDAHGNPVSPEDQVSLAQQISDWKQPAFI 900 
901 AGIGAACWIILMVFSIWLYRHRKKRNGLISTYAGIRKVPSFTFIPTVTYQRGGEAVSSGG 960 
901 AGIGAACWIILMVFSIWLYRHRKKRNGLTSTYAGIRKVPSFTFTPTVTYQRGGEAVSSGG 960 
961 RPGLLNISEPAAQPWLADTWPNTGNNHNDCSISCCTAGNGNSDSNLTTYSRPADCIANYN 1020 
961 RPGLLNISEPAAQPWLADTWPNTGNNHNDCSISCCTAGNGNSDSNLTTYSRPADCIANYN 1020 
\021 NQLDNKQTNLMLPESTVYGDVDLSNKINEMKTFNSPNLKDGRFVNPSGQPTPYATTQLIQ 1080 
1021 NQLDNKQTNLMLfESTVYGDVDLSNKINEMKTFNSPNLKDGRFVNPSGQPTPYAITQLIQ 1080 
1081 SNLSNNMNNGSGDSGEKHWKPLGQOKQEVAPVQYNIVEQNKLNKDYRANDTVPPTIPYNQ 1140 
1081 SNLSNNMNNGSGDSGEKHWKPLGQQKQEVAPVQYNIVEQNKLNKDYRANDTVPPTIPYNQ 1140 
1141 SYDQNTGGSYNSSDRGSSTSGSQGHKRGARTPRVPRQGGMNWADLLPPPPAHPPPHSNSE 1200 
1141 SYDQNTGGSYNSSDRGSSTSGSQGHKKGARTPKVPKQGGMNWADLLPPPPAHPPPHSNSE 1200 
1201 EYNISVDESYDQEMPCPVPPARMYLQODELEEEEDERGPTPPVRGAASSPAAVSYSHQST 1260 



Db 1201 EYNISVDESYDQEMPCPVPPARMYLQQDELEEEEDERGPTPPVRGAASSPAAVSYSHQST 1260 

Qy 1261 ATLTPSPQEELQPMLQDCPEETGHMQHQPDRRRQPVSPPPPPRPISPPHTYGYISGPLVS 1320 

Db 1261 ATLTPSPQEELQPMLQDCPEETGHMQHQPDRRRQPVSPPPPPRPISPPHTYGYISGPLVS 1320 

Qy 1321 DMDTDAPEEEEDEADMEVAKMQTRRLLLRGLEQTPASSVGDLESSVTGSMINGWGSASEE 1380 

Db 1321 DMDTDAPEEEEDEADMEVAKMQTRRLLLRGLEQTPASSVGDLESSVTGSMIHGWGSASEE 1380 

Qy 1381 DNISSGRSSVSSSDGSFFTDADFAQAVAAAAEYAGLRVARRQMQDAAGRRHFHASQCPRP 1440 

Db 1381 DNISSGRSSVSSSDGSFFTDADFAQAVAAAAEYAGLKVARRQMQDAAGRRHFHASQCPRP 1440 

Qy 1441 TSPVSTDSNMSAAVMQKTRPARRLRHQPGHLRRETYTDDLPPPPVPPPAIRSPTAQSRTQ 1500 

Db 1441 TSPVSTDSNMSAAVMQKTRPAKKLKHQPGHLRRETYTDDLPPPPVPPPAIKSPTAQSKTQ 1500 

Qy 1501 LEVRPVWPRLPSMDARTDRSSDRRGSSYKGREVLDGRQWDMRTNPGDPREAQEQQNDG 1560 

Db 1501 LEVRPVWPRLPSMDARTDRSSDRRGSSYKGREVLDGRQWDMRTNPGDPREAQEQQNDG 1560 

Qy 1561 KGRGNKAAKRDLPPAKTHLIQEDILPYCRPTFPTSNNPRDPSSSSSMSSRGSGSRQREQA 1620 

Db 1561 KGRGNKAARRDLPPARTHLIQEDILPYCRPTFPTSNNPRDPSSSSSMSSRGSGSRQREQA 1620 

Qy 1621 NVGRRNIAEMQVLGGYERGEDNNEELEETES 1651 

Db 1621 NVGRRNIAEMQVLGGYERGEDNNEELEETES 1651 



RESULT 2 
O55005 

ID O55005 PRELIMINARY; PRT; 1651 AA. 

AC O55005; 

DT 01-JUN-1998 (TrEMBLrel. 06, Created) 

DT 01-JUN-1998 (TrEMBLrel. 06, Last sequence update) 

DT 01-OCT-2000 (TrEMBLrel. 15, Last annotation update) 

DE TRANSMEMBRANE RECEPTOR ROBO1. 

OS Rattus norvegicus (Rat). 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Rattus. 

OX NCBI_TaxID=10116; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC TISSUE=SPINAL CORD; 

RX MEDLINE=98117249; PubMed=9458045; 

RA Kidd T., Brose K., Mitchell K.J., Fetter R.D., Tessier-Lavigne M., 

RA Goodman C.S., Tear G.; 

RT "Roundabout controls axon crossing of the CNS midline and defines a 

RT novel subfamily of evolutionarily conserved guidance receptors."; 

RL Cell 92:205-215(1998). 

DR EMBL; AF041082; AAC39960.1; -. 

DR HSSP; P56276; 1TLK. 

DR INTERPRO; IPR001777; -. 

DR INTERPRO; IPR003006; -. 

DR PFAM; PF00041; fn3; 3. 

DR PFAM; PF00047; ig; 5. 

KW Transmembrane. 

SQ SEQUENCE 1651 AA; 180746 MW; FA2452DD46E186B7 CRC64; 



Query Match 95.3%; Score 8315; DB 11; Length 1651; 

Best Local Similarity 94.5%; Pred. No. 0; 

Matches 1560; Conservative 42; Mismatches 49; Indels 0; Gaps 

Qy 1 MKWRHVPFLVHISLLSLSPNHLFLAQLIPDPEDVERGNDHGTPIPTSDNDDNSLGYTGSR 60 

Db 1 MKWRHLPLLVMISLLTLSKKHLLLAQLIPDPEDLERGNDNGTPAPTSDNDDNSLGYTGSR 60 

Qy 61 LRQEDFPPRIVEHPSDLIVSKGEPATLNCKAEGRPTPTIEWYKGGERVETDKDDPRSHRM 120 

Db 61 LRQEDFPPRIVEHPSDLIVSRGEPATLNCRAEGRPTPTIEWYRGGERVETDRDDPRSHRM 120 

Qy 121 LLPSGSLFFLRIVHGRRSRPDEGVYVCVARNYLGEAVSHNASLEVAILRDDFRQNPSDVM 180 

Db 121 LLPSGSLFFLRIVHGRKSRPDEGVYICVARNYLGEAVSHNASLEVAILRDDFRQNPSDVM 180 

Qy 181 VAVGEPAVMECQPPRGHPEPTISWKRDGSPLDDKDERITIRGGRLMITYTRKSDAGRYVC 240 

Db 181 VAVGEPAVMECQPPRGHPEPTISWKRDGSPLDDKDERITIRGGRLMITYTRRSDAGKYVC 240 


Qy 241 VGTNMVGERESEVAELTVLERPSFVKRPSNLAVTVDDSAEFKCEARGDPVPTVRWRKDDG 300 

Db 241 VGTNMVGERESKVADVTVLERPSFVKRPSNLAVTVDDSAEFKCEARGDPVPTFGWRRDDG 300 


Qy 301 ELPRSRYEIRDDHTLRIRKVTAGDMGSYTCVAENMVGRAEASATLTVQEPPHFVVKPRDQ 360 

Db 301 ELPKSRYEIRDEfflTLRIRROTAGDMGSYTCVAENMVGRAEASATLTVQEPPHFVVKPRDQ 360 

Qy 361 WALGRTVTFQCEATGNPQPAIFWRREGSQNLLFSYQPPQSSSRFSVSQTGDLTITNVQR 420 

Db 361 WALGRTVTFQCEATGNPQPAIFWBREGSQNLLFSYQPPQSSSRFSVSQTGDLTVTNVQR 420 

Qy 421 SDVGYYICQTLNVAGSIITKAYLEVTDVIADRPPPVIRQGPVNQTVAVDGTFVLSCVATG 480 

Db 421 SDVGYYICQTLNVAGSIITKAYLEVTDYIADRPPPYIRQGPVNQTVAVDGTLTLSCVATG 480 

Qy 481 SPVPTILWRRDGVLVSTQDSRIKQLENGVLQIRYAKLGDTGRYTCIASTPSGEATWSAYI 540 

Db 481 SPVPTILWRKDGVLVSTQDSRIKQLESGVLQIRYAKLGDTGRYTCTASTPSGEATWSAYI 540 


Qy 541 EVQEFGVPVQPPRPTDPNLIPSAPSKPEVTDVSRNTVTLSWQPNLNSGATPTSYIIEAFS 600 

Db 541 EVQEFGVPVQPPRPTDPNLIPSAPSRPEVTDVSRNTVTLLWQPNLNSGATPTSYIIEAFS 600 

Qy 601 HASGSSWQTVAENVKTETSAIKGLKPNAIYLFLVRAANAYGISDPSQISDPVKTQDVLPT 660 

Db 601 HASGSSWQTVAENVKTETFAIKGLKPNAIYLFLVRAANAYGISDPSQISDPVKTQDVPPT 660 

Qy 661 SQGVDHKQVQRELGNAVLHLHNPTVLSSSSIEVHWTVDQQSQYIQGYKILYRPSGANHGE 720 

Db 661 TQGVDHKQVQRELGNWLHLHNPTILSSSSVEVHWTVDQQSQYIQGYKILYRPSGASHGE 720 

Qy 721 SDWLVFEVRTPAKNSWIPDLRKGVNYEIKARPFFNEFQGADSEIRFARILEEAPSAPPQ 780 

Db 721 SEWLVFEVRIPTKNSWIPDLRKGVNYEIKARPFFNEFQGADSEIKFMTLEERPSAPPR 780 


Qy 781 GVTVSRNDGNGTAILVSWQPPPEDTQNGMVQEYRVWCLGNETRYHINKTVDGSTFSVVIP 840 

Db 781 SVTVSRNDGNGTAILVTWQPPPEDTQNGMVQEYKVWCLGNETRYHINKTVDGSTFSWIP 840 

Qy 841 FLVPGIRYSVEVAASTGAGSGVKSEPQFIQLDAHGNPVSPEDQVSLAQQISDVVRQPAFI 900 

Db 841 FLVPGIRYSVEVAASTGAGPGVRSEPQFIQLDSHGNPVSPEDQVSLAQQISDWRQPAFI 900 


Qy 901 AGIGAACWIILMVFSIWLYRHRKKRNGLISTYAGIRKVPSFTFIPTVTYQRGGEAVSSGG 960 

Db 901 AGIGAACWIILMVFSIWLYRHRKRRNGLSSTYAGIRKVPSFTFTPTVTYQRGGEAVSSGG 960 

Qy 961 RPGLLNISEPAAQPWLADTWPNTGNNHNDCSISCCTAGNGNSDSNLTTYSRPADCIANYN 1020 

Db 961 RPGLLNISEPATQPWLADIWPNTGNSHNDCSINCCTASNGNSDSNLTTYSRPADCIANYN 1020 

Qy 1021 NQLDNKQTNLMLPESTVYGDVDLSNKINEMKTFNSPNLKDGRFVNPSGQPTPYATTQLIQ 1080 

Db 1021 NQLDNROINLMLPESTVYGDVDLSNRINEMRTFNSPNLRDGRFVNPSGQPTPYATTQLIQ 1080 


Qy 1081 SNLSNNMNNGSGDSGEKHWKPLGQQRQEVAPVQYNIVEQNKLNKDYRANDTVPPTIPYNQ 1140 

Db 1081 ANLINNMNNGGGDSSEKHWKPPGQQKQEVAPIQYNIMEQNKLNKDYRANDTILPTIPYNH 1140 

Qy 1141 SYDQNTGGSYNSSDRGSSTSGSQGHKKGARTPRVPRQGGMNWADLLPPPPAHPPPHSNSE 1200 

Db 1141 SYDQNTGGSYNSSDRGSSTSGSQGHRRGARTPRAPRQGGMNWADLLPPPPAHPPPHSNSE 1200 

Qy 1201 EYNISVDESYDQEMPCPVPPARMYLQQDELEEEEDERGPIPPVRGAASSPAAVSYSHQST 1260 

Db 1201 EYSMSVDESYDQEMPCPVPPARMYLQQDELEEEEAERGPTPPVRGAASSPAAVSYSHQST 1260 

Qy 1261 ATLTPSPQEELQPMLQDCPEETGHMQHQPDRRRQPVSPPPPPRPISPPHTYGYISGPLVS 1320 

Db 1261 ATLTPSPQEELQPMLQDCPEDLGHMPHPPDRRRQPVSPPPPPRPISPPHTYGYISGPLVS 1320 


Qy 1321 DMDTDAPEEEEDEADMEVAKMQTRRLLLRGLEQTPASSVGDLESSVTGSMINGWGSASEE 1380 

Db 1321 DMDTDAPEEEEDEADMEVARMQTRRLLLRGLEQTPASSVGDLESSVTGSMINGWGSASEE 1380 

Qy 1381 DNISSGRSSVSSSDGSFFTDADFAQAVAAAAEYAGLRVARRQMQDAAGRRHFHASQCPRP 1440 

Db 1381 DNISSGRSSVSSSDGSFFTDADFAQAVAAAAEYAGLRVARRQMQDAAGRRHFHASQCPRP 1440 


Qy 1441 TSPVSTDSNMSAAVMQRTRPARKLRHQPGHLRRETYTDDLPPPPVPPPAIRSPTAQSRTQ 1500 

Db 1441 TSPVSTDSNMSAAVIQRARPTRRQRHQPGHLRREAYTDDLPPPPVPPPAIKSPSVQSRAQ 1500 

Qy 1501 LEVRPWVPKLPSMDARTDRSSDRRGSSYRGREVLDGRQWDMRTNPGDPREAQEQQNDG 1560 

Db 1501 LEARPIMGPKLASIEARADRSSDRRGGSYRGREALDGRQVTDLRTSPGDPREAQEQPNEG 1560 

Qy 1561 RGRGNKAARRDLPPARTHLIQEDILPYCRPTFPTSNNPRDPSSSSSMSSRGSGSRQREQA 1620 

Db 1561 KARGTKTARRDLPPARTHLIPEDILPYCRPTFPTSNNPRDPSSSSSMSSRGSGSRQREQA 1620 

Qy 1621 NVGRRNIAEMQVLGGYERGEDNNEELEETES 1651 

Db 1621 NVGRRNMAEMQVLGGFERGDENNEELEETES 1651 





RESULT 3 
O89026 

ID O89026 PRELIMINARY; PRT; 1612 AA. 

AC O89026; 

DT 01-NOV-1998 (TrEMBLrel. 08, Created) 

DT 01-NOV-1998 (TrEMBLrel. 08, Last sequence update) 

DT 01-OCT-2000 (TrEMBLrel. 15, Last annotation update) 

DE DUTT1 PROTEIN. 

GN ROBO1 OR DUTT1. 

OS Mus musculus (Mouse). 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus. 

OX NCBI_TaxID=10090; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC TISSUE=BRAIN; 

RA Wu M.C., Lowe N., Fordham R., Rabbitts P.; 

RT "The mouse homologue of human DUTT1/H-robo1 gene: protein sequence and 

RT chromosomal location."; 

RL Submitted (JUL-1998) to the EMBL/GenBank/DDBJ databases. 

DR EMBL; Y17793; CAA76850.1; -. 

DR HSSP; P56276; 1TLK. 

DR MGD; MGI:1274781; Robo1. 

DR INTERPRO; IPR001777; -. 

DR INTERPRO; IPR003006; -. 

DR PFAM; PF00041; fn3; 3. 

DR PFAM; PF00047; ig; 5. 

SQ SEQUENCE 1612 AA; 176406 MW; 5F2988C544796B4B CRC64; 



Query Match 93.1%; Score 8120; DB 11; Length 1612; 

Best Local Similarity 95.6%; Pred. No. 0; 

Matches 1525; Conservative 33; Mismatches 37; Indels 0; Gaps 

Qy 57 TGSRLRQEDFPPRIVEHPSDLIVSRGEPATLNCRAEGRPTPTIEWYRGGERVETDKDDPR 116 

:i i ii immi inn i urn iii iii imi i ii m iii m mm imm 

Db 18 SGSRLRQEDFPPRIVEHPSDLIVSRGEPATLNCKAEGRPTPTIEWYRGGERVETDRDDPR 77 
Qy 117 SHRMLLPSGSLFFLRIVHGRRSRPDEGVYVCVARNYLGEAVSHNASLEVAILRDDFRQNP 176 
Db 


78 


Qy 


177 


Db 


138 


Qy 


237 


Db 


198 


Qy 


297 


Db 


258 


Qy 


357 


Db 


318 


Qy 


417 


1 


378 


1 


477 


Db 


438 


Qy 


'537 


Db 


498 


Qy 


597 


Db 


558 


Qy 


657 


Db 


618 


Qy 


717 


Db 


678 


Qy 


777 


Db 


738 


Qy 


837 


1 


798 


1 


897 


Db 


858 


Qy 


957 


Db 


918 


Qy 


1017 


Db 


978 


Qy 


1077 


Db 


1038 


Qy 


1137 


Db 


1098 


Qy 


1197 



iimiiiiiiiiiiiiiiimimihiiiiiiiiiiiiiiiiiimiiMiiiii 



IIIIIIIIIIIIIMilllMIIIIIIIMMIIINIIIIIIIIIIIIIIIIIIIIIII 



immimmmimmmmmmimmimmmimmiimimmimmiim 

KrVCVGTNMVGERESEVAELTVLERPSFVRRPSNLAVTVDDSAEFKCEARGDPVPTVRWR 257 



lllilllMIMNMIIMIIIIIIIIIIIIIIIINMIIIIIIMIIIIIIIIIIII 



I M M I M 1 1 1 1 M M 1 M M 1 1 1 M 1 1 1 II M 1 1 1 1 M I ! 1 1 1 1 1 II 1 M 1 1 1 M I M I 



iiiimimiiiiiiiiiiiiiimiiiiimiiimimiiiiiiii! mi 



hum mimmmmmmmmimmimmimuiimm niiiiiin 

VATGSPAPTILWRKDGVIVSTQDSRIKQLESGVLQIRYAKLGDTGRYTCTASTPSGEATW 497 
I 

SAYIEVQEFGVPVQPPRPTDPNLIPSAPSKPEVTDVSRNTYTLSWQPNLNSGATPTSYII 596 

IIMIIIIIIIIHIIIIIIMIIIIIIIII!IMI|:|||'IIIIIMIIIIIIII!III 



iMiiiMiiiii mini mimmmiimmiimmmmimmimm 



i iiiiiiiiiiiiiiiii miiiiimimimmmiimmimmiiiiimi 



MIMIMI MMiMMIMMMIIMMMMMIMMMMIMM 

jPDLRRGVNYEIKARPFFNEFQGADSEIRFARTLEEAPS 737 

tQPPPEDTQNGMVQEYKVWCLGNETRYHINKTVDGSTFS 836 

III: MMIMMMMM: lllll!l!llllll!llllllll!:MIII!llll!l! 

?QPPPEDTQNGMVQEYKVWCLGNETKYHINKTVDGSTFS 797 



III! llllllllllllllllll MMMIMMMMMMMIMIMMMMM 



IIIIIMIIIIIIIMIIIIIIIIIIIIIIIIIIIIIIIIIMIIIIIIIIIIIIIIIII 



IIIIIIIIIIIMII lllllllllllllllllllhlllllllllllllllllllllll 



IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIMIIIIIIIIIIIIIIIIIIIIIM 



iimmmmmmmii linn mi miimimmmimmmmmmim 



iiiiiiiiiiiiiiiiiniinniiiiinnin iiiniiiiiniiiiinni 



mmimmimiimimiiiim nniiimiiiiinininiiiiiiinii 



Db 1158 SNSEEYNMSVPESYDQEMPCPVPPAPMYLQQDELQEEEDERGPTPPVRGAASSPAAVSYS 1217 



Miiiiiuiiiiiiiiiiiiiii: in i niniiiinnniiiiiiiniiii 



1 M 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 11 11 1 1 1 1 II M M 1 1 1 1 1 1 1 1 M 1 1 M 1 1 1 1 1 11 1 1 1 1 1 



llllllllllllllllllllllllllllllllllllllllllllllllllllllllllll 



iiinniiiniin mi inn iiiinim i iiiiiiiiiiiiiiiii i 



II lllllllHlll ::IIIMIIIII llllll MM MM MIMIM 



i ii i ii i :•' i in i ii i ii i n ii mi iiiiiiiiiiiiiiiii iMM mi 



Db 


1158 


Qy 


1257 


Db 


1218 


Qy 


1317 


Db 


1278 


Qy 


1377 


Db 


1338 


Qy 


1437 


Db 


1398 


Qy 


1497 


Db 


1458 


Qy 


1557 


Db 


1518 


Qy 


1617 


Db 


1578 



'! I ' 1 1 1 1 1 M : M ! I ! I ', 1 : 1 1 1 : : 1 1 I 



RESULT 4 
Q9QZI3 

ID Q9QZI3 PRELIMINARY; PRT; 1060 AA. 
AC Q9QZI3; 
DT 01-MAY-2000 (TrEMBLrel. 13, Created) 
DT 01-MAY-2000 (TrEMBLrel. 13, Last sequence update) 
DT 01-OCT-2000 (TrEMBLrel. 15, Last annotation update) 
DE REPULSIVE GUIDANCE RECEPTOR (FRAGMENT). 
OS Rattus norvegicus (Rat). 
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Rattus. 
OX NCBI_TaxID=10116; 
RN [1] 
RP SEQUENCE FROM N.A. 
RX MEDLINE=99200391; PubMed=10102268; 
RA Brose K., Bland K.S., Wang K.H., Arnott D., Henzel W., Goodman C.S., 
RA Tessier-Lavigne M., Kidd T.; 
RT "Slit proteins bind Robo receptors and have an evolutionarily 
RT conserved role in repulsive axon guidance."; 
RL Cell 96:795-806(1999). 
DR EMBL; AF182037; AAF04558.1; -. 
DR HSSP; P56276; 1TLK. 
DR INTERPRO; IPR001547; -. 
DR INTERPRO; IPR001777; -. 
DR INTERPRO; IPR003006; -. 
DR PFAM; PF00041; fn3; 3. 
DR PFAM; PF00047; ig; 5. 
DR PRINTS; PR00014; FNTYPEIII. 
DR PROSITE; PS00659; GLYCOSYL_HYDROL_F5; UNKNOWN_1. 
KW Receptor. 
FT NON_TER 1060 1060 
SQ SEQUENCE 1060 AA; 116790 MW; C4BC8C11E8542DA4 CRC64; 



Query Match 35.6%; Score 3104; DB 11; Length 1060; 

Best Local Similarity 56.6%; Pred. No. 2.4e-199; 

Matches 613; Conservative 150; Mismatches 252; Indels 68; Gaps 

Qy 58 GSRLRQEDFPPRIVEHPSDLIVSKGEPATLNCKAEGRPTPTI EWYKGGERV 108 

I lllllllll: II lh:|lll|:| I I I MM III I I 
Db 21 GXRLRQEDFPPRXVEQPSEVIVSRGRPNTPNWKQRGRPFPTIGKVQRMVKPGWDK 75 
Qy 109 ETDRDDPR-SHRMLLPSGSLFFLRIVHGRKSRPDEGVYVCVARNYLGEAVSHNASLEVAI 167 

III : : illllllllllllllhhllll llllllllllllll HUM: 
Db 76 --TKDDSKVTQGCLLPSGSLFFLRIVHGRRSKPDEGTYVCVARNYLGEAVSRNASLEVAL 133 

Qy 168 LRDDFRQNPSDVMVAVGEPAVMECQPPRGHPEPTISWKKDGSPLDDKDERITIRGGKLMI 227 

llllllllhlhll llll::lllllllllllll Mil :|:|:|||:|||lllll 
Db 134 LRDDFRQNPTDVWAAGEPAILECQPPRGHPEPTIYWKKDKVRIDEKEERISIRGGKLMI 193 

Qy 228 TYIRRSDAGKYVCVGTNMVGERESEVAELTVLERPSFVKRPSNLAVTVDDSAEFKCEARG 287 

: Mill | IIIIMIMM: INN 1 1 1 : 1 : : 1 1 I I |: MMM :| 
Db 194 SNTRRSDAGMYTCVGTNMVGERDSDPAELTVFERPTFLRRPINQWLEDEPAEFRCQVQG 253 

Qy 288 DPVPTVRWRKDDGELPKSRYEIRDDHTLKIRKVTAGDMGSYTCVAENMVGKAEASATLTV 347 

II IIIMM MM 1 1 : 1 : 1 1 : 1 1 : 1 : 1 : I |:| Ml III 1 1 1 1 1 1 1 1 
Db 254 DPQPTVRWRRDDADLPRGRYDIKDDYTLRIRRAISADEGTYVCIAENRVGRVEASATLTV 313 

Qy 348 QEPPHFWKPRDQWALGRTVTFQCEATGNPQPAIFWRREGSQNLLFSYQPPQSSSRFSV 407 

• : II 1 1 1 : 1 ! I i : 1 1 HIM I lll|||:||::|||||||| II I Ml II 
Db 314 RAPPQFWRPRDQIVAQGRTVTFPCETKGNPQPAVFWQKEGSQNLLFPNQPQQPNSRCSV 373 

Qy 408 SQTGDLTITNVQRSDVGYYICQTLNVAGSIITKAYLEVTDVIADRPPPVIRQGPVNQTVA 467 

i miiimmi iiiin i mih ii mill: mimm iimiim 

Db 374 SPTGDLTITNIQRSDAGYYICQALTVAGSILAKAQLEVTDVLTDRPPPIILQGPINQTLA 433 

Qy 468 VDGTFVLSCVATGSPVPTILWRKDGVLVSTQDSRIKQLENGVLQIRYAKLGDTGRYTCIA 527 

Mil M I III M I MM :| I : I III: :: III MM 
Db 434 VDGTALLKCKATG-PLPVISWLKEGFTFLGRDPRATIQDQGTLQIKNLRISDTGTYTCVA 492 

Qy 528 SIPSGEATWSAYIEVQEFGVPVQPPRPTDPNLIPSAPSKPEVTDVSRNTVTLSWQPNLNS 587 

:: III HI : : I I I I I : I I II IM III ::|M II 1 1 

Db 493 TSSSGETSWSAVLDVTESGATIS--RNYDTNDLPGPPSRPQVTDVTRNSVTLSWQTG-TP 549 

Qy 588 GATPTS-YIIEAFSHASGSSWQTVAENVKTETSAIKGLKPNAIYLFLVRAANAYGISDPS 646 

I I I 1 1 1 1 II I : : 1 1 1 1 1 1 MM : : 1 1 : 1 1 1 1 1 1 : 1 1 1 I MM 
Db 550 GVLPASAYIIEAFSQSVSHSWQTVANHVKTTLYTVRGLRPNTIYLFMVRAINPQGLSDPS 609 

Qy 647 QISDPVKIQDVLPTSQGVDHKQVQRELGNAVLHLHNPTVLSSSSIEVHWTVDQQSQYIQG 706 

: 1 1 1 1 : 1 1 1 : I : 1 1 1 1 1 M 1 1 M 1 1 : : II IM ::::| MMM MM 
Db 610 PMSDPVRTQDISPPAQGVDHRQVQKELGDVTVRLHNPWLTPTTVQVTWTVDRQPQFIQG 669 

Qy 707 YKILYRPSGANHGESDWLVFEVRTPARNSVVIPDLRRGVNYEIKARPFFNEFQGADSEIK 766 

M:MI : : I : : | : | |: :|:||| MM 11:111111 III I 
Db 670 YRVMYRQTSGLQASTVWQNLDAKVPTERSAVLVNLKKGVTYEIKVRPYFNEFQGMDSESK 729 

Qy 767 FAKTLEEAPSAPPQGVTV-SKNDGNGTAILVSWQPPPEDTQNGMVQEYKVWCLGNETRYH 825 

• M lllllllll III : I |:| III III I I! I : : 1 1 1 1 : 1 1 1 1 1 M I : I 

Db 730 TIRTIEEAPSAPPQSVTVLTVGSHNSTSISVSWDPPPADHQNGIIQEYKIWCLGMETRFH 789 

Qy 826 INKTVDGSTFSWIPFLVPGIRYSVEYAASTGAGSGVKSEPQFIQLDAHGNPVSPEDQVS 885 

mm : mi i mm mini n mmi i : i n i 

Db 790 INKTVDATIRSWIGGLFPGIQYRVEVAASTSAGVGVKSEPQPIIIGGRNEWITENNNS 849 

Qy 886 LAQQISDWKQPAFIAGIGAACWIILMVFSIWLYRHRKKRNGLTSTYAGIRKVPSFTFTP 945 

: : 1 1 : 1 1 1 1 1 1 M 1 1 1 ! I MIMM MMM llll II I II 
Db 850 ITEQITDWKQPAFIAGIGGACWVILMGFSIWLYWRRKKRKGL-SNYA 896 

Qy 946 TVTYQRG-GEAVSSGGRPGLLNISEPAAQPWIADTWPNTGNNHNDCSISCCTAGNGNSDS 1004 

1 mm i mm mm m mimi i i n 

Db 897 -VTFQRGDGGI4SNGSRPGLLNTGDP-NYPWLADSWPATS LPVNNSNSGP 944 

Qy 1005 N-LTTYSRPADCIANYNNQLDNKQTNLMLPESTVYGDVDLSNKINEMRTFNSPNLKDGRF 1063 

I : : I I : I I : I 1 1 : : I : I : I hit 
Db 945 NEIGNFGR-GDVLPPVPGQGD--RTATMLSDGAIYSSIDFTTR TTYNSS 990 

Qy 1064 VNPSGQPTPYATTQLIQSNLSNNMWGSGDSGERHWRPLGQQRQEVAPVQYNIVEQNRLN 1123 

: I milll:: II::: : I : II III :: I" Ml I 
Db 991 -SQITQATPYATTQILH— SNSIHELAVDLPDPQWKSSVQQKSDLMGFAYSLPDQNKGN 1046 

Qy 1124 KDY 1126 

I 

Db 1047 NAY 1049 



RESULT 5 
Q9Z2I4 

ID Q9Z2I4 PRELIMINARY; PRT; 1344 AA. 
AC Q9Z2I4; 
DT 01-MAY-1999 (TrEMBLrel. 10, Created) 
DT 01-MAY-1999 (TrEMBLrel. 10, Last sequence update) 
DT 01-OCT-2000 (TrEMBLrel. 15, Last annotation update) 
DE RIG-1 PROTEIN. 
GN RBIG1. 
OS Mus musculus (Mouse). 
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus. 
OX NCBI_TaxID=10090; 
RN [1] 
RP SEQUENCE FROM N.A. 
RA Yuan S.-S.F., Cox L.A., Dasika G.K., Lee E.Y.-H.P.; 
RL Submitted (APR-1998) to the EMBL/GenBank/DDBJ databases. 
DR EMBL; AF060570; AAD11628.1; -. 
DR HSSP; P56276; 1TLK. 
DR MGD; MGI:1343102; Rbig1. 
DR INTERPRO; IPR001777; -. 
DR INTERPRO; IPR003006; -. 
DR PFAM; PF00041; fn3; 3. 
DR PFAM; PF00047; ig; 5. 
SQ SEQUENCE 1344 AA; 143439 MW; 8B0060341C49CFEA CRC64; 



Query Match 29.9%; Score 2607.5; DB 11; Length 1344; 

Best Local Similarity 38.2%; Pred. No. 6.5e-166; 

Matches 594; Conservative 204; Mismatches 456; Indels 299; Gaps 3 

Qy 58 GSRLRQEDFPPRIVEHPSDLIVSRGEPATLNCKAEGRPTPTIBWYRGGERVETDKDDPRS 117 

III: II jllll I 11:11:111111 MIMM I Hill I II I ::HI: 
Db 32 GSRVGPEDAMPRIVEQPPDLWSRGEPATLPCRAEGRPRPNIEWYKNGARVATAREDPRA 91 

Qy 118 HRMLLPSGSLFFLRIVHGRKSRPDEGVYVCVARNYLGEAVSHNASLEVAILRDDFRQNPS 177 

mmmm niinmiimi imim i i imimmmm 

Db 92 HRLLLPSGALFFPRIVHGRRSRPDEGVYTCVARNYLGAAASRNASLEVAVLRDDFRQSPG 151 
Qy 178 DVMVAVGEPAVMECQPPRGHPEPTISWKKDGSPLDDKDERITIRGGKLMITYTRKSDAGK 237 

mminmii mimi mil i mihimm:: nm 

Db 152 NVWAVGEPAVMECVPPKGHPEPLVTWKKGKIKLKEEEGRITIRGGKLMMSHTFKSDAGM 211 
Qy 238 YVCVGTNMVGERESEVAELTVLERPSFVKRPSNLAVTVDDSAEFRCEARGDPVPTVRWRK 297 

1:11 M I'llll III imiimm I m Ml Ml I : III 

Db 212 YMCVASNMAGERESGAAELWLERPSFLRRPINQWLADAPVNFLCEVQGDPQPNLHWRR 271 
Qy 298 DDGELPRSRYEIRDDHTLKIRRVTAGDMGSYTCVAENMVGRAEASATLTVQEPPHFWRP 357 

mm inn mi i m: i imimi mmi mm ii ii ii 

Db 272 DDGELPAGRYEIRSDHSLWIDQVSSEDEGTYTCVAENSVGRAEASGSLSVHVPPQFVTKP 331 

Qy 358 RDQWALGRTVTFQCEATGNPQPAIFWRREGSQNLLFSYQPPQSSSRFSVSQTGDLTITN 417 

M II I MM III IMMMM III I I I II I I II 
Db 332 QNQTVAPGANVSFQCETKGNPPPAIFWQKEGSQVLLFPSQSLQPMGRLLVSPRGQLNITE 391 

Qy 418 VQRSDVGYYICQTLNVAGSIITRAYLEVTDVIADRPPPVIRQGPVNQTVAVDGTFVLSCV 477 

I: I MMI! ::|||||: I ||: I MM III III: : : I I 
Db 392 VRIGDGGYYVCQAVSVAGSILARALLEIKGASIDGLPPIILQGPANQTLVLGSSVWLPCR 451 

Qy 478 ATGSPVPTILWRRDGVLVSTQDSRIRQLENGVLQIRYARLGDTGRYTCIASTPSGEATWS 537 

Ml | | i:M : IM ::|l I I : I I |:|:| : MM: 
Db 452 VIGNPQPNIQWKRDERWLQGDDSQFNLMDNGTLHIASIQEMDMGFYSCVARSSIGEATWN 511 

Qy 538 AYIEVQE-FGVPVQPPRPTDPNLIPSAPSRPEVTDVSRNTVTLSWQPNLNSGATPTSYII 596 

::: II M' M IM ||:|: MMMM III I MM 

Db 512 SWLRKQEDWGr-ASPGPATGPSNPPGPPSQPIVTEVTANSITLTMKPNPQSGATATSYVI 569 

Qy 597 EAFSHASGSSWQTVAENVRTETSAIRGLKPNAIYLFLVRAANAYGISDPSQISDPVRTQD 656 

llll |:|::|:IH: M i I Ml MMM MMM :|:||:||l 
Db 570 EAFSQAAGNTWRTVADGVQLETYTISGLQPNTIYLFLVRAVGAWGLSEPSPVSEPVQTQD 629 

Qy 657 VLPTSQGVDHRQVQRELGNAVLHLHNPTVLSSSSIEVHWTVDQQSQYIQGYKILYRPSGA 716 

: I ,: II I : : llll :::| llll I :l :! 
Db 630 SSLSRPAEDPWKGQRGLAEVAVRMQEPTVLGPRTLQVSWTVDGPVQLVQGFRVSWRIAGL 689 

Qy 717 NHGESDWLWFMTPARNSVVIPDLRRGWEIRARPFFNEFQGADSEIRFARTLEEAPS 776 

: | : :::: I I |: I I :|| : I ||:| Ml! 
Db 690 DQG--SWTMLDLQSPHRQSTVIRGLPPGAQIQIRVQVQGQEGLGAESPFVTRSIPEEAPS 747 

Qy 777 APPQGVTVSKNDGNGTAILVSWQPPPEDTQNGMVQEYKVWCLGNETRYHINKTVDGSTFS 836 

Mill I: ::: 1 1 1 : 1 1 MM: 1 1 : : 1 1 1 1 1 1 : 1 : 1 : 1 : : I I 
Db 748 GPPQGVAVALGGDRNSSVTVSWEPPLPSQRNGVITEYQIWCLGNESRFHLNRSAAGWARS 807 

Qy 837 WIPFLVPGIRYSVEVAASTGAGSGVRSEPQFIQL--DAHGNPVSPEDQVSLAQQISDVV 894 

I Ml I MM || || | | Ml I || ||:::: |: 
Db 808 VTFSGLLPGQIYRALVAAATSAGVGVASAPVLVQLPFPPAAEP-GPEVSEGLAERLAKVL 866 

Qy 895 KQPAFIAGIGAACWIILMVFSIWLYRHRRRRNGLTSTYAGIRKVPSFTFTPTVTYQRGGE 954 

::IH:|| III :|: I III :|:| I: I ll : :|| h: 
Db 867 RRPAFLAGSSAACGALLK3FCAALYRRQKQRKELSHYTA SFAYTPAVSFPHSEG 920 

Qy 955 AVSSGGRP- -GLLNISEPAAQPWLADTWPNTGNNHN- -DCSISCCTAGNGNSDSNLTTYS 1010 

I II II III 11111:11: : : : III : II 
Db 921 LSGSSSRPPMGL GPAAYPWLADSWPHPPRSPSAQEPRGSCCPS— NPD 966 



i 



Qy 1011 RPADCIANYNNQLDNKQTNLMLPESTVYGDVDLSNKINEMKTFNSPNLKDGRFVNPSGQP 1070 

I: I : M : I III: 

Db 967 PDDRYYNEAGISLYL AQTARGANASGEG 994 



Qy 1071 TPYATIQLIQSNLSNNMNNGSGDSGEKHWKPLGQQKQEVAPVQYNIVEQMLNKDYRAND 1130 

1:1 hi:: I 
Db 995 PVYSTID PVGEELQ 1008 



Qy 1131 TVPPTIPYNQSYDQNTGGSYNSSDRGSSTSGSQGHKKGARTPKVPKQGGMNWADLLPPPP 1190 

I I : I I :l I : IMM M I : ::| : MM 

Db 1009 TFHGGFPQHSSGDPSTWSQYAPPEWSEGDSGARG-GQGKLLGKPVQMPSLSWPEALPPPP 1067 

Qy 1191 AHPPPHSNSEEYNISVDESYDQEMPCPVPPARMYLQQDELEEEEDERGPTPPVR 1244 

I: II I :MI: I III 
Db 1068 P SCELSCPEGP EEELKGSSDLEEWCPPVPERSHLV 1102 

Qy 1245 GAASSPAAV SYSHQSTATLTPSPQEELQPMLQDCPEETGHMQHQPDR 1291 

MMI I : || ! 1 1 1 1 1 1 1 1 1 : || | : |: | 
Db 1103 GSSSSGACMVAPAPRDTPSPTSSYGQQSTATLTPSPPDPPQP PTDIPHLHQMP-- 1155 

Qy 1292 RRQPVSPPPPPRPISPPHTYGYISGPLVSDMDTDAPEEEEDEADMEVAKMQTRRLLLRGL 1351 

II I: I I 

Db 1156 RRVPLGPSSP 1165 

Qy 1352 EQTPASSVGDLESSVTGSMINGWGSASEEDNISSGRSSVSSSDGSFFTDADFAQAVAAAA 1411 

:| : ::ll II 

Db 1166 LSVSQPALSSHDG 1178 

Qy 1412 EYAGLRVARRQMQDAAGR-RHFHASQCPRPTSPVSTDSNMSAAVMQRTRP AK 1462 

W I : || :||| | |:: | ; | | | 

Db 1179 RPVGLGAGPVLSYHASPSPVPSTASSAPGRTRQVTGEMTPPLHGHRARIRK 1229 

Qy 1463 KLKHQPGHLRRETYTDDLPPPPVPPPAIKSPTAQSRTQLEVRPWVPKLPSMDARTDRSS 1522 

III III 111111:111 :: I III: I 

Db 1230 KPKALP- - YRREHSPGDLPPPPLPPPELRDKLALGSA- -GSRQHVFPR ARAQKGE 1280 

Qy 1523 DR-KGSSYKGREVLDGRQVVDMRTNPGDPREAQEQQNDGKGRGNKAMRDLPP 1574 

: II: :l I I MM MM M : I 
Db 1281 ESGAGSASRG PTSSQRGPHPDGKESQ GRGRGLEACRSPNSP 1321 



RESULT 6 
Q9W213 

ID Q9W213 PRELIMINARY; PRT; 1395 AA. 
AC Q9W213; 

DT 01-MAY-2000 (TrEMBLrel. 13, Created) 

DT 01-MAY-2000 (TrEMBLrel. 13, Last sequence update) 

DT 01-OCT-2000 (TrEMBLrel. 15, Last annotation update) 

DE ROBO PROTEIN. 
GN ROBO. 

OS Drosophila melanogaster (Fruit fly). 



OC Eukaryota; Metazoa; Arthropoda; Tracheata; Hexapoda; Insecta; 

OC Pterygota; Neoptera; Endopterygota; Diptera; Brachycera; Muscomorpha; 

OC Ephydroidea; Drosophilidae; Drosophila. 

OX NCBI_TaxID=7227; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=BERKELEY; 

RX MEDLINE=20196006; PubMed=10731132; 

RA Adams M.D., Celniker S.E., Holt R.A., Evans C.A., Gocayne J.D., 

RA Amanatides P.G., Scherer S.E., Li P.W., Hoskins R.A., Galle R.F., 

RA George R.A., Lewis S.E., Richards S., Ashburner M., Henderson S.N., 

RA Sutton G.G., Wortman J.R., Yandell M.D., Zhang Q., Chen L.X., 

RA Brandon R.C., Rogers Y.-H.C., Blazej R.G., Champe M., Pfeiffer B.D., 

RA Wan K.H., Doyle C., Baxter E.G., Helt G., Nelson C.R., Miklos G.L.G., 

RA Abril J.F., Agbayani A., An H.-J., Andrews-Pfannkoch C., Baldwin D., 

RA Ballew R.M., Basu A., Baxendale J., Bayraktaroglu L., Beasley E.M., 

RA Beeson K.Y., Benos P.V., Berman B.P., Bhandari D., Bolshakov S., 

RA Borkova D., Botchan M.R., Bouck J., Brokstein P., Brottier P., 

RA Burtis K.C., Busam D.A., Butler H., Cadieu E., Center A., Chandra I., 

RA Cherry J.M., Cawley S., Dahlke C., Davenport L.B., Davies P., 

RA de Pablos B., Delcher A., Deng Z., Mays A.D., Dew I., Dietz S.M., 

RA Dodson K., Doup L.E., Downes M., Dugan-Rocha S., Dunkov B.C., Dunn P., 

RA Durbin K.J., Evangelista C.C., Ferraz C., Ferriera S., Fleischmann W., 

RA Fosler C., Gabrielian A.E., Garg N.S., Gelbart W.M., Glasser K., 

RA Glodek A., Gong F., Gorrell J.H., Gu Z., Guan P., Harris M., 

RA Harris N.L., Harvey D., Heiman T.J., Hernandez J.R., Houck J., 

RA Hostin D., Houston K.A., Howland T.J., Wei M.-H., Ibegwam C., 

RA Jalali M., Kalush F., Karpen G.H., Ke Z., Kennison J.A., Ketchum K.A., 

RA Kimmel B.E., Kodira C.D., Kraft C., Kravitz S., Kulp D., Lai Z., 

RA Lasko P., Lei Y., Levitsky A.A., Li J., Li Z., Liang Y., Lin X., 

RA Liu X., Mattei B., McIntosh T.C., McLeod M.P., McPherson D., 

RA Merkulov G., Milshina N.V., Mobarry C., Morris J., Moshrefi A., 

RA Mount S.M., Moy M., Murphy B., Murphy L., Muzny D.M., Nelson D.L., 

RA Nelson D.R., Nelson K.A., Nixon K., Nusskern D.R., Pacleb J.M., 

RA Palazzolo M., Pittman G.S., Pan S., Pollard J., Puri V., Reese M.G., 

RA Reinert K., Remington K., Saunders R.D.C., Scheeler F., Shen H., 

RA Shue B.C., Siden-Kiamos I., Simpson M., Skupski M.P., Smith T., 

RA Spier E., Spradling A.C., Stapleton M., Strong R., Sun E., 

RA Svirskas R., Tector C., Turner R., Venter E., Wang A.H., Wang X., 

RA Wang Z.-Y., Wassarman D.A., Weinstock G.M., Weissenbach J., 

RA Williams S.M., Woodage T., Worley K.C., Wu D., Yang S., Yao Q.A., 

RA Ye J., Yeh R.-F., Zaveri J.S., Zhan M., Zhang G., Zhao Q., Zheng L., 

RA Zheng X.H., Zhong F.N., Zhong W., Zhou X., Zhu S., Zhu X., Smith H.O., 

RA Gibbs R.A., Myers E.W., Rubin G.M., Venter J.C.; 

RT "The genome sequence of Drosophila melanogaster."; 

RL Science 287:2185-2195(2000). 

DR EMBL; AE003458; AAF46887.1; -. 

DR HSSP; P56276; 1TLK. 

DR FLYBASE; FBgn0005631; robo. 

DR INTERPRO; IPR001777; -. 

DR INTERPRO; IPR003006; -. 

DR PFAM; PF00041; fn3; 3. 

DR PFAM; PF00047; ig; 5. 

DR PRINTS; PR00014; FNTYPEIII. 

SQ SEQUENCE 1395 AA; 151759 MW; 25CED7DEB44F13F0 CRC64; 



Query Match 18.3%; Score 1594; DB 5; Length 1395; 

Best Local Similarity 30.0%; Pred. No. 5.6e-98; 

Matches 419; Conservative 189; Mismatches 482; Indels 308; Gaps 

Qy 68 PRIVEHPSDLIVSRGEPATLNCKAEGRPTPTIEWYRGGERVETDRDDPRSHRMLLPSGSL 127 

111:111:11:1 I llllllll MM |||||:| III :: :|||: M 
Db 56 PRIIEHPTDLWKKNEPATLNCKVEGRPEPT IEWFRDG EPVST • • NEKRSHRVQFRDGAL 113 

Qy 128 FFLRIVHGRKSRPDEGVYVCVARNYLGEAVSHNASLEVAILRDDFRQNPSDVMVAVGEPA 187 

II I : hl ; : I I I IMM :|:||| : 1 1 1 : : 1 : 1 1 1 1 1 1 M II II 
Db 114 FFYRTMQGRRI'Q-DGGEYWCVAKNRVGQAVSRHASLQIAVLRDDFRVEPRDTRVARGETA 172 

Qy 188 VMECQPPRGHPEPTISWRRDGSPLDD KDERIT I - RGGKLMITYTRKSDAGK YV 239 

:M Ihl Nil: I III MM M I II hi: I I I 

Db 173 LLECGPPRGIPEPTLIWIRDGVPLDDLKAMSFGASSRVRIVDGGNLLISNVEPIDEGNYK 232 
Qy 240 CVGTNMVGERESEVAELTVLERPSFVKRPSNLAVTVDDSAEFKCEARGDPVPTVRWRKDD 299 

I: hi! IN 1:1 ! :| hi I : : :|| I IN I I IM: 
Db 233 CIAQNLVGTRESSYAKLIVQVRPYFMREPKDQVMLYGQIATFHCSVGGDPPPKVLWRKEE 292 

Qy 300 GELPKSRYEI-RDDHTLKIRKVTAGDMGSYTCVAENMVGKAEASATLTVQEPPHFWKPR 358 

MM ! |: :|:| :| I |:| I I I ||: | |:| I ||:| :| 
Db 293 GNIPVSRARILHDEKSLEISNITPTDEGTYVCEAHNNVGQISARASLIVHAPPNFTKRPS 352 

Qy 359 DQWALGRIVTFQCEATGNPQPAIFWRREGSQNLLFSYQPPQSSSRFSVSQTGDLTITNV 418 

:: I I I I hill h:|| :|l hi M M: I I Ihl 
Db 353 NRKVGLNGVVQLPCMASGNPPPSVFWTREGVSTLMF- - -PNSSHGRQHVAADGTLQITDV 409 

Qy 419 QRSDVGYYICQTLNVAGSIITKAYLEVTDVIADRPPPVIRQGPVNQTVAVDGTFVLSCVA 478 

I MM M | : MM: : :MMM: II ||h I I I 

Db 410 RQEDEGYYVCSAPSWDSSTVRVFLQVSS -LDERPPPI IQIGPANQTLPKGSVATLPCRA 468 

Qy 479 TGSPVPTILWRKDGVLVSTQDSRIKQLENGVLQIRYMLGDTGRYTCIASTPSGEATWSA 538 

• Ihl I I I II I M :: !:: M hi III II || :|:| 
Db 469 TGNPSPRIKWFHDGHAVQA-GNRYSIIQGSSLRVDDLQLSDSGTYTCTASGERGETSWAA 527 

Qy 539 YIEVQEFGVPVQPPRPTDPNLIPSAPSKPEVTDVSRNTVTLSW— QPNLNSGATPTSYI 595 

• = M I I lh h I hi :||| :::| I I : I 
Db 528 TLTVEKPG-STS^HRAADPSTYPAPPGTPRVLNVSRTSISLRWARSQERPGAVGPIIGYT 586 

Qy 596 IEAFSHASGSSWQTVAENVKTETSAIKGLKPNAIYLFLVRAANAYGISDPSQISDPVKTQ 655 

MM : I hi I II I hlllll I III II M: M 
Db 587 VEYFSPDLQTGWIVAAQRVGDIQVTISGLTPGTSYVFLVRAENTQGISVPSGLSNTIKTI 646 

Qy 656 DV-LPTSQGVDHRQVQRELGNAVLHLHNPTVLSSSSIEVHWT--VDQQSQYIQGYKILYR 712 

: : I : I : I : : :::|:: : I I :|::| :| h 
Db 647 EADFDAASAKDLSAARTLLTGKSVELIDASAINASAVRLEWMLHVSADEKYVEGLRIHYK 706 

Qy 713 — PSGANHGESDWLVFEVRTPARNSVVIPDLRKGVNYEIRARPFFNEFQGADSEIRFA 768 

III l:lh MM I! Ill :| I I I 

Db 707 DASVPSAQYHS ITVMDASAESFWGNLRRYTRYEFFLTPFFETIEGQPSNSRTA 760 

Qy 769 KTLEEAPSAPPQGVTVSKNDGNGTAILVSWQPPPEDTQNGMVQEYKV-WCLGNETRYHIN 827 

I h Mill : : I II I I III II : IM II : I 
Db 761 LTYEDVPSAPPDNIQIGMY ■ 'NQTAGWVRWTPPPSQHHNGNLYGYKIEVSAGNTMKVLAN 818 

Qy 828 RTVDGSTFSWIPFLVPGIRYSVEVAASTGAGSGVRSEPQFIQLD AH — 874 

h: M lh: I I III : : Ml I hi : :! II 
Db 819 MTLNATTTSVLLNNLTTGAVYSVRLNSFTKAGDGPYSKPISLFMDPTHHVHPPRAHPSGT 878 



Qy 875 GNPVSPED-QVSLAQQISDVVKQPAFIAGIGAACWIILMVFSIWL 918

IM I I : :: :| : I |::::| : I

Db 879 HDGRHEGQDLTYHNNGN-IPPGDINPTTHKKTIDYLSGP WLMVLVCIVLL 927

Qy 919 YRHRKKRNGLTSTYAGIRKVPSFTFTPTVTYQRGGEAVSSGGRPGLLNI 967

I II : :l : I :| I III

Db 928 VLVISAAISMVYFKRKHQ MTKELGHLSWSDNEITALNI 966



Qy 968 SEPAAQPWLADTWPNTGHNHNDCSISCCTAGNGNSDSNLTTYSRPADCIANYNNQLDNRQ 1027 

: : I: ::| : :| hi I :: : MM 

Db 967 NSKESL-WI DHHRGWRTADTDKDSGLSESKLLSHVNSSQ • ■ SNYNNS 1010 

Qy 1028 INLMLPESTVYGDVDLSNKINEMKTFNSPNLKDGRFVNPSGQPTPYATTQLIQSNLSNNM 1087 

I Mil I lh lllllll :l :: I 

Db 1011 DGGTDYAEVDTRNLTTFYNCRRSPD NPTPYATTMIIGTSSSEIC 1054 

Qy 1088 NNGSGDSGEKHWKPLGQQKQEVAPVQYNIVEQNKLNKDYRAMDTVPPTIPYNQSYDQNTG 1147 

: Ml II 

Db 1055 TKTTSISADK DSGTH 1069 

Qy 1148 GSYNSSDRGSSTSGSQGHRRGARTPRVPRQGG MNWADLLPPPPAHPPPHS- 1197 

h : I : III :|h: Mill MM I 

Db 1070 SPYSDAFAG QVPAVPWRSNYLQYPVEPINWSEFLPPPPEHPPPSST 1116 

Qy 1198 NSEEYNISVDES YDQEMPCPVPPARMY 1224 

I II: I : II :| 

Db 1117 YGYAQGSPESSRRSSKSAGSGISTNQSILNASIHSSSSGGFSAWGVSPQYAVACPPENVY 1176 

Qy 1225 LQQDELEEEEDERGPTPPVRGAASSP-AAVSYSHQSTATLTPSPQEELQPMLQDCPEETG 1283 



hi :lh h Mh III II 
■ ■ SNPLSAVAGGTQNRYQITPTNQH - - PPQLPAYFATTG 1211 



Qy 1284 HMQHQPDRRR QPVSPPPP-PRP" 1304 

M I: Ml I I 

Db 1212 PGGAVPPNHLPFATQRHAASEYQAGLNAARCAQSRACNSCDALAIPSPMQPPPPVPVPEG 1271 

Qy 1305 -ISPPHTYGYISGPLVSD 1321 

II : ' I I: 
Db 1272 WYQPVHPNSHPMHPTSSN 1289 



RESULT 7 
O44924

ID O44924 PRELIMINARY; PRT; 1395 AA.

AC O44924;

DT 01-JUN-1998 (TrEMBLrel. 06, Created)

DT 01-JUN-1998 (TrEMBLrel. 06, Last sequence update)

DT 01-JUN-2000 (TrEMBLrel. 14, Last annotation update)

DE ROUNDABOUT 1.

GN ROBO1.

OS Drosophila melanogaster (Fruit fly).

OC Eukaryota; Metazoa; Arthropoda; Tracheata; Hexapoda; Insecta;

OC Pterygota; Neoptera; Endopterygota; Diptera; Brachycera; Muscomorpha; 

OC Ephydroidea; Drosophilidae; Drosophila. 

OX NCBI_TaxID=7227;

RN [1]

RP SEQUENCE FROM N.A.

RX MEDLINE=98117249; PubMed=9458045;

RA Kidd T., Brose K., Mitchell K.J., Fetter R.D., Tessier-Lavigne M.,

RA Goodman C.S., Tear G.;

RT "Roundabout controls axon crossing of the CNS midline and defines a

RT novel subfamily of evolutionarily conserved guidance receptors.";

RL Cell 92:205-215(1998).

DR EMBL; AF040989; AAC38849.1; -.

DR HSSP; P56276; 1TLK. 

DR FLYBASE; FBgn0005631; robo. 

DR INTERPRO; IPR001777; -. 

DR INTERPRO; IPR003006; -. 

DR PFAM; PF00041; fn3; 3. 

DR PFAM; PF00047; ig; 5.

DR PRINTS; PR00014; FNTYPEIII.

SQ SEQUENCE 1395 AA; 151778 MW; B820E234A5218983 CRC64;



Query Match 18.2%; Score 1592; DB 5; Length 1395;

Best Local Similarity 30.0%; Pred. No. 7.6e-98;

Matches 420; Conservative 187; Mismatches 483; Indels 308; Gaps 40; 

Qy 68 PRIVEHPSDLIVSKGEPATLNCKAEGRPTPTIEWYKGGERVETDKDDPRSHRMLLPSGSL 127

llhllhlhl I IIIIMI Ihl MMM || | I :: MM: M 
Db 56 PRIIEHPTDLWRKNEPATLNCKVEGKPEPTIEWFRDGEPVST--NEKKSHRVQFKDGAL 113 

Qy 128 FFLRIVHGRKSRPDEGVYVCVARNYLGEAVSHNASLEVAILRDDFRQNPSDVMVAVGEPA 187

II M hi !: I I I MM MMM :|lh:hllllll II II III 
Db 114 FFYRTMQGKKEQ-DGGEYWCYAKNRVGQAVSRHASLQIAVLRDDFRVEPKDTRVAKGETA 172 

Qy 188 VMECQPPRGHPEPTISWKKDGSPLDD KDERITI-RGGKLMITYTRKSDAGKYV 239 

::|| Ihl I'llh I III Ml h I II IM I I I 

Db 173 LLECGPPRGIPEPTLIWIRDGVPLDDLRAMSFGASSRVRIVDGGNLLISPEPIDEGNYR 232 

Qy 240 CVGTNMVGERESEVAELTVLERPSFVKRPSNLAVTVDDSAEFKCEARGDPVPTVRWRKDD 299

I: MM Ml hi I :| hi M : MM III I I hM 
Db 233 CIAQNLVGTRESSYARLIVQVRPYFMREPRDQVMLYGQTATFHCSVGGDPPPKVLWKKEE 292 

Qy 300 GELPKSRYEI-RDDHTLKIRKVTAGDMGSYTCVAENMVGKAEASATLTVQEPPHFWKPR 358
> : I :| II l h MM M I hi I I I lh I hi I Ihl :l 
Db 293 GNIPVSRARILEDERSLEISNITPTDEGTYVCEAHNNVGQISARASLIVHAPPNFTKRPS 352 

Qy 359 DQWALGRTVTFQCEATGNPQPAIFWRREGSQNLLFSYQPPQSSSRFSVSQTGDLTITNV 418 

:: I I I I hill MM :|| |:| I I M M Ihl 
Db 353 NRKVGLNGWQLPCMASGNPPPSVTWTREGVSTLMF— PNSSHGRQYVAADGTLQITDV 409 
Qy 419 QRSDVGYYICQTLNVAGSIITKAYLEVTDVIADRPPPVIRQGPVNQTVAVDGTFVLSCVA 478

I \\\:\ = 1 I : MM I II III: I I I 

410 RQEDEGYYVCSAFSWDSSTVRVFLQVSSV-DERPPPIIQIGPANQTLPKGSVATLPCRA 468 

479 TGSPVPTILWRKDGVLVSTQDSRIKQLENGVLQIRYAKLGDTGRYTCIASTPSGEATWSA 538 

1:1 IIIIM :| |:: :| |:| III II II :|:| 

469 TGNPSPRIKWFHDGHAVQA-GNRYSIIQGSSLRVDDLQLSDSGTYTCTASGERGETSWAA 527 

539 YIEVQEFGVPVQPPRPTDPNLIPSAPSKPEVTDVSRNTVTLSW— QPNLNSGATPTSYI 595 

: I:: I I lh h I hi :||| :::| II: I 
528 TLTVEKPG • STSLHRAADPSTYPAPPGTPKVLNVSRTS ISLRWAKSQEKPGAVGPI IGYT 586 

596 IEAFSHASGSSWQTVAENTOETSAIKGLKPNAIYLFLVRAANAYGISDPSQISDPVKTQ 655 

:| II : I I I I II I MINI I III II M :ll 
587 VEYFSPDLQTGWIVAAHRVGDTOVTISGLTPGTSYVFLVRAENTQGISVPSGLSPIKTI 646 

656 DV-LPTSQGVDHKQVQRELGNAVLHLHNPTVLSSSSIEVHWT--VDQQSQYIQGYKILYR 712 

: : I : I : I : : :::|:: : I I :|::| :| |: 
647 EADFDAASANDLSAARTLLTGKSVELIDASAINASAVRLEWMLHVSADEKYVEGLRIHYK 706 

713 PSGANHGESDWLVFEVRTPAKNSWIPDLRKGVNYEIKARPFFNEFQGADSEIKFA 768 

III I : I I: Ml II M| :| II I 

707 DASVPSAQYHS ITVMDASAESFWGNLKKYTKYEFFLTPFlJlEGQPSNSKTA 760 

769 KTLEEAPSAPPQGVTVSKNDGNGTAILVSWQPPPEDTQNGMVQEYKwRjCLGNETRYHIN 827 

I I: lllll : : Ml I I III II : III II : I 
761 LTYEDVPSAPPDNIQIGMY--NQTAGWVRWTPPPSQHHNGNLYGYK1VSAGNTMKVLAN 818 

828 KTVDGSTFSWIPFLVPGIRYSVEVAASTGAGSGVKSEPQFIQLD-I AH — 874 

I:: :l II:: I I MM : MM hi : :| J II 
819 MTLNATTTSVLLNNLTTGAVYSVRLNSFTKAGDGPYSRPISLFMDPllHVHPPRAHPSGT 878 

Qy 875 GNPVSPED-QVSLAQQISDVVKQPAFIAGIGAACWIILMVFSIWL 918

* II : M : :: :| : I |::::| : I 

879 HDGRHEGQDLTYHNNGN-IPPGDINPTTHKKTTDYLSGP WLMVLVCIVLL 927 



Qy 919 YRHRKKRNGLTSTYAGIRKVPSPTFTPTVTYQRGGEAVSSGGRPGLLNI 967 

MM : I : I : I I III 

Db 928 VLVISAAISMVYFKRKHQ MTKELGHLSWSDNEITALNI 966 

Qy 968 SEPAAQPWLADTWPNTGNNHNDCSISCCTAGHGNSDSNLTTYSRPADCIANYNNQLDNRQ 1027 

: : h ::| : :| hi I :: : HIM 

Db 967 NSKESL-WI DHHRGWRTADTDKDSGLSESKLLSHVNSSQ - -SNYNNS 1010

Qy 1028 TNLMLPESTVYGDVDLSNKINEMRTFNSPNLKDGRFVNPSGQPTPYATTQLIQSNLSNNM 1087 

I I :|| I II: IIIIM :| :: I 

Db 1011 DGGTDYAEVDTRNLTTFYNCRKSPD NPTPYATTMIIGTSSSEIC 1054 

Qy 1088 NNGSGDSGEKHWKPLGQQKQEVAPVQYNIVEQNKIiNKDYRANDTVPPTIPYNQSYDQNTG 1147 

»: Ml II 

Db 1055 TKTTSISADK DSGTH 1069

Qy 1148 GSYNSSDRGSSTSGSQGHKKGARTPKVPKQGG MNWADLLPPPPAHPPPHS- 1197 
h : I Mil :lh: lllll Ml I 

Db 1070 SPYSDAFAG QVPAVPWKSNYLQYPVEPINWSEFLPPPPEHPPPSST 1116 

Qy 1198 NSEEYNISVDES YDQEMPCPVPPARMY 1224 

I lh I : II :| 

Db 1117 YGYAQGSPESSRKSSKSAGSGISTNQSILNASIHSSSSGGFSAWGVSPQYAVACPPENVY 1176 



Qy 1225 LQQDELEEEEDERGPTPPVRGAASSP-AAVSYSHQSTATLTPSPQEELQPMLQDCPEETG 1283

hi :lh h :lh III II

Db 1177 SNPLSAVAGGTQNRYQITPTNQH--PPQLPAYFATTG 1211

Qy 1284 HMQHQPDRRR QPVSPPPP-PRP 1304

h I I: Ml I I 

Db 1212 PGGAVPPNHLPFATQRHMSEYQAGLNAARCAQSRACNSCDALATPSPMQPPPPVPVPEG 1271 

Qy 1305 -ISPPHTYGYISGPLVSD 1321 

M : I h 
Db 1272 WYQPVHPNSHPMHPTSSN 1289 



RESULT 8
O44928

ID O44928 PRELIMINARY; PRT; 1273 AA.

AC O44928;

DT 01-JUN-1998 (TrEMBLrel. 06, Created)

DT 01-JUN-1998 (TrEMBLrel. 06, Last sequence update)

DT 01-JUN-2000 (TrEMBLrel. 14, Last annotation update)

DE SAX-3. 

GN SAX-3. 

OS Caenorhabditis elegans. 

OC Eukaryota; Metazoa; Nematoda; Chromadorea; Rhabditida; Rhabditoidea;

OC Rhabditidae; Peloderinae; Caenorhabditis.

OX NCBI_TaxID=6239;

RN [1] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=98117250; PubMed=9458046;

RA Zallen J.A., Yi B.A., Bargmann C.I.;

RT "The conserved immunoglobulin superfamily member SAX-3/Robo directs 

RT multiple aspects of axon guidance in C. elegans."; 

RL Cell 92:217-227(1998). 

DR EMBL; AF041053; AAC38848.1; -.

DR HSSP; P56276; 1TLK.

DR INTERPRO; IPR001777; -.

DR INTERPRO; IPR003006; -.

DR PFAM; PF00041; fn3; 3.

DR PFAM; PF00047; ig; 5.

DR PRINTS; PR00014; FNTYPEIII.

SQ SEQUENCE 1273 AA; 139427 MW; 013E766B51A7BAD7 CRC64;



Query Match 17.4%; Score 1515; DB 5; Length 1273;

Best Local Similarity 33.5%; Pred. No. 9.5e-93;

Matches 401; Conservative 164; Mismatches 469; Indels 164; Gaps 34; 

Qy 68 PRIVEHPSDLIVSKGEPATLNCKAEGRPTPTIEWYKGGERVETDKDDPRSHRMLLPSGSL 127 

I hill I : : 1 1 : 1 Mill h I I III h I hh lll::l Mi 
Db 31 PVIIEHPIDWVSRGSPATLNCGAK-PSTAKITWYKDGQPVITNKEQVNSHRIVLDTGSL 89 

Qy 128 FFLRIVHGRKSR - PDEGVYVCVARNYLGEAVSHNASLEVAILRDDFRQNPSDVMVAVGEP 186 

I h: hM III III I II h 1 1 : : 1 : 1 1 : M I I I II 
Db 90 FLLKVNSGKNGKDSDAGAYYCVASNEHGEVKSNEGSLKLAMLREDFRVRPRTVQALGGEM 149 

Qy 187 AVMECQPPRGHPEPTISWKKDGSPLDDKD-ERITIRG-GKLMITYTRKSDAGKYVCVGTN 244 

Ihll Ml' III MM I :! I I: I hi :|hl I II I 
Db 150 AVLECSPPRGFPEPWSWRKDDKELRIQDMPRYTLHSDGNLIIDPVDRSDSGTYQCVANN 209 

Qy 245 MVGERESEVAELTVLERPSFVKRPSNLAVTVDDSAEFKCEARGDPVPTVRWRKDDGELPK 304 

MM I M:| I : I I : I : : I I : M III M h: : :l 
Db 210 MVGERVSNPARLSVFEKPKFEQEPKDMTVDVGAAVLFDCRVTGDPQPQITWKRKNEPMPV 269 

Qy 305 SR-YEIRDDHTLKIRKVTAGDMGSYTCVAENMVGKAEASATLTVQEPPHFWKPRDQWA 363 

: I :|: Ml :| I I I I I I I Ml IIIIM II II I 
Db 270 TRAYIAKDNRGLRIERVQPSDEGEYVCYARNPAGTLEASAHLRVQAPPSFQTKPADQSVP 329 

Qy 364 LGRTVTFQCEATGNPQPAIFWRREGSQNLLF-SYQPPQSSSRFSVSQTGDLTITNVQRSD 422 

I I Ml II II II :| hill II : I II II III M | 
Db 330 AGGTATFECILVGQPSPAYFWSKEGQQDLLFPSY-VSADGRTKVSPTGTLTIEEVRQVD 387 

Qy 423 VGYYICQTLWAGSIITKAYLEVTDVIAD RPPPVIRQGPVNQTVAVDGTFVLSCV 477 

I hi :| III ::N hll :HI I I III: I : :| I 

Db 388 EGAYVCAGMNSAGSSLSKAALKVTTKAVTGNTPAKPPPTIEHGHQNQTLMVGSSAILPCQ 447 

Qy 478 ATGSPVPTILVIRKDGVLVSTQDSRIKQLENGVLQIRYAKLGDTGRYTCIASTPSGEATWS 537 

hi Mil :lh : Ml I Ml I III Mil MM 
Db 448 ASGRPTPGISWLRDGLPIDITDSRISQHSTGSLHIADLKKPDTGVYTCIAKNEDGESTWS 507 

Qy 538 AYIEVQEFGVPVQPPRPTDPNLIPSAPSKPEVTDVSRNTVTLSWQPNLNSGATP-TSYII 596 

I : h: I I lh lhh:| : M Ml lllll III 
Db 508 ASLTVEDHTSHAQFVRMPDPSNFPSSPTQPIIVNVTDTEVELHWNAPSTSGAGPITGYII 567 

Qy 597 EAFSHASGSSWQTVAENVKTETSAIKGLKPNAIYLFLVRAANAYGISDPSQISDPVKT- 654 

: :l I :.\ : M : 1 1 1 1 1 h M:li Ml II I I I 
Db 568 QYYSPDLGQTWFNIPDYVASTEYRIKGLKPSHSYMFVIRAENEKGIGTPSVSSALVTTSK 627 
Qy 655 ---QDVLPTSQGVDHKQVQREL-GNAVLHLHNPTVLSSSSIEVHWTVDQQSQYIQGYKIL 710

II :| :: | :: | ::|::: : I : : I II I 
628 PAAQVALSDKNKMDMAIAEKRLTSEQLIKLEMTINSTAVRLFWKKRKLEELIDGYYIK 687 

711 YR-PSGANHGESDWLVFEVRTPAKNSWIPDLRKGVNYEIKARPF — FNEFQGADSEIK 766 

:l I I : I :|: : I: :| II! |: : II I 
688 WRGPPRTNDNQ — rVNVTSPSTENYWSNLMPFTNYEFFVIPYHSGVHSIHGAPSNSM 743 

767 FAKTLEEAPSAPPQGVTVSKHDGNGTAILVSWQPPPEDTQNGMVQEYKVWCLGNETRYHI 826 

I I II II: I : II: :||: I I ||::: ::: :| : 
744 DVLTAEAPPSLPPEDVRIRML--NLTTLRISWKAP!U\DGINGILKGFQIVIVGQAPNNNR 801 

827 NRTVDGSTFSWIPFLVPGIRYSVEVAASTGAGSGVKSEPQFIQLDAHGNPVSPEDQVSL 886 

M: II : II I: I : III : I II :|| :| :| 

802 NITTNERAASVTLFHLVTGMTYKIRVAARSNGGVGV SHGTSEVIMNQDTL 851 

887 AQQISDWKQPAFIAGIGAACWI I LMVFS IWLYRHRKKRNGLTSTYAG IRKVP 939 

: :: : :|: h ,: ||::| : : : II I I 
852 EKHLAAQQENESFLYGLINKSHVPVIVIVAILIIFWIIIAYCYWRNSRNSD — GKDR 907 



Qy 940 SFTFTPTVTYQRGGEAVSSGGRPGLLNISE-PAAQPWLADTWPNTGNNHNDCSISCCTAG 998 

II : I ::| ! I :::: II I II I :: I 

Db 908 SF IKINDGSVHMASN-- ; -NLWDVAQNPNQNPMYNTAGRMTMNNRHGQALYSLTPN 959 

Qy 999 NGNSDSNLTTYS — RPADCIANYNNQLDNKQTNLMLPESTVYGDVDLSNKINEMKTFN 1054 

li : II II II II I: :: 

Db 960 AQDFFNNCDDYSGTMHRPGSERHYHYAQLTGGPGNAM— STFYG NQYHD 1006 

Qy 1055 SPNLKDGRFVNPSGQPTPYATTQLIQSNLSNNMNNGSGDSGEKHWKPLGQQKQEVAPVQY 1114 

hlllll I: II II I 
Db 1007 DPSPYATTTLVLSN QQ PAWL 1026 

Qy 1115 NIVEQNKLNKDYRANDTVPPTIPYNQSYDQNTGGSYNSSDRGSS TSGS 1162 

I I : III I I :| I : I I I III 
Db 1027 N- • -DKMLRAPAMPTNPVPPEPP- - ARYADHTAGRRSRSSRASDGRGTLNGGLHHRTSGS 1081 

.Qy 1163 Q GHKK GARTPKVPKQGGMNWADLLPPPPAUPPP 1195 

I M III h I I : M f I : : II I 

Db 1082 QRSDSPPHTDVSYVQLHSSDGTGSSKERTGERRTP- -PNKTLM- - -DPIPPPPSNPPP 1134 



RESULT 9
Q9VPZ6

ID Q9VPZ6 PRELIMINARY; PRT; 859 AA.
AC Q9VPZ6;

DT 01-MAY-2000 (TrEMBLrel. 13, Created)

DT 01-MAY-2000 (TrEMBLrel. 13, Last sequence update)

DT 01-OCT-2000 (TrEMBLrel. 15, Last annotation update)

DE CG5423 PROTEIN (FRAGMENT).

GN CG5423.

OS Drosophila melanogaster (Fruit fly).

OC Eukaryota; Metazoa; Arthropoda; Tracheata; Hexapoda; Insecta; 

OC Pterygota; Neoptera; Endopterygota; Diptera; Brachycera; Muscomorpha; 

OC Ephydroidea; Drosophilidae; Drosophila. 

OX NCBI TaxID=7227; 

RN [1] 

RP SEQUENCE FROM N.A.

RC STRAIN=BERKELEY;

RX MEDLINE=20196006; PubMed=10731132;

RA Adams M.D., Celniker S.E., Holt R.A., Evans C.A., Gocayne J.D.,

RA Amanatides P.G., Scherer S.E., Li P.W., Hoskins R.A., Galle R.F.,

RA George R.A., Lewis S.E., Richards S., Ashburner M., Henderson S.N.,

RA Sutton G.G., Wortman J.R., Yandell M.D., Zhang Q., Chen L.X.,

RA Brandon R.C., Rogers Y.-H.C., Blazej R.G., Champe M., Pfeiffer B.D.,

RA Wan K.H., Doyle C., Baxter E.G., Helt G., Nelson C.R., Miklos G.L.G.,

RA Abril J.F., Agbayani A., An H.-J., Andrews-Pfannkoch C., Baldwin D.,

RA Ballew R.M., Basu A., Baxendale J., Bayraktaroglu L., Beasley E.M.,

RA Beeson K.Y., Benos P.V., Berman B.P., Bhandari D., Bolshakov S.,

RA Borkova D., Botchan M.R., Bouck J., Brokstein P., Brottier P.,

RA Burtis K.C., Busam D.A., Butler H., Cadieu E., Center A., Chandra I.,

RA Cherry J.M., Cawley S., Dahlke C., Davenport L.B., Davies P.,

RA de Pablos B., Delcher A., Deng Z., Mays A.D., Dew I., Dietz S.M.,

RA Dodson K., Doup L.E., Downes M., Dugan-Rocha S., Dunkov B.C., Dunn P.,

RA Durbin K.J., Evangelista C.C., Ferraz C., Ferriera S., Fleischmann W.,

RA Fosler C., Gabrielian A.E., Garg N.S., Gelbart W.M., Glasser K.,

RA Glodek A., Gong P., Gorrell J.H., Gu Z., Guan P., Harris M.,

RA Harris N.L., Harvey D., Heiman T.J., Hernandez J.R., Houck J.,

RA Hostin D., Houston K.A., Howland T.J., Wei M.-H., Ibegwam C.,

RA Jalali M., Kalush F., Karpen G.H., Ke Z., Kennison J.A., Ketchum K.A.,

RA Kimmel B.E., Kodira C.D., Kraft C., Kravitz S., Kulp D., Lai Z.,

RA Lasko P., Lei Y., Levitsky A.A., Li J., Li Z., Liang Y., Lin X.,

RA Liu X., Mattei B., McIntosh T.C., McLeod M.P., McPherson D.,

RA Merkulov G., Milshina N.V., Mobarry C., Morris J., Moshrefi A.,

RA Mount S.M., Moy M., Murphy B., Murphy L., Muzny D.M., Nelson D.L.,

RA Nelson D.R., Nelson K.A., Nixon K., Nusskern D.R., Pacleb J.M.,

RA Palazzolo M., Pittman G.S., Pan S., Pollard J., Puri V., Reese M.G.,

RA Reinert K., Remington K., Saunders R.D.C., Scheeler F., Shen H.,

RA Shue B.C., Siden-Kiamos I., Simpson M., Skupski M.P., Smith T.,

RA Spier E., Spradling A.C., Stapleton M., Strong R., Sun E.,

RA Svirskas R., Tector C., Turner R., Venter E., Wang A.H., Wang X.,

RA Wang Z.-Y., Wassarman D.A., Weinstock G.M., Weissenbach J.,

RA Williams S.M., Woodage T., Worley K.C., Wu D., Yang S., Yao Q.A.,

RA Ye J., Yeh R.-F., Zaveri J.S., Zhan M., Zhang G., Zhao Q., Zheng L.,

RA Zheng X.H., Zhong F.N., Zhong W., Zhou X., Zhu S., Zhu X., Smith H.O.,

RA Gibbs R.A., Myers E.W., Rubin G.M., Venter J.C.;

RT "The genome sequence of Drosophila melanogaster.";

RL Science 287:2185-2195(2000).

DR EMBL; AE003586; AAF51388.1; -.

DR HSSP; P56276; 1TLK.

DR FLYBASE; FBgn0031328; CG5423.

DR INTERPRO; IPR001777; -.

DR INTERPRO; IPR003006; -.

DR PFAM; PF00041; fn3; 3.

DR PFAM; PF00047; ig; 5.

FT NON_TER 1

859 AA; 9: 



916 MW; 5CFD69D984101BF8 CRC64; 



Query Match 15.0%; Score 1307.5; DB 5; Length 859;

Best Local Similarity 33.9%; Pred. No. 4.2e-79;

Matches 300; Conservative 152; Mismatches 345; Indels 87; Gaps 22;



Qy 68 PRIVEHPSDLIVSKGEPATLNCKAEGRPTPTIEWYKGGERVETDKDDPRSHRMLLPSGSL 127

iiiiiii i 'i : minimi mmm i ; i 1 in: im i

Db 1 PRIVEHPIDTTVPRHEPATLNCKAEGSPTPTIQWYKDGVPL---KILPGSHRITLPAGGL 57

Qy 128
Db 58
Qy 188
Db 91
Qy 245
Db 150
Qy 303
Db 210
Qy 363
Db 270
Qy 419
Db 328
Qy 478
Db 387
Qy 537

III:: lh ; 



immi I : :| I: I 
--CAVLRDEFRLEPQNTRIAQGDTA 90 



VMECQPPRGHPEPTISWKKDGSPLD- ■ -DKDERITIRGGKLMITYTRKSDAGKYVCVGIN 244 

::H III lll|::l!l I II I II : II I I |::| |:| |: I 



II III :M I :l :: I : I III . 1 1 : 1 I I 



:!;:!:: :!l I I hi l:MI I III! II |: :| : I 



II :|:'/ II: III 



1:1 II II I : 



■-LTITNV 418 

11:1 



11:1 ■! : II :: I : I llllhl IIMII 



I I I lll?l :ll: I m 



:| I 



I I Ml:!,: 



537 SAYIEVQEFGVPVQPPRPTDPNL- ■ 
Db 446
Qy 587
Db 497
Qy 646
Db 557
Qy 697
Db 617
Qy 757
Db 674
Qy 816
Db 732
Qy 872
Db 792



11:11: MM :|:: : : : :|: I : 

■ -LPTNPNIKFYRAPEQTKCPSAPGQPKILNATASALTIVWPTSDK 496 



:| =1 



I I 1:1:11! hill 



I I : :MI: I III h 



:| :| II II II: 



I II I Ml:: I I 



!LG NETRYHINKTVDGSTFSWIPFLVPGIRVSVEVftASTGAGSGVKSEPQFIQL 871 

I I II I hi : :::: I |: I : 1 1 1 : 1 I I |:| ::: 



h I : I II :h lh : 



RESULT 10 
Q9VQ10 

ID Q9VQ10 PRELIMINARY; PRT; 823 AA.

AC Q9VQ10;

DT 01-MAY-2000 (TrEMBLrel. 13, Created)

DT 01-MAY-2000 (TrEMBLrel. 13, Last sequence update)

DT 01-OCT-2000 (TrEMBLrel. 15, Last annotation update)

DE CG5481 PROTEIN.

GN CG5481.

OS Drosophila melanogaster (Fruit fly).

OC Eukaryota; Metazoa; Arthropoda; Tracheata; Hexapoda; Insecta; 

OC Pterygota; Neoptera; Endopterygota; Diptera; Brachycera; Muscomorpha; 

OC Ephydroidea; Drosophilidae; Drosophila. 

OX NCBI_TaxID=7227;

RN [1]

RP SEQUENCE FROM N.A.

RC STRAIN=BERKELEY;

RX MEDLINE=20196006; PubMed=10731132;

RA Adams M.D., Celniker S.E., Holt R.A., Evans C.A., Gocayne J.D.,

RA Amanatides P.G., Scherer S.E., Li P.W., Hoskins R.A., Galle R.F.,

RA George R.A., Lewis S.E., Richards S., Ashburner M., Henderson S.N.,

RA Sutton G.G., Wortman J.R., Yandell M.D., Zhang Q., Chen L.X.,

RA Brandon R.C., Rogers Y.-H.C., Blazej R.G., Champe M., Pfeiffer B.D.,

RA Wan K.H., Doyle C., Baxter E.G., Helt G., Nelson C.R., Miklos G.L.G.,

RA Abril J.F., Agbayani A., An H.-J., Andrews-Pfannkoch C., Baldwin D.,

RA Ballew R.M., Basu A., Baxendale J., Bayraktaroglu L., Beasley E.M.,

RA Beeson K.Y., Benos P.V., Berman B.P., Bhandari D., Bolshakov S.,

RA Borkova D., Botchan M.R., Bouck J., Brokstein P., Brottier P.,

RA Burtis K.C., Busam D.A., Butler H., Cadieu E., Center A., Chandra I.,

RA Cherry J.M., Cawley S., Dahlke C., Davenport L.B., Davies P.,

RA de Pablos B., Delcher A., Deng Z., Mays A.D., Dew I., Dietz S.M.,

RA Dodson K., Doup L.E., Downes M., Dugan-Rocha S., Dunkov B.C., Dunn P.,

RA Durbin K.J., Evangelista C.C., Ferraz C., Ferriera S., Fleischmann W.,

RA Fosler C., Gabrielian A.E., Garg N.S., Gelbart W.M., Glasser K.,

RA Glodek A., Gong P., Gorrell J.H., Gu Z., Guan P., Harris M.,

RA Harris N.L., Harvey D., Heiman T.J., Hernandez J.R., Houck J.,

RA Hostin D., Houston K.A., Howland T.J., Wei M.-H., Ibegwam C.,

RA Jalali M., Kalush F., Karpen G.H., Ke Z., Kennison J.A., Ketchum K.A.,

RA Kimmel B.E., Kodira C.D., Kraft C., Kravitz S., Kulp D., Lai Z.,

RA Lasko P., Lei Y., Levitsky A.A., Li J., Li Z., Liang Y., Lin X.,

RA Liu X., Mattei B., McIntosh T.C., McLeod M.P., McPherson D.,

RA Merkulov G., Milshina N.V., Mobarry C., Morris J., Moshrefi A.,

RA Mount S.M., Moy M., Murphy B., Murphy L., Muzny D.M., Nelson D.L.,

RA Nelson D.R., Nelson K.A., Nixon K., Nusskern D.R., Pacleb J.M.,

RA Palazzolo M., Pittman G.S., Pan S., Pollard J., Puri V., Reese M.G.,

RA Reinert K., Remington K., Saunders R.D.C., Scheeler F., Shen H.,

RA Shue B.C., Siden-Kiamos I., Simpson M., Skupski M.P., Smith T.,

RA Spier E., Spradling A.C., Stapleton M., Strong R., Sun E.,

RA Svirskas R., Tector C., Turner R., Venter E., Wang A.H., Wang X.,

RA Wang Z.-Y., Wassarman D.A., Weinstock G.M., Weissenbach J.,

RA Williams S.M., Woodage T., Worley K.C., Wu D., Yang S., Yao Q.A.,

RA Ye J., Yeh R.-F., Zaveri J.S., Zhan M., Zhang G., Zhao Q., Zheng L.,

RA Zheng X.H., Zhong F.N., Zhong W., Zhou X., Zhu S., Zhu X., Smith H.O.,

RA Gibbs R.A., Myers E.W., Rubin G.M., Venter J.C.;

RT "The genome sequence of Drosophila melanogaster."; 

RL Science 287:2185-2195(2000). 

DR EMBL; AE003586; AAF51373.1; -.

DR HSSP; P56276; 1TLK.

DR FLYBASE; FBgn0031341; CG5481.

DR INTERPRO; IPR001412; -.

DR INTERPRO; IPR001777; -.

DR INTERPRO; IPR003006; -.

DR PFAM; PF00041; fn3; 1.

DR PFAM; PF00047; ig; 5.

DR PRINTS; PR00014; FNTYPEIII.

DR PROSITE; PS00178; AA_TRNA_LIGASE_I; UNKNOWN_1.

SQ SEQUENCE 823 AA; 89715 MW; 36FC0B91F36F2F19 CRC64;



Query Match 13.9%; Score 1212; DB 5; Length 823;

Best Local Similarity 33.4%; Pred. No. 9.8e-73;

Matches 279; Conservative 127; Mismatches 273; Indels 156; Gaps 21;



I IM Ml llllhhl I ::|| |||::||:| III:::



h I I I I hi I I I 1 1 : 1 : 1 1 llhl h: II II hill II



GHPEPTISWKKDGSPLD- - -DKDERITIRGGKLMITYTRKSDAGKYVCVGTNMVGERESE 252
I Ml M I : I : I |: : II : II I I |:|| hi I [:|| III



I I I II :: I I I I hi llhl I lh I :h I



:| :lh II III III hi II h III II lh:|::|:| :| I I



Qy 76
Db 2
Qy 136
Db 59
Qy 196
Db 118
Qy 253
Db 177
Qy 311
Db 236
Qy 371
Db 296
Qy 427
Db 353
Qy 486
Db 412
Qy 544
Db 472
Qy 593
Db 523
Qy 653
Db 583
Qy 687



:hl hhli::| lh =11 I I hi 



--LTITNVQRSDVGYY 426 

hi III 



II lh 



II: : I: 



I I : ilhl IIMIII: I || I 



1:111 



I I IM :h::H h 



PPRPTDPNL IPSAPSKPEVTDVSRNTVTLSW-QPNLNSGATPT 592 

1. 1 : 1 1 : I I lh: : Mill : I M 

- -TPTNPNIKFFRAPELSTYPGPPGKPQMVEKGENSVTLSWTRSNKVGGSSLV 522 



IM |::|:| II :|:|: 



-LPT- 

h 



■ -SQGVDHKQVQRE -LGNAVLHLHNPTVL 686 

: hi : : I h I I :|: 



1:1::: I I : :|::|: : I

Db 643 DSTSMKLTWQVCNRLTDGSIAAPHSIMRHLIRSASFLMQIINGKYVEGFYVYARQLPNP 702 

Qy 713 PSGANHGESDWLVFEVRTP- 731 

: II:: I 

Db 703 IVNNPAPVTSHTNPLLGSTSTSASASASASALISTKPNIAAAGKRDGETNQSGGGAPTPL 762 

Qy 732 ARNSVVIPDLRRGVNYEIKARPFFNEFQGADSEIKFAKTLEE 773 

:| I I : II II: :| I : hill: 
Db 763 NTRYRMLTILNGGGASSCTITGLVQYTLYEFFIVPFYKSVEGKPSNSRIARTLED 817 



RESULT 11
O43608

ID O43608 PRELIMINARY; PRT; 285 AA.
AC O43608;

DT 01-JUN-1998 (TrEMBLrel. 06, Created)

DT 01-JUN-1998 (TrEMBLrel. 06, Last sequence update)
DT 01-MAY-2000 (TrEMBLrel. 13, Last annotation update)
DE ROUNDABOUT 2 (FRAGMENT).

OS Homo sapiens (Human).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
OX NCBI_TaxID=9606;
RN [1]

RP SEQUENCE FROM N.A.
RX MEDLINE=98117249; PubMed=9458045;

RA Kidd T., Brose K., Mitchell K.J., Fetter R.D., Tessier-Lavigne M.,
RA Goodman C.S., Tear G.;

RT "Roundabout controls axon crossing of the CNS midline and defines a

RT novel subfamily of evolutionarily conserved guidance receptors.";

RL Cell 92:205-215(1998).

DR EMBL; AF040991; AAC39576.1; -.

DR INTERPRO; IPR001777; -.

DR INTERPRO; IPR003006; -.

DR PFAM; PF00041; fn3; 1.

DR PFAM; PF00047; ig; 2.

FT NON_TER 1 1

FT NON_TER 285 285

SQ SEQUENCE 285 AA; 30606 MW; 05DF916A3DBA96C6 CRC64;



Query Match 10.0%; Score 873; DB 4; Length 285;

Best Local Similarity 59.1%; Pred. No. 1e-50;
Matches 165; Conservative 40; Mismatches 72; Indels 2;



Qy 360 QWALGRTVTFQCEATGNPQPAIFWRREGSQNLLFSYQPPQSSSRFSVSQTGDLTITNVQ 419

' 1:11 llllll II llllhlMIIIMIII II I :|| I'll ||||||||:| 
Db 1 QIVAQGRTVTFPCETKGNPQPAVFWQKEGSQNLLFPNQPQQPNSRCSVSPTGDLTITNIQ 60 

Qy 420 RSDVGYYICQTLNVAGSIITKAYLEVTDVIADRPPPVIRQGPVNQTVAVDGTFVLSCVAT 479

III llllll I Mill: II llllll: !IM: IN MINI :| I II 
Db 61 RSDAGYYICQALTVAGSILAKAQLEVTDVLTDRPPPIILQGPANQTIAVDGTALLKCKAT 120 

Qy 480 GSPVPTILWRKDGVLVSTQDSRIKQLENGVLQIRYAKLGDTGRYTCIASTPSGEATWSAY 539

1:1 I 1:1 :| I II III: :: III MM:: ||||:||| 
Db 121 GDPLPVISWLKEGFTFPGRDPRATIQEQGTLQIKNLRISDTGTYTCVATSSSGEASWSAV 180 

Qy 540 IEVQEFGVPVQPPRPTDPNLIPSAPSKPEVTDVSRNTVTLSWQPNLNSGATPTSYIIEAF 599 

::| I I : : I : :| 1 1 1 1 : 1 1 1 1 :: 1 : 1 1 1 1 1 1 1 ::llllll 
Db 181 LDVTESGATIS--RNYDLSDLPGPPSRPQVTDVTRNSVTLSWQPGTPGTLPASAYIIEAF 238 

Qy 600 SHASGSSWQTVAENVKTETSAIKGLKPNAIYLFLVRAAN 638

I : Mill! :||| ::||:|| MIMII I 
Db 239 SQSVSNSWQTVANHVKTTLYTVRGLRPNTIYLFMVRAIN 277 



RESULT 12 
P91572 

ID P91572 PRELIMINARY; PRT; 423 AA.
AC P91572; 

DT 01-MAY-1997 (TrEMBLrel. 03, Created) 



DT 01-MAY-1997 (TrEMBLrel. 03, Last sequence update) 

DT 01-MAY-2000 (TrEMBLrel. 13, Last annotation update) 

DE SIMILAR TO THE IMMUNOGLOBULIN SUPERFAMILY. 

GN ZK377.3. 

OS Caenorhabditis elegans. 

OC Eukaryota; Metazoa; Nematoda; Chromadorea; Rhabditida; Rhabditoidea; 

OC Rhabditidae; Peloderinae; Caenorhabditis. 

OX NCBI_TaxID=6239;

RN [1]

RP SEQUENCE FROM N.A.

RC STRAIN=BRISTOL N2;

RX MEDLINE=94150718; PubMed=7906398;

RA Wilson R., Ainscough R., Anderson K., Baynes C., Berks M.,

RA Bonfield J., Burton J., Connell M., Copsey T., Cooper J., Coulson A.,

RA Craxton M., Dear S., Du Z., Durbin R., Favello A., Fulton L.,

RA Gardner A., Green P., Hawkins T., Hillier L., Jier M., Johnston L.,

RA Jones M., Kershaw J., Kirsten J., Laister N., Latreille P.,

RA Lightning J., Lloyd C., Mcmurray A., Mortimore B., O'Callaghan M.,

RA Parsons J., Percy C., Rifken L., Roopra A., Saunders D., Shownkeen R.,

RA Smaldon N., Smith A., Sonnhammer E., Staden R., Sulston J.,

RA Thierry-Mieg J., Thomas K., Vaudin M., Vaughan R., Waterston R.,

RA Watson A., Weinstock L., Wilkinson-Sproat J., Wohldman P.;

RT "2.2 Mb of contiguous nucleotide sequence from chromosome III of C.

RT elegans.";

RL Nature 368:32-33(1994).

RN [2]

RP SEQUENCE FROM N.A.

RC STRAIN=BRISTOL N2;

RA Nhan M., Hawkins J.;

RL Submitted (FEB-1997) to the EMBL/GenBank/DDBJ databases.

RN [3]

RP SEQUENCE FROM N.A.

RC STRAIN=BRISTOL N2;

RA Waterston R.;

RL Submitted (FEB-1997) to the EMBL/GenBank/DDBJ databases.

RN [4]

RP SEQUENCE FROM N.A.

RC STRAIN=BRISTOL N2;

RA Waterston R.;

RL Submitted (APR-1997) to the EMBL/GenBank/DDBJ databases.

DR EMBL; U88183; AAB52658.1; -.

DR HSSP; P56276; 1TLK.

DR INTERPRO; IPR003006; -. 

DR PFAM; PF00047; ig; 4. 

SQ SEQUENCE 423 AA; 46544 MW; EB4530DB6BD575E5 CRC64; 



Query Match 9.8%; Score 852; DB 5; Length 423;

Best Local Similarity 46.1%; Pred. No. 4.7e-49;

Matches 177; Conservative 55; Mismatches 144; Indels 8; Gaps

Qy 68 PRIVEHPSDLIVSKGEPATLNCKAEGRPTPTIEWYKGGERVETDKDDPRSHRMLLPSGSL 127

I Mil l::IM llllll I: I I III I: I |:|: lll::| :||| 
Db 30 PVIIEHPIDVWSRGSPATLNCGAR-PSTAKITWYRDGQPVITNREQVNSHRIVLDTGSL 88 

Qy 128 FFLRIVHGRKSR-PDEGVYVCVARNYLGEAVSHNASLEVAILRDDFRQNPSDVMVAVGEP 186

I :: I: \: III III I II h ll::|:||:||| I I II 
Db 89 FLLKVNSGKNGKDSDAGAYYCVASNEHGEVKSNEGSLKLAMLREDFRVRPRTVQALGGEM 148 

Qy 187 AVMECQPPRGHPEPTISWKKDGSPLDDKD-ERITIRG-GKLMITYTRKSDAGKYVCVGTN 244

11:11 MIUII : 1 1 : 1 1 I :l II: I 1:1 :||:| I I I 
Db 149 AVLECSPPRGFPEPWSWRRDDKELRIQDMPRYTLHSDGNLIIDPVDRSDSGTYQCVANN 208 

Qy 245 MVGERESEVAELTVLERPSFVKRPSNLAVTVDDSAEFKCEARGDPVPTVRWRKDDGELPK 304 

lllll I 1,1:1 1:1 I : I :: I I : I I III I : |:: : :| 
Db 209 MVGERVSNPARLSVFEKPKFEQEPRDMTVDVGAAVLFDCRVTGDPQPQITWRRKNEPMPV 268 

Qy 305 SR-YEIRDDHTLKIRKVTAGDMGSYTCVAENMVGKAEASATLTVQEPPHFWKPRDQWA 363

: I :h M :| I I I I I I I |||| I II II I II II I 
Db 269 TRAYIAKDNRGLRIERVQPSDEGEYVCYARNPAGTLEASAHLRVQAPPSFQTKPADQSVP 328 

Qy 364 LGRTVTFQCEATGNPQPAIFWRREGSQNLLF-SYQPPQSSSRFSVSQTGDLTITNVQRSD 422 
I I H: Mllll :M 1:111 II : I II II III M I 
Db 329 AGGTATFECILVGQPSPAYFWSREGQQDLLFPSY--VSADGRTKVSPTGTLTIEEVRQVD 386

Qy 423 VGYYICQTLNVAGSIITKAYLEVT 446 

I 1:1 :| III ::H h I 
Db 387 EGAYVCAGMNSAGSSLSKAALKAT 410 



RESULT 13 
P97798 

ID P97798 PRELIMINARY; PRT; 1493 AA. 
AC P97798; 

DT 01-MAY-1997 (TrEMBLrel. 03, Created) 

DT 01-MAY-1997 (TrEMBLrel. 03, Last sequence update) 

DT 01-OCT-2000 (TrEMBLrel. 15, Last annotation update) 

DE NEOGENIN (NEOGENIN PROTEIN) . 

GN NEO1.

OS Mus musculus (Mouse).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
OX NCBI_TaxID=10090;
RN [1]

RP SEQUENCE FROM N.A.
RX MEDLINE=97407661; PubMed=9264410;
RA Keeling S.L., Gad J.M., Cooper H.M.;

RT "Mouse Neogenin, a DCC-like molecule, has four splice variants and is

RT expressed widely in the adult mouse and during embryogenesis.";

RL Oncogene 15:691-700(1997).

DR EMBL; Y09535; CAA70727.1; -.

DR HSSP; P02751; 1TTF.

DR MGD; MGI:1097159; Neo1.

DR INTERPRO; IPR000531; -.

DR INTERPRO; IPR001777; -.

DR INTERPRO; IPR003006; -.

DR PFAM; PF00041; fn3; 6.

DR PFAM; PF00047; ig; 4.

DR PRINTS; PR00014; FNTYPEIII.

DR PROSITE; PS00430; TONB_DEPENDENT_REC_1; UNKNOWN_1.

SQ SEQUENCE 1493 AA; 163159 MW; 441DE919D5E17C0E CRC64;



Query Match 8.7%; Score 760; DB 11; Length 1493;

Best Local Similarity 23.2%; Pred. No. 4.7e-42;

Matches 385; Conservative 220; Mismatches 608; Indels 446; Gaps 

Qy 70 IVEHPSDLIVSKGEPATLNCKAEGRPTPTIEWYKGGERVETDKDDPRSHRMLLPSGSLFF 129

:H I I : :| III I hi III I I : : II I III Nil 
Db 67 LVE-PVDTLSVRGSSVILNCSAYSEPSPNIEWKKDGTFLNLESDD—RRQLLPDGSLFI 122 

Qy 130 LRIVHGRKSRPDEGVYVCVAR-NYLGEAVSHNASLEVAILRDDFRQNPSDVMVAVGEPAV 188 
fk :ll : ::llll i 1 1 1 : 1 1 1 1 I I I II I I MM: 

Db 123 SNWHSKHNKPDEGFYQCVATVDNLGTIVSRTAKLTVAGL-PRFTSQPEPSSVYVGNSAI 181

Qy 189 MECQPPRGHPEPTISWKKDGSPLDDDERITIRGGKLMITYTRKSDAGKYVCVGTNMVGE 248 

: I: I : II \ • > \ hh : I I I I: : 
Db 182 LNCE-VNADLVPFVRWEQNRQPLLLDDRIVKLPSGTLVISNATEGDGGLYRCIVESGGPP 240 

Qy 249 RESEVAELTVLERPS FVKRPSNLAVTVDDSAEFKCEARGDPVPTVRKRKDDGEL 302 

: h III II: I I: III:: II I' III III h: I 
Db 241 RFSDEAELKVLQDPEEIVDLVFLMRPSSMMKVTGQSAVLPCWSGLPAPWRWMKNEEVL 300 

Qy 303 - - -PKSRYEIRDDHTLKIRKVTAGDMGSYTCVAENMVGRAEASATLTVQEPPHFWKPRD 359 

I : hi II I hi hhl II I llll II h :l : 
Db 301 DTESSGRLVLLAGGCLEISDVTEDDAGTYFCIADNGNKTVEAQAELTVQVPPGFLKQPAN 360 

Qy 360 QWALGRTVTFQCEATGNPQPAIFWRREGSQNLLFSYQPPQSSSRFSVSQTGDLTITNVQ 419 

: hll II I I : I : I : | | : : :| : : 

Db 361 IYAHESMDIVFECEVTGKPTPTVKWVKNGDWI PSDNFKIVKEHNLQVLGLV 412 

Qy 420 RSDVGYYICQTLNVAGSIITKAYLEVT-DVIADRPPPVIRQGPVNQTVAVDGTFVLSCV 477 

:|| hi I I I: I I : II II I I 

Db 413 KSDEGFYQCIAENDVGNAQAGAQLIILEHDVAIPTLPP TSLTSATTDHLAP 463 

Qy 478 ATGSPVPTILWRKDGVLVSTQDSRIKQLENGVLQIRYAKLGDTGRYTCIASTPSGE-ATW 536 



II hi: i llll 
464 ATTGPLPSAPRDWASLVST • ■ 



I: II : II I h h 
■ -RFIKL TWRTPASDPHGDNLTY 504 



Qy 537 SAYIEVQEFGVPVQPPRPTDPNLIPSAPSKPEVTDVSRNTVTLSWQPNLNSGATPTSYII 596

I : : II : I I I : :|| 

Db 505 SVFYTKE--GVDRERVENT SQPGEMQVT 530 

Qy 597 EAFSHASGSSWQTVAENVKTETSAIKGLKPNAIYLFLVRAANAYGISDPSQISDPVKTQD 655

I: I I :hl I I I :| I I hll 
Db 531 -, IQNLMPATVYIFKVMAQNKHG-SGESSAPLRVETQP 565 

Qy 657 VLPTSQGVDHKQVQRELGNAVLHLHNPTVLSSSSIEVHW-TVDQQSQYIQGYKILYRPSG 715 

: I: I : :|| II I I I : II lh I I 

Db 566 EV ,-QLPGPAPNIRAYATSPT SITVTWETPLSGNGEIQNYKLYYMEKG 611 

Qy 716 ANHGESDWLVFEVRTPAKNSWIPDLRKGVNYEIKARPFFNEFQGADSEIKFAKTLEEAP 775 

: II I : :| | |:| | : : | :: :|| : I 
Db 612 TDR-EQDIDV* SSHSYT INGLKKYT EYSFRWAYNKHG PGVSTQDVAVRTLSDVP 664 

Qy 776 SAPPQGVTVSKNDGNGTAILVSWQPPPEDTQNGMVQEYKVWCLGNETRYHINKT-VDGST 834 

II II ::: > I :|:: llll llll : lh : : :| I h 

Db 665 SAAPQNLSLEVR--NSKSIVIHWQPPSSTTQNGQITGYKIRYRKASRKSDVTETLVTGTQ 722 

Qy 835 FSWIPFLVPGIRYSVEVAASTGAGSGVKSE PQFIQLDAHGNPV 878 

I :| I I I: III I hi :: h : I h 

Db 723 LSQLIEGLDRGTEYNFRVAALTVNGTGPATDWLSAETFESDLDETRVPE-VPSSLHVRPL 781 

Qy 879 SPEDQVSLAQQISDWKQPAFIAGIGAACWI ILMVFS IWLYRHRKKRNGLT 929 

I: : lh I III: : I hi 

Db 782 VTSIWSWTPPENQ NIWRGYAIGYGIGSPHAQTIKVD — YKQR 823 

Qy 930 STYAGIRKV-PSFTFTPTV-TYQRGGEAV — SSGGRP GLLNISEP-AAQPW 975 

I I : II : I: : II : h II lh I I 

824 "YYTIENLDPSSHYVITLKAFNNVGEGIPLYESAVTRPHTDTSEVDLFVINAPYTPVPD 881 



Db 



Qy 976 LADTWPNTG - ■ - - - NNHNDCSISCCTAGNGNSDSNLTTYSRPADCI ANYNNQLDNKQTNL 1030 

I l'. :h I: :h:| : : I : I :|| 

Db 882 PTPMMPPVGVQASILSHDTIRITW ADNSLPKHQKITD- -SRYYTV* -RWKTN- 929 

Qy 1031 MLPESTVYGDVDLSNKINEMKTFNSPN LKDGRFVNPSGQPTPYATTQLIQSN 1082 

:| :l I :'■: : :: : I II : II : II :|: :: 

Db 930 -IPANTKYKNAN-ATTLSYLVTGLKPNTLYEFSVMVTKGRRSSTWSMTAHGATFELVPTS 987 

Qy 1083 LSNNMNNGSGDSGEK — HWKPLGQQKQEV APVQYNIVEQ— NKLN 1123 

:: h; : :|:| : :: | : ::| hi 

Db 988 PPKDVTWSKEGKPRTIIVNWQPPSEANGRITGYIIYYSTDVNAEIHDWVIEPWGNRLT 1047 

Qy 1124 KDYRANDTVPPTIPYNQSYDQNTGG SYNSSDR- -GSSTSGSQGHKKG 1168 

:: : -I I : :h I : 1 1 1 : II I II 

Db 1048 - -HQIQELTLDTPYYFKIQARNSKGMGPMSEAVQFRTPKADSSDKMPNDQALGSAG- -KG 1103 

Qy 1169 ARTPKVPKQGGMNWADLLPPPPAHPPPHSNSE- - -EYNI 1204 

:| I : .- M II II : : h 
Db 1104 SRLPDL- — T--GSDYKPPMSGSNSPHGSPTSPLDSNMLLVIIVSVGVITIVWWIAV 1156 

Qy 1205 SVDESYDQEMPC - PVPPARMYLQQDELEEEEDERG PTP - PVR 1244 

II: h : I I I ::: : II : :: I I II 
Db 1157 FCTRRTTSHQKKKRAACKSVNGSHKYKGNCKDVKPPDLWIHHERLELKPIDKSPDPNPV- 1215 

Qy 1245 GAASSPAAVSYSHQSTATLTPSPQEELQPMLQDCPEETGHMQ HQPD 1290 

III ::: | : : : I : h : 
Db 1216 MTDTPIPRNSQDITP-VDNSMDSNIHQRRNSYRGHESEDSMSTLAG 1260 

Qy 1291 -RRRQPVSPPP-— PPRPISPPHTYGYISGPLVSDMDTDAPEEEEDEADMEVAKMQTRR 1345 

I :l :i 1 1 : 1 : I hill : : 
Db 1261 RRGMRPKMMMPFDSQPPQPVISAH PIHS— LDNPHHHFHSSSL 1301 

Qy 1346 LLLRGLEQTPA SSVGDLESSVTGSMINGWGSASEEDNISSGRSSVSSSDGSFF 1398 

:|| II : :|: h : I II: :|| : 
Db 1302 ASPARSHLYHPSSPWPIGTSM--SLSDRANSTESVRNTPSTDTMPASSSQTCC 1352 

Qy 1399 TDADFAQAVAAAAEYAGLKVARRQMQDAAGRRHFHASQCPRPTSPVSTDSNMSAAVMQKT 1458 

II : I.:: I :| I :| I lh I 
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1353 TDHQDPEG - ATSSSY — LASSQEED" 



■-SGQSLPTAHV- 



Qy 1459 RPARRLRHQPGHLRRETYTDDLPPPPVPP- - -PAIRSPTAQSKTQLE 1502 

II: II III II II: I h II 

Db 1385 RPSHPLK *>- - -SFAVPAIPPPGPPLYDPALPSTPLLSQQALEPSTFHSVKTASIG 1435 

Qy 1503 VRPVWPKLPSMDARTDRSSDRRGSSYKGREV 1534 

Mill I : I I : III: |: 
Db 1436 TLGRSRPPMPVWPSAPEVQ * ETT RMLEDSESS YEPDEl 1473 



RESULT 14 
Q9QY38 

ID Q9QY38 PRELIMINARY; PRT; 1259 AA. 

AC Q9QY38; 

DT 01-MAY-2000 (TrEMBLrel. 13, Created) 

DT 01-MAY-2000 (TrEMBLrel. 13, Last sequence update) 
DT 01-OCT-2000 (TrEMBLrel. 15, Last annotation update) 

DE NEURAL CELL ADHESION MOLECULE L1. 

GN L1CAM. 

OS Mus musculus (Mouse). 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus. 

OX NCBI_TaxID=10090; 

RN [1] 

RP SEQUENCE FROM N.A. 

RA Platzer M., Brenner V., Reichwald K., Wiehe T., Oksche A., 

RA Rosenthal A.; 

RT "Comparative sequence analysis of the mouse L1cam locus and the 

RT corresponding region of human Xq28."; 

RL Submitted (MAR-1999) to the EMBL/GenBank/DDBJ databases. 

DR EMBL; AF133093; AAF22153.1; -. 

DR HSSP; P20241; 1CFB. 

DR INTERPRO; IPR001777; -. 

DR INTERPRO; IPR003006; -. 

DR PFAM; PF00041; fn3; 4. 

DR PFAM; PF00047; ig; 6. 

SQ SEQUENCE 1259 AA; 140916 MW; 25743C039892A22F CRC64; 

Query Match 8.7%; Score 758; DB 11; Length 1259; 

Best Local Similarity 23.2%; Pred. No. 4.9e-42; 

Matches 305; Conservative 175; Mismatches 516; Indels 320; Gaps 49; 

Qy 56 YTGSRLRQEDFPPRIVEH-PSDLIVSKGEPATLNCKAEGRPTPTIEWYKGGERVETDR-- 112 
A I I :: II I I I 1:1 : :| 1:1 III I il I : : 

Db 26 YRGHHVLE- - -PPVITEQSPRRLVVFPTDDISLKCEARGRPQVEFRWTKDGIHFRPREEL 82 

Qy 113 DDPRSHRMLLPSGSLFFLRIVHGRKSRPDEGVYVCVARNYLGEAVSHNASLEVAI 167 

: I I : : II :|:| I I I llfbll h : 

Db 83 GVWHEAPYSGSFTIEGNNSFAQRF QG I YRCYASNKLGTAMSH — EIQL 129 

Qy 168 LRDDFRQNPSD — VMVAVGEPAVMECQPPRGHPEPTISWRRDGSPLDDRDERITI-RG 222 

: : : I : I I II h I II I I ;|||::: : 

Db 130 VAEGAPKWPKETVKPVEVEEGESVVLPCNPPPSAAPLRIYWMNSRILHIRQDERVSMGQN 189 

Qy 223 G KLMIT YTRKSD-AG KYVCVGTNMVGER - • •ESEVAELTVLERPSFVKRPSNLAVTVDDS 278 

II II hi : I I : I :| I hi I : I 
Db 190 GDLYFANVLTSDNHSDYIC-NAHFPGTRTIIQKEPIDLRVKPTNSMIDRKPRLLFPTNSS 248 



Qy 279 AE— FKCEARGDPVPTVRWRKDDGELPKSRYEIRDDH--TLKIRKVTAGDMG 326 

: :| I I I ih:l II:: I I I 

Db 249 SRLVALQGQSLILECIAEGFPTPTIKWLHPSDPMPTDRV-IYQNHNKTLQLLNVGEEDDG 307 

Qy 327 SYTCVAENMVGRAEASATLTVQEPPHFWKPRDQWALGRTVTFQCEATGNPQPAIFWRR 386 

llhlll :l I : :||: I::: lh : I I hi III I II 
Db 308 EYTCLAENSLGSARHAYYVTVEAAPYWLQKPQSHLYGPGETARLDCQVQGRPQPEITWRI 367 

Qy 387 EGSQNLLFSYQPPQSSSRFSVSQTGDLTITNVQRSDVGYYICQTLNVAGSIITRAYLEVT 446 

I I : :: : I I I ::lll II h I I :: lh I 
Db 368 NG MSMETVNKDQKYRIEQ-GSLILSNVQPSDTMVTQCEARNQHGLLLANAYIYVV 421 

Qy 447 DVIADRPPPVIRQGPVNQT-VAVDG-TFVLSCVATGSPVPTILWRKDGVLVSTQDSRIRQ 504 



: I :: : III :lhl I I I I hllh: I : III 
Db 422 QL- • ■ -PARILTRD--NQTYMAVEGSTAYLLCRAFGAPVPSVQWLDEEGTTVLQDERFFP 475 

Qy 505 LENGVLQIRYARLGDTGRYTCIASTPSGEATWSAYIEVQEFGVPVQPPRPT 555 

II I II : lllll lh II ::hl I II 
Db 476 YANGTLS IRDLQANDTGRY FCQAANDQNNVT I LANLQVKEATQ ITQGPRSAI EKKGARVT 535 



Qy 556 DPNL 559 

Ihl 

Db 536 FTCQASFDPSLQASITWRGDGRDLQERGDSDKYFIEDGRLVIQSLDYSDQGNYSCVASTE 595 

Qy 560 .- r - - IPSAPSKPEVTD — VSRNTVTLSWQPNLNSGATPTSYI IE - AFSH 601 

. I h = l : :: I Ml I : : III 
Db 596 LDEVESRAQLLWGSPGPVPHLELSDRHLLRQSQVHLSWSPAEDHNSPIERYDIEFEDRE 655 

Qy 602 ASGSSWQTVAENVRTETSAIRGLRPNAIYLFLVRAANAYGISDPSQISDPVKTQDVLPTS 661 

: I :: '■■ HI II lllll II :|| :|: I I : I 
Db 656 MAPEKWFSLGKVPGNQTSTTLRLSPYVHYTFRVTAINKYGPGEPSPVSETWTPEAAPEK 715 

Qy 662 QGVDHRQVQRELGNAVLHLHNPTVLSSSSIEVHWTVDQQSQYIQGYKILYRPSGANHGES 721 

II : Mh : : Ml : II h= Ml I : 

Db 716 NPVDVRGEGNETNNMVI TWKPLRW-MDWNAPQIQ - YRVQWRPQGK - - -QE 760 

Qy 722 DWLVFEVRTPARNSVVIPDLRKGVNYEIKARPFFNEFQGADSEIRFARTLEEAPSAPP-- 779 

I I I i :|: : I Nil : |: :| : :: : h I I 
Db 761 TWREQTVSDP- - •FLWSNTSTFVPYEIRVQAVNNQGRGPEPQVTIGYSGEDYPQVSPEL 817 

Qy 780 QGVTVSRNDGNGTAILVSWQPPPEDTQNGMVQEYRV- -WCLGNETRY- - -HINRT - - -VD 831 

: :h I : :|l hi I :: I I I h: Ihh I 
Db 818 EDITIF — NSSTVLVRWRPVDLAQVRGHLKGYNVTYWWRGSQRRHSRRHIHRSHIWP 873 

Qy 832 GSTFSWIPFLVPGIRYSVEVAASTGAGSGVRSE PQFIQLDAHG — 875 

:||:: (I I III I I I I II h : h 

Db 874 ANTTSAILSGLRPYSSYHVEVQAFNGRGLGPASEWTFSTPEGVPGHPEALHLECQSDTSL 933 

Qy 876 NPVSPEDQVSLAQQ I SDWKQPAF IAG IG AACWI I LHVF 914 

:|| I : I :ll 

Db 934 LLHWQPPLSHNGVLTGYLLSYHPVEGESKEQLFFNLSD 971 

Qy 915 SIWLYRHRRKRNGLTSTYAGIRKVPSFTFTPTVTYQRG-GEAV SSGGRPGLLN 966 

::: lh :: : I I hi III: : hi I 
Db 972 PELRTHNLTNLNPDLQ — YRFQLQATTQQGPGEAIVREGGTMALFGRPDFGN 1021 

Qy 967 ISEPAAQPWLADTW-PNTGNNHNDCSISCCTAGNGNSDSNLTTYSRPADCIANYNN — 1021 

II I : : ■ :| I I h ; :: :| ;|| 

Db 1022 ISATAGENYSyVSWVPRKG-— QCNFRFHILFKALPEGKVSPDHQPQPQYVSYNQSSYT 1077 

Qy 1022 — QLDNRQTNLMLPESTVYGDVDLSN RINEMRTFNS 1055 

I I I r :: | : :|: ::; :| | 

Db 1078 QWNLQPDTKYEIHLIKERVLLHHLDVRTNGTGPVRVSTTGSFASEGWFIAFVSAIILLLL 1137 

Qy 1056 PNLRDGRF VNPSGQPTPYATTQLIQSNLSNNMNNGSGDSGEK- 1097 

fl h: h :| I :| hi I I 

Db 1138 ILLILCFIKR5KGGRYSVKDKEDTQVDSEARPMKDETFGEYRSLESDNEEKAFGSSQPSL 1197 

Qy 1098 -HWRPLGQQKQEV APVQYN — IVEQNRLNKDYRA- - -NDTVPPTIPYN 1139 

llll : Ihl : I h I lh M I 

Db 1198 NGDIKPLGSDDSLADYGGSVDVQFNEDGSFIGQYSGKREKEAAGGNDSSGATSPIN 1253 



RESULT 15 
Q90610 
ID Q90610 PRELIMINARY; PRT; 1443 AA. 
AC Q90610; 
DT 01-NOV-1996 (TrEMBLrel. 01, Created) 
DT 01-NOV-1996 (TrEMBLrel. 01, Last sequence update) 
DT 01-OCT-2000 (TrEMBLrel. 15, Last annotation update) 
DE NEOGENIN (FRAGMENT). 
OS Gallus gallus (Chicken). 
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 
OC Archosauria; Aves; Neognathae; Galliformes; Phasianidae; Phasianinae; 
OC Gallus. 
OX NCBI_TaxID=9031; 
RN [1] 
RP SEQUENCE FROM N.A. 
RC STRAIN=WHITE LEGHORN; TISSUE=BRAIN; 
RX MEDLINE=95105243; PubMed=7806578; 
RA Vielmetter J., Roman J.M., Dreyer W.J.; 
RT "Neogenin, an avian cell surface protein expressed during terminal 
RT neuronal differentiation, is closely related to the human tumor 
RT suppressor molecule deleted in colorectal cancer."; 
RL J. Cell Biol. 127:2009-2020(1994). 
DR EMBL; U07644; AAC59662.1; 
DR HSSP; P11276; 2MFN. 
DR INTERPRO; IPR001777; -. 
DR INTERPRO; IPR003006; -. 
DR PFAM; PF00041; fn3; 6. 
DR PFAM; PF00047; ig; 4. 
DR PRINTS; PR00014; FNTYPEIII. 
FT NON_TER 1 1 
SQ SEQUENCE 1443 AA; 158050 MW; 558C6795579C0E26 CRC64; 



Query Match 8.7%; Score 757.5; DB 13; Length 1443; 

Best Local Similarity 23.2%; Pred. No. 6.6e-42; 

Matches 378; Conservative 207; Mismatches 681; Indels 361; Gaps 64; 

Qy 57 TGSRLRQEDFPP-RIVEHPSDLIVSRGEPATLNCRAEGRPTPTIEWYRGGERVETDRDDP 115 

MM II : I I:: :| :|| : I III I I : II 
Db 9 TGSWR--TFTPFYFLVEPMDILSVRGASVIMNCSSYCETPPKIEWKKDGTLLNLVSDD- 65 

Qy 116 RSHRMLLPSGSLFFLRIVHGRKSRPDEGVYVCVAR-NYLGEAVSHNASLEVAILRDDFRQ 174 

I III III :M : Mill I III II II I I II I I 
Db 66 --RRQLLPDGSLLINSWHSKHNKPDEGYYQCVATVESLGSIVSRTAKLTVAGL-PRFTS 122 

Qy 175 NPSDVMVAVGEPAVMECQPPRGHPEPT ISWKKDGSPLDDKDERIT IRGGKLMITYTRKSD 234 

I II I:: I: I : IM II I : I hi :l 

Db 123 QPELSSVYKGNSAILNCE-VNVDLAPFVRWEQDRQPLSLDDRVFKLPSGALLIGNATDTD 181 

Qy 235 AGKYVCVGTNMVGERESEVAELTVLERPS FVKRPSNLAVTVDDSAEFRCEARGD 288 

MM: : || IN :| I MM! : I I I I I 

Db 182 GGFYRCVIESGGTPKYSEEAELKILPDPEEPQSLVFVRQPSSLTKVTGQNAVFPCVAGGF 241 

Qy 289 PVPTVRWRKDDGEL---PKSRYEIRDDHTLKIIiLvTAGDMGSYTCVAENMVGKAEASATL 345 

I I III I: II I: :l :| I II l:|:llh|:| II I I 
Db 242 PTPYVRWTKNGEELITEDSERFALRAGGSLLISDVTEEDVGTYTCIADNENETIEAQAEL 301 

Qy 346 TVQEPPHFWKPRDQWALGRTVTFQCEATGNPQPAIFWRREGSQNLLFSYQPPQSSSRF 405 

II II I: :| : : hll II I I : I : I : I I 

Db 302 AVQVPPEFLKRPANIYAHESMDIVFECEVTGKPTPTVKWVKNGDWIPSDY F 353 

Qy 406 SVSQTGDLTITNVQRSDVGYYICQTLNVAGSIITKAYLEVTDVIADRP—PPVIRQGPVN 463 
: : :| : : :|| |:|| | |: | | : |: II | 

Db 354 KIVKEHNLQVLGLVKSDEGFYQCIAENDVGNAQAGAQLIILDLDVAIPTLPPTSLTSATN 413 

Qy 464 QTVAVDGTFVLSCVATGSPVPTILWRKDGVLVSTQDSRIKQLENGVLQIRYAKLGDTGRY 523 

:| II IM IHI |: :| 
Db 414 DHLA PATTGPLPTAPRDWATLVST RFIRL 443 

Qy 524 TCIASTPSGEATWSAYIEVQEFGVPVQPPRPTDPNLIPSAPSKPEVTDVSRNTVTLSWQP 583 

II II I: II I I ::| I : :| 

Db 444 TWR TPVSDPQ--GDNLTYSIFYTKE-GINRERVENTSRP 479 

Qy 584 NLNSGATPTSYIIEAFSHASGSSWQTVAENVKTETSA-IKGLKPNAIYLFLVRAANAYGI 642 

II I: II :|:l MM 
Db 480 G ETQVMIQNLMPETVYVFRWAQNRHGH 507 

Qy 643 SDPSQISDPVKTQDVLPTSQGVDHKQVQRELGNAVLHLHNPTVLSSSSIEVHW-TVDQQS 701 

M 1:1 Mil: I : :ll IM I I : 
Db 508 GESSA- • -PLK- - - -VATQPEV- - -QLPGPAPNIRAYAGSPT SVTVTWETPLSGN 552 

Qy 702 QYIQGYKILYRPSGANHGESDWLVFEVRTPAKNSWIPDLRKGVNYEIRARPFFNEFQGA 761 

II Ih I I M I I IIIMI I : I 
Db 553 GEIQNYKLYYMEKGQD ■ SEQDVDV AGLSYT ITGLRRYTEYSFRWAYNKHGPGV 605 

Qy 762 DSEIKFARTLEEAPSAPPQGVTVSRNDGNGTAILVSWQPPPEDTQNGMVQEYRVCLGNE 821 



:: :IM III II :|: I :|:: Mill I :| : Ih 
Db 606 STQDVWRTLSDVPSAAPQNLTLEAR--NSKSIMLHWQPPPAGTHSGQITGYKIRYRKVS 663 

Qy 822 TRYHINKTVDGSTFSWIPFLVPGIRYSVEVAASTGAGSG VRSEPQFIQLD — 872 

: : M |: :| I I |: :|| I |:| I :| II 
Db 664 RRSDVTESVGGTQLFQLIEGLERGTEYNFRIAAMTVNGTGPATDWVSAETFESDLDESRV 723 

Qy 873 AHGNPV SPEDQVSLAQQISDWKQPAFIAGIGAACWIILMVFSIW 917 

I I: 1 1 : r : II: I III: : I 
Db 724 PEVPSSLHVRPLVTSIWSWTPPENQ NIWRGYAIGYGIGSPHAQTIRVD— 773 

Qy 918 LYRHRKKRNGLTSTYAGIRKV-PSFTFTPTV-TYQRGGEAV SSGGRP GL 964 

IM • II Ml : h : II : I: II I 
Db 774 -YRQR YYTIENLDPSSHYVITLRAFNNVGEGIPLYESAVTRPHSDTSEVDL 823 

Qy 965 LNISEP-AAQPWLADTWPNTG NNHNDCSISCCTAGNGNSDSNLTTYSRPADCIAN 1018 

I: I i : I I :h I: MM : I I 

Db 824 FVINAPYTPVPDPSPMMPPVGVQASILSHDTIRITW ADNSLPKNQKITD-AR 874 

Qy 1019 YNNQLDNRQTNLMLPESTVYGDVDLSN RINEMKTFNSPNLRDGRFVNPSGQP 1070 

I :|l =1 :| I : : Ml: | | : : 

Db 875 YYTV- -RWRTN- -IPANTRYKTANATTLSYLVTGLRPNTLYEFSVMVTKGRR- • SSTWSM 928 

Qy 1071 TPYATT-QLIQSNLSNNMNNGSGDSGER— -HWKPLGQQRQEVAP-VQYNIVEQNRLNK 1124 

| : || :|: :: :: | : : :|:| : :: :| M 
Db 929 TAHGTTFELVPTSPPRDVTWSREGRPRTIIVNWQPPSEANGRITGYIIYYSTDVNAEIH 988 

Qy 1125 DYRANDTVPPTIPYN-QSYDQNTGGSYNSSDRGSSTSGSQGHRRGARTPKVPRQGGMNWA 1183 

I: | : : | :| : | | I Ml : 
Db 989 DWVIEPWGNKLTHQIQELTLDTPYYFRIQARNSRGMGPMSEAVQFRTPRAES S 1042 

Qy 1184 DLLPPP PAHPPPHSNSEEYNISVDESYDQEMPCP 1217 

Ml : I Ml I I M II 

Db 1043 DKMPNDQASGSAGKGSRPVDVGPDYKPPLSGSNSPHGSPTSPLDSNMLLVIIVSVGVni 1102 

Qy 1218 -VPPARMYLQQDELEEEEDERGPTPPVRGAASSPAAVSYSHQSTATLTPSPQEELQPML 1275 

I : " :| II h :: : I : 

Db 1103 VIWIVAVFCTRRTTSHQKRRRAACRSVNG SHRYRGNSRDVRPPDLWI — 1150 

Qy 1276 QDCPEETGHMQHQPDRRRQPVSPPPPPRPI— SPPHTYGYISGPLVSDMDTDAPE-— 1328 

I :|: I I II :l I: : IM : 

Db 1151 HHERLELRPIDRSPDPNPIMTDTPIPRNSQDITPVDNSMDSNIHQRRNS 1199 

Qy 1329 EEEDEADMEVARMQTRRLLLRGLEQTPASSV GDLES" 1364 

I IM : I :: : I I III 
Db 1200 YRGHESEDSMSTLAGRRGMRPRMMMPFDSQPPQPVISAHPIHSLDNPHHHFHSGSLASPT 1259 

Qy 1365 -SVTGSMINGW- •GSASEEDNISSGRSSVSSSDGSFFTDADFAQAVAAAAE- ■ • YAGLRV 1418 

I :: I I:: : :: II :: I Ml M :M 

Db 1260 RSYLHHQVSPWPVGTSMSHSDRANSTESVRNTPSSDTMPASSSQPCADHQDPDSSSGAYL 1319 

Qy 1419 ARRQMQDAAGRRHFHASQCPRPTSPVSTDSNMSAAVMQRTRPAKRLRHQPGHLRRETYTD 1478 

I Ml ; :: I :: M I I 

Db 1320 GSAQEEDAA QSLPTAHVRPSHPLRSFAVPAVPAAGSAYDP 1359 

Qy 1479 DLPPPPV- - PPPAIKSPTAQSRTQLEVR ■ • ■ PVWPRLPSMDARTDRSSDRKGS 1527 

II I: ' I ::l II I I Mill M I M I 

Db 1360 TLPSTPLLTQQAPSHPVHSVR • ■ TASIGTLGRTRPPMPVWPSAPDVQ • ETTRMLEDSES 1416 

Qy 1528 SYRGREV 1534 

II: I: 
Db 1417 SYEPDEL 1423 



Search completed: January 22, 2001, 12:54:03 
Job time: 2044 sec 
Best Available Copy 

Mon Jan 22 13:04:44 2001 us-09-540-245a-19.rag Page 1 



GenCore version 4.5 
Copyright (c) 1993 - 2000 Compugen Ltd. 

OM protein - protein search, using sw model 

Run on: January 22, 2001, 12:19:37 ; Search time 233.01 Seconds 
(without alignments) 
63.689 Million cell updates/sec 

Title: US-09-540-245A-19 
Perfect score: 2280 
Sequence: 1 QIVAQGRTVTFPCETKGNPQ TSAALSQSQRPRPTKKHKGG 434 

Scoring table: BLOSUM62 
Gapop 10.0 , Gapext 0.5 

Searched: 268485 seqs, 34193795 residues 

Total number of hits satisfying chosen parameters: 268485 

Minimum DB seq length: 0 
Maximum DB seq length: 2000000000 

Post-processing: Minimum Match 0% 
Maximum Match 100% 
Listing first 45 summaries 

Database : A_Geneseq_36: 
1: /SIDS1/gcgdata/geneseq/geneseqp/AA1980.DAT 
2: /SIDS1/gcgdata/geneseq/geneseqp/AA1981.DAT 
3: /SIDS1/gcgdata/geneseq/geneseqp/AA1982.DAT 
4: /SIDS1/gcgdata/geneseq/geneseqp/AA1983.DAT 
5: /SIDS1/gcgdata/geneseq/geneseqp/AA1984.DAT 
6: /SIDS1/gcgdata/geneseq/geneseqp/AA1985.DAT 
7: /SIDS1/gcgdata/geneseq/geneseqp/AA1986.DAT 
8: /SIDS1/gcgdata/geneseq/geneseqp/AA1987.DAT 
9: /SIDS1/gcgdata/geneseq/geneseqp/AA1988.DAT 
10: /SIDS1/gcgdata/geneseq/geneseqp/AA1989.DAT 
11: /SIDS1/gcgdata/geneseq/geneseqp/AA1990.DAT 
12: /SIDS1/gcgdata/geneseq/geneseqp/AA1991.DAT 
13: /SIDS1/gcgdata/geneseq/geneseqp/AA1992.DAT 
14: /SIDS1/gcgdata/geneseq/geneseqp/AA1993.DAT 
15: /SIDS1/gcgdata/geneseq/geneseqp/AA1994.DAT 
16: /SIDS1/gcgdata/geneseq/geneseqp/AA1995.DAT 
17: /SIDS1/gcgdata/geneseq/geneseqp/AA1996.DAT 
18: /SIDS1/gcgdata/geneseq/geneseqp/AA1997.DAT 
19: /SIDS1/gcgdata/geneseq/geneseqp/AA1998.DAT 
20: /SIDS1/gcgdata/geneseq/geneseqp/AA1999.DAT 
21: /SIDS1/gcgdata/geneseq/geneseqp/AA2000.DAT 



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 



Result          Query 
   No.   Score  Match  Length  DB  ID       Description 

     1    2276   99.8     434  20  Y13567   Human Robo 2 polyp 
     2    2276   99.8     434  20  Y08405   Human partial ROBO 
     3     914   40.1    1649  20  Y08404   Human ROBO1 protei 
     4     913   40.0    1651  20  Y13566   Human Robo 1 polyp 
     5   877.5   38.5     753  20  W83927   Human T85 protein. 
     6   561.5   24.6    1297  20  Y13565   C. elegans Robo po 
     7   561.5   24.6    1297  20  Y08403   C. elegans ROBO pr 
     8     545   23.9    1395  20  Y13563   Drosophila Robo 1 
     9     545   23.9    1395  20  Y08401   Drosophila sp. ROB 
    10   473.5   20.8    1380  20  Y08402   Drosophila sp. ROB 
    11   473.5   20.8    1381  20  Y13564   Drosophila Robo 2 
    12   324.5   14.2    1911  16  R71726   Human PTP-OB. Hom 
    13   324.5   14.2    1911  18  W27225   Human protein tyro 
    14   324.5   14.2    1911  20  W94027   Human protein tyro 
    15   323.5   14.2    1501  16  R72858   Rat receptor type- 
    16   308.5   13.5    1897  21  Y81785   Human protein tyro 
    17   308.5   13.5    1897  21  Y56100   LAR tyrosine phosp 
    18   305.5   13.4     761  17  R92255   Neural cell adhesi 
    19   285.5   12.5    1192  19  W57900   Protein of clone C 
    20   285.5   12.5    1299  21  Y40439   Human Nr-CAM prote 
    21   285.5   12.5    1496  20  W81030   Melanoma associate 
    22   285.5   12.5    1496  21  Y70469   Human p53 target m 
    23   276.5   12.1     582  17  R92256   Neural cell adhesi 
    24     268   11.8    1257  20  W74152   Human L1 cell adhe 
    25   267.5   11.7    1028  19  W29667   Homo sapiens DL185 
    26   264.5   11.6    1242  19  W52287   Rattus norvegicus 
    27   262.5   11.5    1304  19  W59994   Human neural cell 
    28     254   11.1     868  17  R92717   Mouse muscle-local 
    29     253   11.1     869  18  W26611   Human muscle-speci 
    30     253   11.1     869  18  W26506   Human Dmk receptor 
    31   250.5   11.0     867  19  W62583   Mouse receptor tyr 
    32   250.5   11.0     871  17  R84087   Nsk2 receptor. Mu 
    33   250.5   11.0     871  19  W62568   Mouse receptor tyr 
    34   250.5   11.0     881  17  R84091   Nsk2 receptor with 
    35   250.5   11.0     881  19  W62572   Mouse Nsk2 (altern 
    36     250   11.0    1091  18  W41641   Sequence used in d 
    37     250   11.0    1091  20  IUOU1U   Mouse LIG-1 protei 
    38     249   10.9     860  17  R92716   Mouse muscle-local 
    39     249   10.9    1447  16  R68553   Deleted in colorec 
    40     249   10.9    1447  20  Y33498   Human DCC protein. 
    41     249   10.9    1728  12  R13144   Deleted in Colorec 
    42   247.5   10.9    1018  15  R63759   Human contactin (E 
    43   247.5   10.9    1018  17  R87028   Human contactin. 
    44   245.5   10.8     863  17  R84088   Nsk2 receptor with 
    45   244.5   10.7     863  19  W62569   Alternatively spli 



RESULT 1 
Y13567 

ID Y13567 standard; Protein; 434 AA. 
XX 
AC Y13567; 
XX 
DT 30-JUL-1999 (first entry) 
XX 
DE Human Robo 2 polypeptide. 
XX 
KW Comm polypeptide; Robo polypeptide; commissureless; roundabout; 
KW modulation; nerve cell function. 
XX 
OS 
XX 
FH Key Location/Qualifiers 
FT Misc-difference 285 
FT /label= unknown 
FT /note= "encoded by G5 
FT Misc-difference 396 
FT /label= unknown 
FT /note= "encoded b 
XX 
PN WO9925833-A1. 
XX 
PD 27-MAY-1999. 
XX 
PF 13-NOV-1998; 98WO-US24327. 
XX 
PR 14-NOV-1997; 97US-0065543. 
XX 
PA (REGC ) UNIV CALIFORNIA. 
XX 
PI Goodman C, Kid T, Mitchell KJ, Russell C, Tear G; 






XX 
DR WPI; 1999-338008/28. 
DR N-PSDB; X55771. 
XX 
PT Modulation of Robo-Comm polypeptide interactions 
XX 
PS Disclosure; Page 49-50; 56pp; English. 
XX 
CC The invention relates to a method for modulating the amount of Comm 
CC (commissureless) polypeptide in contact with a cell expressing active 
CC Robo (roundabout) on its surface. The method comprises modulating the 
CC effective amount of Comm polypeptide in contact with the cell, where the 
CC amount of expressed active Robo is specifically modulated inversely with 
CC the modulation of the effective amount of Comm in contact with the cell. 
CC The method is used to modulate the amount of active Robo expressed on a 
CC cell. The method can be used to screen for agents that modulate Robo:Comm 
CC interactions. This is particularly useful for modulating nerve cell 
CC function. 
XX 
SQ Sequence 434 AA; 



Query Match 99.8%; Score 2276; DB 20; Length 434; 
Best Local Similarity 100.0%; Pred. No. 4.1e-144; 
Matches ; Conservative 10; Mismatches 0; Indels 0; 

[Alignment: Qy and Db rows beginning at positions 1, 61, 121, 181, 241, 301, 361 and 421; residue text not legible, match lines show identity throughout.] 

RESULT 2 
Y08405 

ID Y08405 standard; Protein; 434 AA. 
XX 
AC Y08405; 
XX 
DT 24-JUL-1999 (first entry) 
XX 
DE Human partial ROBO2 protein. 
XX 
KW ROBO1; ROBO2; roundabout; nerve guidance; human; murine; cell function; 
KW cell morphology; screening assay. 
XX 
OS Homo sapiens. 
XX 
PN WO9920764-A1. 
XX 
PD 29-APR-1999. 
XX 
PF 20-OCT-1998; 98WO-US22164. 
XX 
PR 14-NOV-1997; 97US-0971172. 
PR 20-OCT-1997; 97US-0062921. 
XX 
PA (REGC ) UNIV CALIFORNIA. 
XX 
PI Goodman CS, Kidd T, Mitchell KJ, Tear G; 
XX 
DR WPI; 1999-312615/26. 
DR N-PSDB; X57254. 
XX 
PT Robo polypeptides, a new immunoglobulin superfamily member 
XX 
PS Claim 1; Page 72-73; 80pp; English. 
XX 
CC This invention describes novel Robo (roundabout) polypeptides, involved 
CC in nerve guidance which have been isolated from Drosophila sp., 
CC C. elegans, human and murine samples. The products of the invention can 
CC be used to raise anti-Robo antibodies, which can be used to modulate cell 
CC function or morphology. The Robo polynucleotides and fragments are useful 
CC as probes and primers and for production of the Robo polypeptides. The 
CC probes and primers are also useful in screening assays. 
XX 
SQ Sequence 434 AA; 



Query Match 99.8%; Score 2276; DB 20; Length 434; 
Best Local Similarity 100.0%; Pred. No. 4.1e-144; 
Matches ; Conservative ; Mismatches 0; Indels 0; 

[Alignment: Qy and Db rows beginning at positions 1, 61, 121, 181, 241, 301, 361 and 421; residue text not legible, match lines show identity throughout.] 



RESULT 3 
Y08404 

ID Y08404 standard; Protein; 1649 AA. 
XX 
AC Y08404; 
XX 
DT 24-JUL-1999 (first entry) 
XX 
DE Human ROBO1 protein. 
XX 
KW ROBO1; ROBO2; roundabout; nerve guidance; human; murine; cell function; 
KW cell morphology; screening assay. 
XX 
OS Homo sapiens. 
XX 
PN WO9920764-A1. 
XX 
PD 29-APR-1999. 
XX 
PF 20-OCT-1998; 98WO-US22164. 
XX 
PR 14-NOV-1997; 97US-0971172. 
PR 20-OCT-1997; 97US-0062921. 
XX 
PA (REGC ) UNIV CALIFORNIA. 
XX 
PI Goodman CS, Kidd T, Mitchell KJ, Tear G; 
XX 
DR WPI; 1999-312615/26. 
DR N-PSDB; X08404. 
XX 
PT Robo polypeptides, a new immunoglobulin superfamily member 
XX 
PS Claim 1; Page 65-71; 80pp; English. 
XX 
CC This invention describes novel Robo (roundabout) polypeptides, involved 
CC in nerve guidance which have been isolated from Drosophila sp., 
CC C. elegans, human and murine samples. The products of the invention can 
CC be used to raise anti-Robo antibodies, which can be used to modulate cell 
CC function or morphology. The Robo polynucleotides and fragments are useful 
CC as probes and primers and for production of the Robo polypeptides. The 
CC probes and primers are also useful in screening assays. 
XX 
SQ Sequence 1649 AA; 

Query Match 40.1%; Score 914; DB 20; Length 1649; 
Best Local Similarity 23.2%; Pred. No. 9e-53; 
Matches 258; Conservative 54; Mismatches 118; Indels 682; Gaps 12; 

1 qivaqgrtwfpcetkgnpqpaVfwqkegsqnllfpnqpqqpnsrcsvsptgdltitniq 60 
hll llllll II llllli;:ll::llllllll II I :ll III lllllllhl 
360 qvvalgrtvtfqceatgnpqpaifwrregsqnllfsyqppqsssrfsvsqtgdltitnvq 419 

I 

Qy 61 RSDAGYnCQALTVAGSILAKAQLEVTDVLTDRPPPIILQGPANQTLAVDGTALLKCRAT 120 

III llllll I lllll: II lllllh MM!:! Ill llhlllll :| I II 
Db 420 rsdvgyyicqtlnvagsiitkaylevtdviadrpppvirqgpvnqtvavdgtfvlscvat 479 

Qy 'l21 GDPLPVISWLKEGFTFPGRDPRATIQEQGTLQIKNLRISDTGTYTCVATSSSGEASWSAV 180 

I hi I I IH> :| I II III: :: III llhl:: 1 1 1 1 : 1 M 
Db 480 gspvptilwrkdgvlvstqdsrikqlengvlqiryaklgdtgrytciastpsgeatwsay 539 

Qy 181 LDVTESGATIS--KNYDLSDLPGPPSKPQVTDVTKNSVTLSWQPGTPGTLPASAYIIEAF 238 

"III : : I : :| 1 1 1 1 : 1 1 1 1 1 : 1 1 III 1 1 ^llllll 
Db 540 ievqefgvpvqpprptdpnlipsapskpevtdvsrntvtlswqpnlnsgatptsyiieaf 599 

Qy 239 SQSVSNSWQTVANHVKTTLYTVRGLRPNTIYLFMVRAIN 277 

I : :llllll :lll ::lhll lllhlll I 
Db 600 shasgsswqtvaenvktetsaikglkpnaiylflvraanaygisdpsqisdpvktqdvlp 659 

Qy 278 277 

Db 660 tsqgvdhkqvqrelgnavlhlhnptvlssssievhwtvdqqsqyiqgykilyrpsganhg 719 

Qy 278 PK 279 

I 

Db 720 esdwlvfevrtpaknsvvipdlrkgvnyeikarpffnefqgadseikfaktleeapsapp 779 



Qy 280 VSVTQXR— -.- 2 

II I 

Db 780 qgvtvskndgngtailvswqpppedtqngmvqeykvwclgnetryhinktvdgstfsvvi 8 

Qy 287 PQ 2 

II 

Db 840 pflvpgirysvevaastgagsgvksepqfiqldahgnpvspedqvslaqqisdvvkqpaf 8 



Qy 289 ■ 



I II 



Db 900 iagigaacwiilmvfsiwlyrhrkkrngltstyagirkvpsftftptvtyqrggeavssg 959 



Qy 293 ■ 



■-STWAN-- 

II I 



■ 297 



Db 960 grpgllnisepaaqpwladtwpntgnnhndcsiscctagngnsdsnlttysrpadciany 1019 

Qy 298 : 297 

Db 1020 nnqldnkqtnlralpestvygdvdlsnkinemktfnspnlkdgrfvnpsgqptpyattiqs 1079 

Qy 298 - 297 

Db 1080 nlsnnmnngsgdsgekhwkplgqqkqevapvqyniveqnklnkdyrandtvpptipynqs 1139 

Qy 298 VP LPPPPVQPLPGTELEH 315 

II lllll I I : I 

Db 1140 ydqntggsynssdrgsstsgsqghkkgartpkvpkqggmnwadllppppahppphsnsee 1199 

Qy 316 YAVEQQENGYDSDSWCPPLPVQTYLHQGLEDEL-EEDDDRVPTPPVRGVASSP-AISFGQ 373 

I : I: II : II I : II I III 1 1 : 1 : 1 1 1 1 1 1 1 1 III! hh 
Db 1200 ynisvdes-ydqempcpvpparmylqq---deleeeedergptppvrgaasspaavsysh 1255 

Qy 374 QSTATLTPSPREEMQPMLQASP 395 

iiiimmmmm i 

Db 1256 qstatltpspqeelqpmlqdcpeetghmqhqpdrrrqpvsppppprpispphtygyisgp 1315 

Qy 396 : 395 

Db 1316 lvsdmdtdapeeeedeadmevakmqtrrlllrgleqtpassvgdlessvtgsmingwgsa 1375 

Qy 396 ; XFTSSQR 402 

I ill 

Db 1376 seednissgrssvsssdgsfftdadfaqavaaaaeyaglkvarrqmqdaagrrhfhasqc 1435 

Qy 403 PRPTSPFSTDSNTSAALSQSQRPRPTKKHKGG 434 

llllll lllll III: I II II: I 
Db 1436 prptspvstdsnmsaavmqktrpakklkhqpg 1467 



RESULT 4 
Y13566 

ID Y13566 standard; Protein; 1651 AA. 
XX 
AC Y13566; 
XX 
DT 30-JUL-1999 (first entry) 
XX 
DE Human Robo 1 polypeptide. 
XX 
KW Comm polypeptide; Robo polypeptide; commissureless; roundabout; 
KW modulation; nerve cell function. 
XX 
OS Homo sapiens. 
XX 
PN WO9925833-A1. 
XX 
PD 27-MAY-1999. 
XX 
PF 13-NOV-1998; 98WO-US24327. 
XX 
PR 14-NOV-1997; 97US-0065543. 
XX 
PA (REGC ) UNIV CALIFORNIA. 
XX 
PI Goodman C, Kid T, Mitchell KJ, Russell C, Tear G; 
XX 
DR WPI; 1999-338008/28. 
DR N-PSDB; X55770. 
XX 
PT Modulation of Robo-Comm polypeptide interactions 
XX 
PS Disclosure; Page 44-48; 56pp; English. 
XX 
CC The invention relates to a method for modulating the amount of Comm 
CC (commissureless) polypeptide in contact with a cell expressing active 
CC Robo (roundabout) on its surface. The method comprises modulating the 
CC effective amount of Comm polypeptide in contact with the cell, where the 
CC amount of expressed active Robo is specifically modulated inversely with 
CC the modulation of the effective amount of Comm in contact with the cell. 
CC The method is used to modulate the amount of active Robo expressed on a 
CC cell. The method can be used to screen for agents that modulate Robo:Comm 
CC interactions. This is particularly useful for modulating nerve cell 
XX 
SQ Sequence 1651 AA; 

Query Match 40.0%; Score 913; DB 20; Length 1651; 
Best Local Similarity 23.2%; Pred. No. 1.1e-52; 
Matches 258; Conservative 54; Mismatches 118; Indels 684; Gaps 12; 

3y l qivaqgrtvtfpcetkgnpqpjIvfwqkegsqnllfpnqpqqpnsrcsvsptgdltitniq 60 

hll llllll II lllll(:||::|||||||| II I :|l III IIIIIIM 
Db 360 qvvalgrtvtfqceatgnpqpaifwrregsqnllfsyqppqsssrfsvsqtgdltitnvq 419 

3y 61 rsdagyyicqaltvagsilaiJqlbvtdvltdrpppiilqgpanqtlavdgtallkckat 120 

III llllll I lllll: If llllll: llllhl III i 1 1 : 1 ! I M =1 I II 
Db 420 rsdvgyyicqtlnvagsiitkajylevtdviadrpppvirqgpvnqtvavdgtfvlscvat 479 

i 

121 GDPLPVISWLKEGFTFPGRDPMTIQEQGTLQIKNLRISDTGTYTCVATSSSGEASWSAV 180 

I 1:1 I I hi :| II III: :: III lll:|:: lllhlll 
480 gspvptilwrkdgvlvstqdsijikqlengvlqiryaklgdtgrytciastpsgeatwsay 539 

181 LDVTESGATIS--KNYDLSDL^GPPSKPQVTDVTKNSVTLSWQPGTPGTLPASAYIIEAP 238 

::IM : : I : :{ III MM I ::hll MM : : ! 1 1 1 1 1 

5 4 0 ievqef gvpvqpprptdpnlipsapskpevtdvsrntvtlswqpnlnsgatptsyiieaf 599 

239 SQSVSNSWQTVANHVKTTLYTVRGLRPNTIYLFMVRAIN 277 

I : : 1 1 1 f I f MM i:||:|| lllhlll I 
600 shasgsswqtvaenvktetsafikglkpnaiylflvraanaygisdpsqisdpvktqdvlp 659 



278 ■ 



■ 277 



660 tsqgvdhkqvqrelgnavlhlhnptvlssssievhwtvdqqsqyiqgykilyrpsganhg 719 



Oy 278 ■ 



-PK 279 



720 esdwlvfevrtpaknswipdlrkgvnyeikarpffnefqgadseikfaktleeapsapp 779 

280 VSVTQXK 286 

II I 

780 qgvtvskndgngtailvswqpppedtqngmvqeykvwclgnetryhinktvdgstfswi 839 



287 ■ 



*840 pflvpgirysvevaastgagsgvksepqfiqldahgnpvspedqvslaqqisdwkqpaf 899 

289 KNNG 292 

I II 

900 iagigaacwiilmvfsiwlyrhrkkrngltstyagirkvpsftftptvtyqrggeavssg 959 

293 STWAN 297 

II I 

960 grpgllnisepaaqpwladtwpntgnnhndcsiscctagngnsdsnlttysrpadciany 1019 
298 297 



Db 1020 nnqldnkqtnlmlpestvygdvdlsnkinemktfnspnlkdgrfvnpsgqptpyattqli 1079 

Qy 298 - - 1 - 297 

Db 1080 qsnlsnnmnngsgdsgekhwkplgqqkqevapvqyniveqnklnkdyrandtvpptipyn 1139 

Qy 298 ■-. VP LPPPPVQPLPGTEL 313 

II . Mill I I : 
Db 1140 qsydqntggsynssdrgsstsgsqghkkgartpkvpkqggmnwadllppppahppphsns 1199 

Qy 314 EHYAVEQQENGYDSDSWCPPLPVQTYLHQGLEDEL-EEDDDRVPTPPVRGVASSP-AISF 371 

II: I: II : II I : II I III MM; lllllll MM |:|: 
Db 1200 eeynisvdes-ydqempcpvpparmylqq---deleeeedergptppvrgaasspaavsy 1255 

Qy 372 GQQSTATLTPI3PREEMQPMLQASP 395 

imiimmmnii i 

Db 1256 shqstatltpspqeelqpmlqdcpeetghmqhqpdrrrqpvsppppprpispphtygyis 1315 

Qy 396 - 395 

Db 1316 gplvsdmdtdapeeeedeadmevakmqtrrlllrgleqtpassvgdlessvtgsmingwg 1375 

Qy 396 - XFTSS 400 

I :| 

Db 1376 saseednissgrssvsssdgsfftdadfaqavaaaaeyaglkvarrqmqdaagrrhfhas 1435 

Qy 401 QRPRPTSPFSTDSNTSAALSQSQRPRPTKKHKGG 434 

I llllll lllll III: I II II: I 
Db 1436 qcprptspvstdsnmsaavmqktrpakklkhqpg 1469 



RESULT 5 
W83927 

ID W83927 standard; Protein; 753 AA. 
XX 
AC W83927; 
XX 
DT 01-MAR-1999 (first entry) 
XX 
DE Human T85 protein. 
XX 
KW T85; FHMB-6D4; FMHV-SD4; human; neurological disorder; therapy; 
KW 
XX 
OS 
XX 
FH Key Location/Qualifiers 
FT Peptide 1..20 
FT /label= Sig_peptide 
FT Protein 21..753 
FT /label= Mat_protein 
FT Region 525..610 
FT /note= "has homology to a fibronectin type III 
FT domain" 
FT Region 638..727 
FT /note= "has homology to a fibronectin type III 
FT domain" 
FT Region 43..101 
FT /note= "has homology to a Ig superfamily domain" 
FT Region 145..203 
FT /note= "has homology to a Ig superfamily domain" 
FT Region 237..298 
FT /note= "has homology to a Ig superfamily domain" 
FT Region 329..394 
FT /note= "has homology to a Ig superfamily domain" 
FT Region 433..491 
FT /note= "has homology to a Ig superfamily domain" 
FT Peptide 247..249 
FT /note= "RGD motif" 
FT Domain 516..600 
FT /note= "cytokine receptor homology N-terminal 
FT domain" 



XX 

PN WO9848051-A2. 
XX 

PD 29-OCT-1998. 
XX 

PF 17-APR-1998; 98WO-0SO7714 . 
XX 

PR 10-OCT-1997; 97DS-0062017 . 

PR 18-APR-1997; 97DS-0044746 . 
XX 

PA (MILL-) MILLENNIUM BIOTHERAPEUTICS INC. 

xx : 

PI Holtzman D, McCarthy SA; 
XX 

DR WPI; 1999-024021/02. 

DR N-PSDB; V69278, 1 

XX ; 

•T Hew isolated human FTHMA-070 and T85 proteins - used to develop 

T products for the diagnosis and therapy of disorders involving 

T cellular processes, e.g. neuronal development, 
xx 

PS Claim 31; Fig 3; 127pp; English. 
XX 

CC This is the amino acid sequence of a novel human protein designated 

CC T85, and also referred to as FMHB-6D4 and FMHB-SD4. T85 cDNA (see 

CC V69278) was identified in a human foetal brain cDNA library using a 

CC screen designed to identify genes encoding proteins having a 

CC functional signal sequence. T85 nucleic acids and polypeptides of

CC the invention are useful as modulating agents in regulating a 

CC variety of cellular processes. They can be used for identifying 

CC compounds which bind to or modulate the activity of the polypeptides 

CC (claimed). They can also be used in screening assays, detection 

CC assays (e.g. chromosomal mapping, tissue typing, forensic biology), 

CC predictive medicine (e.g. diagnostic assays, prognostic assays, 

CC monitoring clinical trials, and pharmacogenomics), and methods of

CC treatment (e.g. therapeutic and prophylactic), e.g. for neurological

CC disorders. 
XX 

SQ Sequence 753 AA; 



Query Match 38.5%; Score 877.5; DB 20; Length 753;

Best Local Similarity 51.1%; Pred. No. 9.4e-51;

Matches 180; Conservative 48; Mismatches 93; Indels 31; Gaps 6;

Qy 1 QIVAQGRTVTFPCETKGNPQPAVFWQKEGSQNLLFPNQPQQPNSRCSVSPTGDLTITNIQ 60
I - Ml Mllll || ||||l|:||::llllllll II I :l! IN MINIM 
Db 324 qwalgrtvtfqceatgnpqpaifwrregsqnllfsyqppqsssrfsvsqtgdltitnvq 383 

Qy 61 RSDAGYYICQALTVAGSILAKAQLEVTDVLTDRPPPIILQGPANQTLAVDGTALLKCKAT 120

Ml Mil I MM II IMIII: MM: III MINI :| I II 
Db 384 rsdvgyyicqtlnvagsiitkaylevtdviadrpppvirqgpvnqtvavdgtfvlscvat 443 

Qy 121 GDPLPVISWLKEGFTFPGRDPRATIQEQGTLQIKNLRISDTGTYTCVATSSSGEASWSAV 180 

I hi I I hi :l I II III: :: III MM: NIM 
Db 444 gspvptilwrkdgvlvstqdsrikqlengvlqiryaklgdtgrytciastpsgeatwsay 503 

Qy 181 LDVTESGATIS--KNYDLSDLPGPPSKPQVTDVTKNSVTLSWQPGTPGTLPASAYIIEAF 238

MM: : I : :| 1 1 1 h 1 1 1 1 :: h 1 1 1 1 1 1 1 Mllll! 
Db 504 ievqefgvpvqpprptdpnlipsapskpevtdvsrntvtlswqpnlnsgatptsyiieaf 563 

Qy 239 SQSVSNSWQTVANHVKTTLYTVRGLRPNTIYLFMVRAIN 277 

I : MM Ml MIM Nihil I 
Db 564 shasgsswqtvaenvktetsaikglkpnaiylflvraanaygisdpsqisdpvktqdvlp 623 

Qy 278 PKVSVTQXKPQKNNGSTWANVPLPPPPVQPLPGTELE-HYAVEQQE— NGY 325 

I : I: I: M I II I : :|h M I! 
Db 624 tsqgvdhkqvqrelgn-avlhlhnptv-lssssievhwtvdqqsqyiqgy 671 



RESULT 6 
Y13565 

ID Y13565 standard; Protein; 1297 AA. 



XX 

AC Y13565; 
XX 

DT 30-JUL-1999 (first entry) 
XX 

DE C. elegans Robo polypeptide.

XX

KW Comm polypeptide; Robo polypeptide; commissureless; roundabout;

KW modulation; nerve cell function.

XX

OS Caenorhabditis elegans.

XX

PN WO9925833-A1.
XX

PD 27-MAY-1999.
XX

PF 13-NOV-1998; 98WO-US24327.
XX

PR 14-NOV-1997; 97US-0065543.
XX

PA (REGC ) UNIV CALIFORNIA.
XX

PI Goodman C, Kid T, Mitchell KJ, Russell C, Tear G;
XX

DR WPI; 1999-338008/28.

DR N-PSDB; X55769.
XX 

PT Modulation of Robo-Comm polypeptide interactions 
XX 

PS Disclosure; Page 38-39; 56pp; English. 
XX 

CC The invention relates to a method for modulating the amount of Comm 

CC (commissureless) polypeptide in contact with a cell expressing active 

CC Robo (roundabout) on its surface. The method comprises modulating the 

CC effective amount of Comm polypeptide in contact with the cell, where the 

CC amount of expressed active Robo is specifically modulated inversely with 

CC the modulation of the effective amount of Comm in contact with the cell. 

CC The method is used to modulate the amount of active Robo expressed on a 

CC cell. The method can be used to screen for agents that modulate Robo:Comm 

CC interactions. This is particularly useful for modulating nerve cell

CC function.
XX

SQ Sequence 1297 AA;



Query Match 24.6%; Score 561.5; DB 20; Length 1297;

Best Local Similarity 37.2%; Pred. No. 1.9e-29;

Matches 124; Conservative 48; Mismatches 114; Indels 47; Gaps 6;

Qy 1 QIVAQGRTVTFPCETKGNPQPAVFWQKEGSQNLLFPNQPQQPNSRCSVSPTGDLTITNIQ 60

I I II III I I II II III MM : I HIM III :: 
Db 325 qsvpaggtatfectlvgqpspayfwskegqqdllfpsy-vsadgrtkvsptgtltieevr 383 

Qy 61 RSDAGYYICQALTVAGSILAKAQLE V 86 

: I I hi : III hll I: I 
Db 384 qvdegayvcagmnsagsslskaalkatfetkgrvqkkkskmgkqkqknvqsiikylisav 443 

Qy 87 TDVLTDRPPPIILQGPANQTLAVDGTALLKCKATGDPLPVISWLKEGFTFPGRDPRATIQ 146 

I MM I llll I Ml hhl I I Illl::| I I : 
Db 444 tgntpakppptiehghqnqtlmvgssailpcqasgkptpgiswlrdglpiditdsrisqh 503 

Qy 147 EQGTLQIKNLRISDTGTYTCVATSSSGEASWSAVLDVTE-SGATISKNYDLSDLPGPPS 204 

hi I M M| ||hl : MM ||:|| : I I: I ': 
Db 504 stgslhiadlkkpdtgvytciaknedgestwsasltvedhtsnaqfvrmpdpsnfpsspt 563 

Qy 205 KPQVTDVTKNSVTLSWQ-PGTPGTLPASAYIIEAFSQSVSNSWQTVANHVKTTLYTVRGL 263

:| : :|l ' I II III I : III: :| : :| : M :| I Ml 
Db 564 qpiivnvtdtevelhwnapstsgagpitgyiiqyyspdlgqtwfnipdyvasteyrikgl 623

Qy 264 RPNTIYLFMVRAIN PKVS — VTQXKP 287 

M MMI | MIM II 
Db 624 kpshsymfviraenekgigtpsvssalvttskp 656 



RESULT 7
Y08403

ID Y08403 standard; Protein; 1297 AA.
XX

AC Y08403;
XX
DT 24-JUL-1999 (first entry)
XX

DE C. elegans ROBO protein.
XX

KW ROBO1; ROBO2; roundabout; nerve guidance; human; murine; cell function;
KW cell morphology; screening assay.

XX

OS Caenorhabditis elegans.

XX

PN WO9920764-A1.
XX

PD 29-APR-1999.
XX

PF 20-OCT-1998; 98WO-US22164.
XX

PR 14-NOV-1997; 97US-0971172.
PR 20-OCT-1997; 97US-0062921.
XX
PA (REGC ) UNIV CALIFORNIA.
XX

PI Goodman CS, Kidd T, Mitchell KJ, Tear G;
XX

DR WPI; 1999-312615/26.
DR N-PSDB; X57252.
XX

PT Robo polypeptides, a new immunoglobulin superfamily member
XX

PS Claim 1; Page 59-63; 80pp; English.
XX 

CC This invention describes novel Robo (roundabout) polypeptides, involved

CC in nerve guidance which have been isolated from Drosophila sp.,

CC C. elegans, human and murine samples. The products of the invention can

CC be used to raise anti-Robo antibodies, which can be used to modulate cell

CC function or morphology. The Robo polynucleotides and fragments are useful

CC as probes and primers and for production of the Robo polypeptides. The

CC probes and primers are also useful in screening assays.
XX

SQ Sequence 1297 AA;



Query Match 24.6%; Score 561.5; DB 20; Length 1297;

Best Local Similarity 37.2%; Pred. No. 1.9e-29;
Matches 124; Conservative 48; Mismatches 114; Indels 47; Gaps 6;

Qy 1 QIVAQGRTVTFPCETKGNPQPAVFWQKEGSQNLLFPNQPQQPNSRCSVSPTGDLTITNIQ 60

I I I Ml I I I II II III MM: : I lllll III 
Db 325 qsvpaggtatfectlvgqpspayfwskegqqdllfpsy-vsadgrtkvsptgtltieevr 383 

Qy 61 RSDAGYYICQALTVAGSILAKAQLE V 86

: I I 1:1 : III Ml h I 
Db 384 qvdegayvcagmnsagsslskaalkatfetkgrvqkkkskmgkqkqknvqsiikylisav 443 

Qy 87 TDVLTDRPPPIILQGPANQTLAVDGTALLKCKATGDPLPVISWLKEGFTFPGRDPRATIQ 146

I Ml I I Ml I :|:| |:|:| I I MM I I : 
Db 444 tgntpakppptiehghqnqtlmvgssailpcqasgkptpgiswlrdglpiditdsrisqh 503 

Qy 147 EQGTLQIKNLRISDTGTYTCVATSSSGEASWSAVLDVTE--SGATISKNYDLSDLPGPPS 204 

hi I :|: III lll:| : MM! MM : I IM |: 
Db 504 stgslhiadlkkpdtgvytciaknedgestwsasltvedhtsnaqfvrmpdpsnfpsspt 563 

Qy 205 KPQVTDVTKNSVTLSWQ-PGTPGTLPASAYIIEAFSQSVSNSWQTVANHVKTTLYTVRGL 263 

:M :|| I I I I I I M III: :| : :| : ::l :l I -ll 
Db 564 qpiivnvtdtevelhwnapstsgagpitgyiiqyyspdlgqtwfnipdyvasteyrikgl 623 



Qy 264 RPNTIYLFMVRAIN--
:|: |:|::M I 



--PKVS — VTQXKP 287 

I II II II 



Db 624 kpshsymfviraenekgigtpsvssalvttskp 656 



RESULT 8
Y13563

ID Y13563 standard; Protein; 1395 AA. 

XX 

AC Y13563; 
XX 

DT 30-JUL-1999 (first entry) 
XX 

DE Drosophila Robo 1 polypeptide.
XX

KW Comm polypeptide; Robo polypeptide; commissureless; roundabout;

KW modulation; nerve cell function.

XX

OS Drosophila sp.
XX

PN WO9925833-A1.
XX

PD 27-MAY-1999.
XX

PF 13-NOV-1998; 98WO-US24327.

PR 14-NOV-1997; 97US-0065543.
XX

PA (REGC ) UNIV CALIFORNIA.
XX

PI Goodman C, Kid T, Mitchell KJ, Russell C, Tear G;
XX

DR WPI; 1999-338008/28.

DR N-PSDB; X55767.
XX

PT Modulation of Robo-Comm polypeptide interactions
XX

PS Disclosure; Page 30-33; 56pp; English.

XX

CC The invention relates to a method for modulating the amount of Comm 

CC (commissureless) polypeptide in contact with a cell expressing active 

CC Robo (roundabout) on its surface. The method comprises modulating the 

CC effective amount of Comm polypeptide in contact with the cell, where the 

CC amount of expressed active Robo is specifically modulated inversely with 

CC the modulation of the effective amount of Comm in contact with the cell.

CC The method is used to modulate the amount of active Robo expressed on a

CC cell. The method can be used to screen for agents that modulate Robo:Comm

CC interactions. This is particularly useful for modulating nerve cell

CC function.

XX

SQ Sequence 1395 AA;



Query Match 23.9%; Score 545; DB 20; Length 1395;

Best Local Similarity 40.9%; Pred. No. 2.6e-28;

Matches 115; Conservative 48; Mismatches 104; Indels 14; Gaps 7; 

Qy 9 VTFPCETKGNPQPAVFWQKEGSQNLLFPNQPQQPNSRCSVSPTGDLTITNIQRSDAGYYI 68 

I II III hill III Mil M I: M II:::: I III: 
Db 362 vqlpcmasgnpppsvfwtkegvstlmfpn---sshgrqyvaadgtlqitdvrqedegyyv 418 

Qy 69 CQALTVAGSILAKAQLEVTDVLTDRPPPIILQGPANQTLAVDGTALLKCKATGDPLPVIS 128 

1:1 : |:|: | MM MUM I I Ml: I I 
Db 419 csafsvvdsstvrvflqvssv-derpppiiqigpanqtlpkgsvatlpcratgnpsprik 477

Qy 129 WLKEGFTFPGRDPRATIQEQGTLQIKNLRISDTGTYTCVATSSSGEASWSAVLDVTESGA 188 

I :| : I : : :|:: MMMIII I: II M I I : |: 
Db 478 wfhdghavqagn-rysiiqgsslrvddlqlsdsgtytctasgergetswaatltvekpgs 536 

Qy 189 MSKNYDLSDLPGPPSKPQVTDVTKNSVTLSW QPGT PGTLPASAYI IEAFSQSV 242 

M : III |:| :M Ml I MM I :| II : 
Db 537 tslhraadpstypappgtpkvlnvsrtsislrwaksqekpgavg-piigytveyfspdl 594 

Qy 243 SNSWQTVANHVKTTLYTVRGLRPNTIYLFMVRAINPK-VSV 282
I 1:1 I MM 1 : 1 : 1 1 1 I : M 
Db 595 qtgwivaahrvgdtqvtisgltpgtsyvflvraentqgisv 635



RESULT 9 
Y08401 

ID Y08401 standard; Protein; 1395 AA.
XX

AC Y08401;
XX

DT 24-JUL-1999 (first entry)

XX

DE Drosophila sp. ROBO1 protein.
XX

KW ROBO1; ROBO2; roundabout; nerve guidance; human; murine; cell function;
KW cell morphology; screening assay.
XX

OS Drosophila sp.
XX

PN WO9920764-A1.
XX

PD 29-APR-1999.
XX

PF 20-OCT-1998; 98WO-US22164.
XX

PR 14-NOV-1997; 97US-0971172.
PR 20-OCT-1997; 97US-0062921.
XX

PA (REGC ) UNIV CALIFORNIA.
XX

PI Goodman CS, Kidd T, Mitchell KJ, Tear G;
XX

DR WPI; 1999-312615/26.
DR N-PSDB; X57250.
XX

PT Robo polypeptides, a new immunoglobulin superfamily member
XX

PS Claim 1; Page 45-49; 80pp; English.
XX

CC This invention describes novel Robo (roundabout) polypeptides, involved

CC in nerve guidance which have been isolated from Drosophila sp.,

CC C. elegans, human and murine samples. The products of the invention can

CC be used to raise anti-Robo antibodies, which can be used to modulate cell

CC function or morphology. The Robo polynucleotides and fragments are useful
CC as probes and primers and for production of the Robo polypeptides. The

CC probes and primers are also useful in screening assays.
XX

SQ Sequence 1395 AA;



Query Match 23.9%; Score 545; DB 20; Length 1395;

Best Local Similarity 40.9%; Pred. No. 2.6e-28;

Matches 115; Conservative 48; Mismatches 104; Indels 14; Gaps 7;

Qy 9 VTFPCETKGNPQPAVFWQKEGSQNLLFPNQPQQPNSRCSVSPTGDLTITNIQRSDAGYYI 68 

I II' III 1:111 III I : I ! I : I I: II II:::: I III: 
Db 362 vqlpcmasgnpppsvfwtkegvstlmfpn---sshgrqyvaadgtlqitdvrqedegyyv 418 

Qy 69 CQALTVAGSILAKAQLEVTDVLTDRPPPIILQGPANQTLAVDGTALLKCKATGDPLPVIS 128

^ 1:1: I :MIII| Mill I I |:|||:| I I 

Db 419 csafsvvdsstvrvflqvssv-derpppiiqigpanqtlpkgsvatlpcratgnpsprik 477 

Qy 129 WLKEGFTFPGRDPRATIQEQGTLQIKNLRISDTGTYTCVATSSSGEASWSAVLDVTESGA 188

I :| : I :| : :|:: :|::||:|llll I: II Ihl I I : |: 
Db 478 wfhdghavqagn-rysiiqgsslrvddlqlsdsgtytctasgergetswaatltvekpgs 536 

Qy 189 T • ISKNYDLSDLPGPPSKPQVTDVTRNSVTLSW QPGTPGTLPASAY I IEAFSQSV 242 

I : : II III hi M:: |::| I :|| I I I :| II : 
Db 537 ts 1 hraadps typappgtpkvlnvsr ts is lrwak sqekpgavg - -pi igy tvey f spdl 594 

Qy 243 SNSWQTVANHVKTTLYTVRGLRPNTIYLFMVRAINPK-VSV 282 

I I: I I I: III I hhlll I : :|| 
Db 595 qtgwivaahrvgdtqvtisgltpgtsyvflvraentqgisv 635 



RESULT 10 
Y08402 

ID Y08402 standard; Protein; 1380 AA.

XX

AC Y08402;
XX
DT 24-JUL-1999 (first entry)
XX

DE Drosophila sp. ROBO2 extracellular domain protein.
XX

KW ROBO1; ROBO2; roundabout; nerve guidance; human; murine; cell function;
KW cell morphology; screening assay.
XX

PN WO9920764-A1.
XX

PD 29-APR-1999.
XX

PF 20-OCT-1998; 98WO-US22164.
XX

PR 14-NOV-1997; 97US-0971172.
PR 20-OCT-1997; 97US-0062921.
XX

PA (REGC ) UNIV CALIFORNIA.
XX

PI Goodman CS, Kidd T, Mitchell KJ, Tear G;
XX

DR WPI; 1999-312615/26.
DR N-PSDB; X57251.
XX

PT Robo polypeptides, a new immunoglobulin superfamily member
XX

PS Claim 1; Page 52-56; 80pp; English.
XX

CC This invention describes novel Robo (roundabout) polypeptides, involved
CC in nerve guidance which have been isolated from Drosophila sp.,
CC C. elegans, human and murine samples. The products of the invention can
CC be used to raise anti-Robo antibodies, which can be used to modulate cell
CC function or morphology. The Robo polynucleotides and fragments are useful
CC as probes and primers and for production of the Robo polypeptides. The
CC probes and primers are also useful in screening assays.
XX

SQ Sequence 1380 AA;



Query Match 20.8%; Score 473.5; DB 20; Length 1380;

Best Local Similarity 28.6%; Pred. No. 1.5e-23;

Matches 134; Conservative 66; Mismatches 182; Indels 87; Gaps 15;

Qy 1 QIVAQGRTVTFPCETKGNPQPAVFWQKEGSQNLLFPNQPQQPNSRCSVSPTGD--LTITN 58 

1:1 I I H: |:|:| ::| ||: :|| I : :::| I |:| 
Db 306 qlveigdevlfecqanghprptlywsvegnsslllpgy-rdgrntevtltpegrsvlsiar 364 

Qy 59 IQRSDAGYYI-CQALTVAGSILAKAQLEVTDVLTDRPPPIILQGPANQTLAVDGTALLKC 117 

I: : I II II: :*. : I I : Mill III Mil I :| I 
Db 365 faredsgkwtcnalnavgsvssrtvvsv-dtqfelpppiieqgpvnqtlpvksiwlpc 423 

Qy 118 KATGDPLPVISWLKEGFTFPGRD-PRATIQEQGTLQIKNL-RISDTGTYTCVATSSSGEA 175 

: I 1:1 :ll, :| : : I : : I I I : I I I I lllll:: :|:: 
Db 424 rtlgtpvpqvswyldgipidvqeherrnlsdagaltisdlqrhedeglytcvasnrngks 483 

Qy 176 SWSAVLDV---TESGATISKNYDLSDLPGPPSKPQVTDVTKNSVTLSW-QPGTPGTLPAS 231

III I : J : :ll Mil III: : HIIIIII : I 

Db 484 swsgylrldtptnpnikffrapelstypgppgkpqmvekgensvtlswtrsnkvggsslv 543 

Qy 232 AYIIEAFSQSVSNSWQTVANHVKTTLYTVRGLRPNTIYLFMVRAINPKVSVTQXKPQKNN 291 

1:11 I ::':: II |: I :| II I I |::|l I 
Db 544 gyviemfgknetdgwvavgtrvqnttftqtgllpgvnyffliraensh 591 



292 GSTWANVPLP PPPVQPLP - GTELEHYAVEQQE - - 

,:. II I :|: II : :: I 



--NGYDSDSWCPPLPV 336 

I II I 
Db 592

Qy 



-■glslpspmsepitvgtryfnsgldlsearasllsgdwelsnaswdstsiiikltw 646 



337 Q TYLHQGLEDELEEDDDRVPTPPVRGVASSPAISFGQQSTATLTPSPRE 385 

I II :| I I 1:1 
Db 647 qiingkyvegfyvyarq lpnpivn npapvt 676 



Qy 386 EMQPMLQASPXFTSSQRPRPTSPFSTDSNTSAALSQSQRPRPTKKHKGG 434 

I I ::l :: II I :ll :l I : II 
Db 677 sntnpllgststsasasasasalistkpniaaa—gkrdgetnqsggg 722 



RESULT 11 
Y13564

ID Y13564 standard; Protein; 1381 AA.
XX

AC Y13564;
XX

DT 30-JUL-1999 (first entry)
XX

DE Drosophila Robo 2 polypeptide.
XX

KW Comm polypeptide; Robo polypeptide; commissureless; roundabout;
KW modulation; nerve cell function.
XX

OS Drosophila sp.
XX

PN WO9925833-A1.
XX

PD 27-MAY-1999.
XX

PF 13-NOV-1998; 98WO-US24327.
XX

PR 14-NOV-1997; 97US-0065543.
XX

PA (REGC ) UNIV CALIFORNIA.
XX

PI Goodman C, Kid T, Mitchell KJ, Russell C, Tear G;

DR WPI; 1999-338008/28.

DR N-PSDB; X55768.
XX

PT Modulation of Robo-Comm polypeptide interactions

XX

PS Disclosure; Page 34-38; 56pp; English.
XX 

CC The invention relates to a method for modulating the amount of Comm

CC (commissureless) polypeptide in contact with a cell expressing active

CC Robo (roundabout) on its surface. The method comprises modulating the

CC effective amount of Comm polypeptide in contact with the cell, where the

CC amount of expressed active Robo is specifically modulated inversely with
CC the modulation of the effective amount of Comm in contact with the cell.
CC The method is used to modulate the amount of active Robo expressed on a

CC cell. The method can be used to screen for agents that modulate Robo:Comm

CC interactions. This is particularly useful for modulating nerve cell

CC function.
XX

SQ Sequence 1381 AA;



Query Match 20.8%; Score 473.5; DB 20; Length 1381;

Best Local Similarity 28.6%; Pred. No. 1.5e-23;

Matches 134; Conservative 66; Mismatches 182; Indels 87; Gaps 15;

Qy 1 QIVAQGRTVTFPCETKGNPQPAVFWQKEGSQNLLFPNQPQQPNSRCSVSPTGD--LTITN 58
hi I I I h hhl ::l II: :ll I : :::l I hi 
Db 306 qlveigdevlfecqanghprptlywsvegnsslllpgy-rdgrmevtltpegrsvlsiar 364

Qy 59 IQRSDAGYYI-CQALTVAGSILAKAQLEVTDVLTDRPPPIILQGPANQTLAVDGTALLKC 117 

t I hi : I II II::: : I I : Mill ||| II I :| I 
Db 365 faredsgkwtcnalnavgsvssrtwsv-dtqfelpppiieqgpvnqtlpvksiwlpc 423 

Qy 118 KATGDPLPVISWLKEGFTFPGRD-PRATIQEQGTLQIKNL-RISDTGTYTCVATSSSGEA 175 

: I hi ill :| :: I : : I I I :| I I I HIM:: :|:: 



Db 424 rtlgtpvpqvswyldgipidvqeherrnlsdagaltisdlqrhedeglytcvasnrngks 483 



Db 


424 


Qy 


176 


Db 


484 


Qy 


232 


Db 


544 


Qy 


292 


Db 


592 


Qy 


337 


Db 


647 


Qy 


386 


Db 


677 



III I : 



M Mil || 



hll I 



II hi M II 



MUM : 



I I : : 1 1 I 



■-NGYDSDSWCPPLPV 336 

I II I 



- -> -TYLHQGLEDELEEDDDRVPTPPVRGVASSPAISFGQQSTATLTPSPRE 385 

'.II '-III hi 
egfyvyarq lpnpivn npapvt 676 



I I ■ =:l 



II I Ml : I 



RESULT 12 
R71726 

ID R71726 standard; Protein; 1911 AA. 
XX 

AC R71726; 
XX 

DT 17-OCT-1995 (first entry) 
XX 

DE Human PTP-OB.

XX

KW PTP-OB; protein tyrosine phosphatase; osteoblast; differentiation;

KW osteoclast; osteoporosis; bone; cancer; osteosarcoma.

XX

OS Homo sapiens.
XX

FH Key Location/Qualifiers
FT Peptide 1..29
FT /label= Sig_peptide

FT Modified-site 250

FT /label= N-glycosylation_site

FT Modified-site 721

FT /label= N-glycosylation_site

FT Modified-site 919

FT /label= N-glycosylation_site

FT Domain 1253..1277

FT /label= Extracellular_domain

XX

PN WO9507935-A.
XX

PD 23-MAR-1995.
XX

PF 09-SEP-1994; 94WO-US10166.
XX

PR 14-SEP-1993; 93US-0122032.
XX

PA (MERI ) MERCK & CO INC.
XX

PI Rodan GA, Rutledge SJ, Schmidt A;
XX

DR WPI; 1995-131318/17.
DR N-PSDB; Q86478.
XX

PT Protein tyrosine phosphate protein PTP-OB specifically expressed
PT in bone cells - modulators of which are used to treat, e.g.
PT osteoporosis, and prevent and treat bone loss and cancer. 
XX 

PS Claim 1; Page 44-45; 63pp; English. 
XX 

CC PCR amplification of cDNA derived from human osteosarcoma 
CC Saos-2/B10 using primers based on conserved regions of protein 
CC tyrosine phosphatases and subsequent screening of a human

CC brain cDNA library yielded a cDNA clone (sequence given in

CC Q86473) that encoded a novel human protein, PTP-OB (R71726).

CC Recombinant PTP-OB was expressed in E. coli, yeast, insect

CC and mammalian cells.
XX

SQ Sequence 1911 AA;



Query Match 14.2%; Score 324.5; DB 16; Length 1911;

Best Local Similarity 23.2%; Pred. No. 1.8e-13;

Matches 145; Conservative 71; Mismatches 187; Indels 223; Gaps 27;

Qy 1 QIVAQGRTVTFPCETKGNPQPAVFWQKEGSQNLLFPNQPQQPNSRCSVSPTGDLTITNIQ 60 

II II :l h hhl I I 1 = 1 : I : III:: 
Db 42 qigvsgrvasfvcqatgdpkprvtwnkkgkk — vnsqrfetiefdesagavlriqplr 97 




Qy 61 R-SDAGYYICQALTVAGSILAKAQLEVTDVLTDRPP---PIILQGPANQTLAVDGTALLK 116

t III I I 1:1 I : I: I I I II : : II : 
Db 98 tprdenvyecvaqnsvgeitvhakltv -lredqlpsgf pnidmgpqlkwertrtatml 155



Qy 117 CKATGDPLPVISWLKEGFTFPGRDPRAT— IQE-QGTLQIKNLRISDTGTYTCVATSS 171 

' I hhl I 1:1 h I II I: I:: I III:: :| I I III: 
Db 156 caasgnpdpeitwfkd— flpvdpsasngrikqlrsgalqiesseetdqgkyecvatns 212 

Qy 172 SG EASWSA 179

Db 213 agvrysspanl^vrvrrvaprfsilpmsheimpggnvnitcvavgspmpyvkwmqgaedl 272 

Qy 180 VLDVTESGATISKNY DLSDLPGPPSKPQVTD 210 

l|::|: III : II I I II: 

Db 273 tpedaipvgrnvleltd--vkdsanyhpcvamsslgvieavaqitvkslpkapgtpmvte 330 

Qy 211 VTKNSVTLSWQPGTPGTLPASAYIIEAFSQSVSNSWQTVANHVKTTLYTVRGLRPNTIYL 270

I l:h:l II II hll hi :| : : II |:: II II: I 
Db 331 ntatsititwdsgnpd-pvsyyvieyksksqdgpyq-ikeditttrysigglspnseye 387 

Qy 271 FMVRAINPKVSVTQXKPQKNNGS-TWANVPLPPP PVQP— 307 

I 1:1 I: I I :: : I I II l|:| 
Db 388 iwsavn— sigqgppseswtrtgeqaparpprnvqarmlsattmivqweepvepngl 444 

Qy 308 LPGTELEHYAVEQQENGYDSDSWCPPLPVQTYLHQGLED ELEEDD 352 

: I : :| :| I II : ::| I II: 

Db 445 irgyrv-yytme pehpvgnwqkhnvddsllttvgslledetytvrvla 491 

Qy 353 DRVPTPPVRGVASSP--AISFGQQSTATLTPSPR 384

■ I : :|| I ::: II: II 

Db 492 ftsvgdgplsdpiqvktqqgvpgqpmnlraearsetsitlswspprqesiikyellfreg 551

Qy 385 EEMQP --MLQASP — XFTSSQRPR PTSP- 408 

I:::| : II II I I |::| 
Db 552 dhgrevgrtfdpttsywedlkpnteyafrlaarspqglgaftpwrqrtlqskpsappq 611 

Qy 409 - FSTDSNTSAALSQSQRPRPTKKHKG 433 

I II: 1111:11 
Db 612 dvkcvsvrstailvsvrppppethng 637 



RESULT 13 
W27225 

ID W27225 standard; Protein; 1911 AA. 
XX 

AC W27225; 
XX 

DT 19-DEC-1997 (first entry) 

XX 

DE Human protein tyrosine phosphatase PTP-OB. 
XX 

KW Protein tyrosine phosphatase; PTP-OB; PTPepsilon; osteoblast;
KW recombinant protein; growth; differentiation; brain; human.

XX

OS Homo sapiens.

XX



PN US5658756-A.

PD 19-AUG-1997.
XX

PF 14-SEP-1993; 93US-0122032.
XX

PR 01-DEC-1994; 94US-0348006.

PR 14-SEP-1993; 93US-0122032.
XX

PA (MERI ) MERCK & CO INC.
XX

PI Rodan GA, Rutledge SJ, Schmidt A;
XX

DR WPI; 1997-424232/39.

DR N-PSDB; T85389.
XX

PT DNA encoding protein tyrosine phosphatase PTP-OB - isolated from

PT human osteoblasts and useful for production of recombinant PTP-OB
XX

PS Claim 1; Column 23-34; 34pp; English.
XX 

CC The present sequence represents human protein tyrosine phosphatase 

CC (PTP-OB) protein. The DNA encoding this protein is useful for the 

CC production of the recombinant protein, which is a protein tyrosine 

CC phosphatase which may be involved in the growth and differentiation 

CC of osteoblasts and brain cells and is useful for identifying compounds 

CC that modulate PTP-OB activity and as a therapeutic agent for treating 

CC PTP-OB-related diseases.
XX

SQ Sequence 1911 AA;



Query Match 14.2%; Score 324.5; DB 18; Length 1911;

Best Local Similarity 23.2%; Pred. No. 1.8e-13;

Matches 145; Conservative 71; Mismatches 187; Indels 223; Gaps 27; 

Qy 1 QIVAQGRTVTFPCETKGNPQPAVFWQKEGSQNLLFPNQPQQPNSRCSVSPTGDLTITNIQ 60 

II II :!' I: hhl II hi : I : III:: 
Db 42 qigvsgrvasfvcqatgdpkprvtwnkkgkk--- -vnsqrfetiefdesagavlriqplr 97 

Qy 61 R-SDAGYYICQALTVAGSILAKAQLEVTDVLTDRPP-- -PIILQGPANQTLAVDGTALLK 116 

I ' I I'l I I |:M : I: I I I II : : II : 
Db 98 tprdenvyecvaqnsvgeitvhakltv -lredqlpsgf pnidmgpqlkwertrtatml 155 

Qy 117 CKATGDPLPVISWLKEGFTFPGRDPRAT— IQE-QGTLQIKNLRISDTGTYTCVATSS 171 

I l:hl I hi I: I II h h: I III:: :| I I lllhl 
Db 156 caasgnpdpeitwfkd---flpvdpsasngrikqlrsgalqiesseetdqgkyecvatns 212 

Qy 172 SG ; EASWSA 179 

:l ' I 
Db 213 agvrysspanl'yvrvrrvaprfsilpmsheimpggnvnitcvavgspmpyvkwmqgaedl 272 

Qy 180 -'VLDVTESGATISKNY DLSDLPGPPSKPQVTD 210 

J|::|: III : II I I II: 

Db 273 tpeddmpvgrnvleltd-vkdsanyhpcvamsslgvieavaqitvkslpkapgtpmvte 330 

Qy 211 VTKNSVTLSWQPGTPGTLPASAYIIEAFSQSVSNSWQTVANHVKTTLYTVRGLRPNTIYL 270 

I hl::|< II II hll 1:1 :| : : II |:: II l|: I 
Db 331 ntatsititwdsgnpd-pvsyyvieyksksqdgpyq-ikeditttrysigglspnseye 387 

Qy 271 FMVRAINPKVSVTQXKPQKNNGS-TWANVPLPPP PVQP— 307 

I 1:1 I: I I :: : I Ml Ihl 
Db 388 iwvsavn— sigqgppseswtrtgeqaparpprnvqarmlsattmivqweepvepngl 444 

Qy 308 LPGTELEHYAVEQQENGYDSDSWCPPLPVQTYLHQGLED ELEEDD 352 

: I : :| : I II : ::| I ||: 

Db 445 irgyrv-yytme pehpvgnwqkhnvddsllttvgslledetytvrvla 491

Qy 353 DRVPTPPVRGVASSP— AISFGQQSTATLTPSPR 384 

I : :|| I ::: II: II 

Db 492 ftsvgdgplsdpiqvktqqgvpgqpmnlraearsetsitlswspprqesiikyellfreg 551 

Qy 385 ■ EEMQP MLQASP XFTSSQRPR PTSP-- 408 
::: : II II I I |::|

Db 552 dhgrevgrtfdpttsyvvedlkpnteyafrlaarspqglgaftpvvrqrtlqskpsappq 611 

Qy 409 -FSTDSNTSAALSQSQRPRPTKKHKG 433
I I I: I II I : II 
Db 612 dvkcvsvrstailvswrppppethng 637 



RESULT 14 
W94027 

ID W94027 standard; Protein; 1911 AA. 
XX 

AC W94027; 
XX 

DT 01-APR-1999 (first entry) 
XX 

DE Human protein tyrosine phosphatase (PTP-OB).
XX

KW Protein tyrosine phosphatase; PTP; PTP-OB; bone; brain; cancer;
KW osteoporosis.
XX

OS Homo sapiens.
XX

PN US5866397-A.
XX

PD 02-FEB-1999.
XX

PF 14-FEB-1997; 97US-0800825.
XX

PR 01-DEC-1994; 94US-0348006.
PR 14-SEP-1993; 93US-0122032.
PR 14-FEB-1997; 97US-0800825.
XX

PA (MERI ) MERCK & CO INC.
XX

PI Rodan GA, Rutledge SJ, Schmidt A;
XX

DR WPI; 1999-141930/12.
DR N-PSDB; X06095.
XX

PT Protein tyrosine phosphatase denoted PTP-OB - useful for drug
PT screening
XX

PS Claim 1; Columns 23-32; 34pp; English.
XX

CC This represents a human protein tyrosine phosphatase (PTP) denoted as
CC PTP-OB, produced by bone and brain cells. A recombinant host cell
CC transfected or transformed with a nucleic acid vector comprising the
CC nucleic acid can be used for the production of the PTP-OB polypeptide.
CC The protein can be used to screen for modulators of PTP-OB activity,
CC which might be useful for treating e.g. osteoporosis and cancer.
XX

SQ Sequence 1911 AA;



Query Match 14.2%; Score 324.5; DB 20; Length 1911;

Best Local Similarity 23.2%; Pred. No. 1.8e-13;

Matches 145; Conservative 71; Mismatches 187; Indels 223; Gaps 27; 

Qy 1 QIVAQGRTVTFPCETKGNPQPAVFWQKEGSQNLLFPNQPQQPNSRCSVSPTGDLTITNIQ 60 

II II :| |: hl:| I J hi I : III 

Db 42 qigvsgrvasfvcqatgdpkprvtwnkkgkk---vnsqrfetiefdesagavlriqplr 97 

Qy 61 R-SDAGYYICQALTVAGSILAKAQLEVTDVLTDRPP---PIILQGPANQTLAVDGTALLK 116

I I I I I I 1:1 I : I: I I I II : : II : 
Db 98 tprdenvyecvaqnsvgeitvhakltv--lredqlpsgfpnidmgpqlkwertrtatml 155 

Qy 117 CKATGDPLPVISWLKEGFTFPGRDPRAT--IQE--QGTLQIKNLRISDTGTYTCVATSS 171 

I 1:1: I 1:1 I: I II I: |:: I III:: :l I I lllhl 
Db 156 caasgnpdpeitwfkd—flpvdpsasngrikqlrsgalqiesseetdqgkyecvatns 212 



Qy 172 SG-- 

:| 



■ 179 



Db 213 agvrysspanlyvrvrrvaprfsilpmsheimpggnvnitcvavgspmpyvkwmqgaedl 272 

Qy 180 VLDVTESGATISKNY DLSDLPGPPSKPQVTD 210 

!l|::|: III : II I I lh 

Db 273 tpeddmpvgrnvleltd--vkdsanyhpcvamsslgvieavaqitvkslpkapgtpmvte 330

Qy 211 VTKNSVTLSWQPGTPGTLPASAYIIEAFSQSVSNSWQTVANHVKTTLYTVRGLRPNTIYL 270

I :!:: II II I = M hi i| : : II h: II lh I 
Db 331 ntatsititwdsgnpd-pvsyyvieyksksqdgpyq-ikeditttrysigglspnseye 387 

Qy 271 FMVRAINPKVSVTQXKPQKNNGS - TWANVPLPPP PVQP — 307 

I hi h I I :: : I III Ihl 
Db 388 iwvsavn---s'igqgppsesvvtrtgeqaparpprnvqarmlsattmivqweepvepngl 444 

Qy 308 LPGTELEHYAVEQQENGYDSDSWCPPLPVQTYLHQGLED ELEEDD 352 

: I : :| :| III: ::| I lh 

Db 445 irgyrvyytme pehpvgnwqkhnvddsllttvgslledetytvrvla 491 

Qy 353 DRVPTPPVRGVASSP--AISFGQQSTATLTPSPR 384

I : ill I ::: lh II 

Db 492 ftsvgdgplsdpiqvktqqgvpgqpmnlraearsetsitlswspprqesiikyellfreg 551

Qy 385 '- EEMQP MLQASP — XFTSSQRPR PTSP— 408 

h::| : II II I I h:| 
Db 552 dhgrevgrtfdpttsywedlkpnteyafrlaarspqglgaftpwrqrtlqskpsappq 611 

Qy 409 -FSTDSNTSAALSQSQRPRPTKKHKG 433 

I li: I II I : I I 
Db 612 dvkcvsvrstailvswrppppethng 637 



RESULT 15 
R72858 

ID R72858 standard; Protein; 1501 AA. 
XX 

AC R72858; 
XX 

DT 21-NOV-1995 (first entry) 
XX 

DE Rat receptor type-protein tyrosine phosphatase sigma.
XX

KW Receptor type tyrosine phosphatase sigma; cell; differentiation;
KW metabolism; cell cycle; behaviour; motility; contact inhibition;
KW virus; inflammation; cellular transformation; cancer;
KW neuroblastomas; antibody; detection; quantification.
XX

OS Rattus rattus.
XX

PN WO9509656-A.
XX

PD 13-APR-1995.
XX

PF 30-SEP-1994; 94WO-US11163.
XX

PR 01-OCT-1993; 93US-0130570.
XX

PA (UYNY ) UNIV NEW YORK STATE.
XX

PI Schlessinger J, Yan H;
XX

DR WPI; 1995-155068/20.
DR N-PSDB; Q86902.
XX

PT Novel, isolated receptor-type protein tyrosine phosphatase-sigma
PT and encoding DNA, useful e.g. for detecting neuro-blastomas

XX

PS Claim 2; Figure 2; 105pp; English.
XX 

CC Ligands binding to the receptor-type protein tyrosine phosphatase 

CC sigma (RPTP sigma) protein may be used as drugs to modulate cellular

CC processes, such as differentiation, metabolism and cell cycle 

CC control, and cellular behaviour such as motility and contact 

CC inhibitions. In addition they may affect abnormal or potentially 
CC deleterious processes such as virus-receptor interactions,

CC inflammation and cellular transformation to a cancerous state. They

CC may also be used to treat RPTP sigma related neuronal disorders such

CC as neuroblastomas. The DNA encoding the RPTP sigma is useful for

CC the diagnosis of diseases resulting from its aberrant expression.

CC Antibodies directed against RPTP sigma may be used in detection and

CC quantitative analysis.
XX

SQ Sequence 1501 AA;



Query Match 14.2%; Score 323.5; DB 16; Length 1501;

Best Local Similarity 25.4%; Pred. No. 1.6e-13;

Matches 124; Conservative 62; Mismatches 190; Indels 113; Gaps 21;

Qy 1 QIVAQGRTVTFPCETKGNPQPAVFWQKEGSQNLLFPNQPQQPNSRCSVSPTGDLTITNIQ 60

::i : lj I I III I !: I I: I I I I :| I I : : 
Db 144 kvvertrtatmlcaasgnpdpeitwfkd-----flpvdpsasngrikqlrsgalqiesse 198

Qy 61 RSDAGYYICQALTVAG---SILAKAQLEVTDVLTDRPPPIILQGPANQTLAVDGTALLKC 117
:| Mil I II I II : I I I II I : : I : I 
Db 199 etdqgkyecvatnsagvrysspanlyvrvrrv---aprfsil--pmsheimpggnvnitc 253 

Qy 118 KATGDPLPVISWLKEGFTFPGRDPRATIQEQGTLQIKNLRISDTGTYTCVATSSSGEASW 177

I 1:1 : I:: I I :: : I: Mill II I 

Db 254 vavgspnipyvkwmqgaedltpecldnipv — grnvleltdvkdsanytcvamsslg — 305 

Qy 178 SAVLDVTESGATISKNYDLSDLPGPPSKPQVTDVTKNSVTLSWQPGTPGTLPASAYIIEA 237 

I j: II: : l|l I I II: I |:|::| II II Mil 
Db 306 vi|avaqit-"-vksljpkapgtpvvtentatsitvtwdsgnpd"pvsyyviey 354 

Qy 238 FSQSVSNSWQTVANHVKTTLYTVRGLRPNTIYLFMVRAINPKVSVTQXKPQKNNGS-TWA 296

hi : : II h|: -II II: I I hi h I I :: : I 
Db 355 ksksqdgpyq-ikeditttrys;igglspnseyeiwvsavn—sigqgppseswtrtge 410 

Qy 297 KVPLPPP | PVQP — LPGTELEHYAVEQQENGYDSDSWCPPL 334 

I I I 11:1 : I : :| :| I 
Db 411 qapasaprnvqaolsattinivjjweepvepnglirgyrv-yytme peh 457 

Qy 335 PVQTYLHQGLED ELEEDD DRVPTPPVRGVASSP" 367 

II : J ::| I ! l h I : :|| I 

Db 458 pvgnwqkhnvddsllttvgslledetytvrvlaftsvgdgplsdpiqvktqqgvpgqpmn 517 

Qy 368 --AISFGQQSTATLTPSPREEHQPMLQASPXFTSSQRPR' 



I : ',: I : 1 1 : 1 
518 Iraeaksetsiglswsaprqe 

416 SAALSQSQR 424 

I :;l 
576 eyafrlaar 584 



Search completed: January 22, 2001, 12:19:47 
Job time: 1744 sec



January 22, 2001, 12:27:16 ; Search time 325.28 Seconds
(without alignments)

90.595 Million cell updates/sec

Title: US-09-540-245A-19

Perfect score: 2280

Sequence: 1 QIVAQGRTVTFPCETKGNPQ..........TSAALSQSQRPRPTKKHKGG 434

Scoring table:

Gapop 10.0 , Gapext 0.5

Searched: 195891 seqs, 67900655 residues

Total number of hits satisfying chosen parameters:

Minimum DB seq length: 0

Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
Maximum Match 100%
Listing first 45 summaries

Database : PIRJ6:*
1: pir1:*
2: pir2:*
3: pir3:*
4: pir4:*



Pred. No. is the number of results predicted by chance to have a
score greater than or equal to the score of the result being printed,
and is derived by analysis of the total score distribution.



Result           Query
   No.   Score   Match  Length  DB  ID      Description

     1     911    40.0    1612   2  T30805  dutt1 protein - mo
     2     911    40.0    1651   2  T14160  transmembrane rece
     3   755.5    33.1    1344   2  T14316  rig-1 protein - mo
     4     579    25.4    1273   2  T42405  sax-3 protein - Ca
     5     388    17.0     874   2  T29548  hypothetical prote
     6   331.5    14.5    2222   2  T13924  sdk protein - frui
     7     326    14.3    1907   2  S50893  protein-tyrosine-p
     8     325    14.3    1499   2  I50212  protein-tyrosine-p
     9   323.5    14.2    1501   2  I58148  protein-tyrosine-p
    10   323.5    14.2    1863   2  S46217  protein-tyrosine-p
    11     317    13.9    1894   2  C54689  protein-tyrosine-p
    12   308.5    13.5    1897   1  TDHOXK  leukocyte antigen-
    13   307.5    13.5    1912   2  A56178  protein-tyrosine-p
    14   305.5    13.4    1898   2  S46216  leukocyte antigen-
    15   297.5    13.0    1232   2  T43027  neural cell adhesi
    16   296.5    13.0    1277   2  T30532  neural cell adhesi
    17   295.5    13.0    2029   1  TDFFLK  protein-tyrosine-p
    18   292.5    12.8    1262   1  B48758  protein-tyrosine-p
    19   292.5    12.8    1496   1  A48758  protein-tyrosine-p
    20     282    12.4    1443   2  I50600  neogenin - chicken
    21     276    12.1    1028   2  I58164  BIG-1 protein - ra
    22     274    12.0    1197   2  T30581  neural cell adhesi
    23     269    11.8    1028   2  A53449  plasmacytoma-assoc
    24     268    11.8    1257   1  A41060  neural cell adhesi
    25   266.5    11.7    1427   2  I51669  tumor suppressor -
    26   265.5    11.6    1260   1  S05479  neural cell adhesi
    27   264.5    11.6    1256   2  T03096  CDO protein - rat
    28   262.5    11.5    1272   2  S26180  neurofascin - chic
    29   261.5    11.5    1259   2  S36126  neural cell adhesi
    30     258    11.3    1239   1  A32579  neuroglian - fruit
    31   256.5    11.2     423   2  T29549  hypothetical prote
    32     256    11.2    1040   2  A49356  transient axonal g
    33   255.5    11.2    1375   2  T13822  frazzled gene prot
    34     253    11.1    1018   2  JC4211  neural adhesion pr
    35   252.5    11.1    1259   2  A43425  Bravo/Nr-CAM cell
    36   251.5    11.0    1036   2  S22383  axonin 1 precursor
    37     251    11.0     761   1  IJHUNG  neural cell adhesi
    38     251    11.0    1880   2  T18531  tractin - medicina
    39   250.5    11.0     871   1  I48696  protein-tyrosine k
    40   250.5    11.0     881   1  I48697  protein-tyrosine k
    41   250.5    11.0    1268   1  A39640  neural cell adhesi
    42     250    11.0    1091   2  A58532  glial cell membran


43 


249 


10.9 


1447 


2 


A54100 


tumor suppressor p 


44 


248.5 


10/9 


1437 


2 


T31093 


probable protein-t 


45 


248 


10.9 


725 


1 


IJMSNG 


neural cell adhesi 
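
The Query Match column appears to be the hit score expressed as a percentage of the perfect score (2280 for this query). The sketch below is an inference from the printed numbers, not a statement of how the search program computes the column; the function name is hypothetical.

```python
def query_match_percent(score, perfect_score):
    """Score as a percentage of the query's perfect (self-match) score.

    Assumption: this is how the "Query Match" column is derived; it is
    consistent with the rows of the summary table but is not documented here.
    """
    return 100.0 * score / perfect_score


PERFECT_SCORE = 2280  # "Perfect score" from the search header

# (score, printed Query Match) pairs copied from the summary table.
rows = [(911, 40.0), (755.5, 33.1), (579, 25.4), (388, 17.0), (248, 10.9)]
for score, printed in rows:
    computed = query_match_percent(score, PERFECT_SCORE)
    print(f"score {score:>6}: computed {computed:5.1f}%  printed {printed:5.1f}%")
```

The same relation is a quick way to spot digits damaged in scanning: a row whose printed percentage disagrees with its score is likely misread.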



RESULT 1
T30805
dutt1 protein - mouse
N;Alternate names: transmembrane receptor protein Robo1 homolog
C;Species: Mus musculus (house mouse)
C;Date: 22-Oct-1999 #sequence_revision 22-Oct-1999 #text_change 22-Oct-1999
C;Accession: T30805
R;Wu, M.C.; Lowe, N.; Fordham, R.; Rabbitts, P.
submitted to the EMBL Data Library, July 1998
A;Description: The mouse homologue of human DUTT1/H-robo1 gene: protein sequence a
A;Reference number: Z20879
A;Accession: T30805
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: mRNA
A;Residues: 1-1612 <UM>
A;Cross-references: EMBL:Y17793; NID:e1329712; PID:e1329713; PIDN:CAA76850.1
A;Experimental source: brain
C;Genetics:
A;Gene: dutt1
A;Map position: 16

Query Match 40.0%; Score 911; DB 2; Length 1612;
Best Local Similarity 22.8%; Pred. No. 1.7e-46;
Matches 254; Conservative 55; Mismatches 120; Indels 686; Gaps

Qy 1 QIVAQGRTVTFPCETKGNPQPAVFWQKEGSQNLLFPNQPQQPNSRCSVSPTGDLTITNIQ 60 

:| llllll II lllll|:||::|||||||| II I :|| III lll|l|||:| 

Db 321 QWALGRTVTFQCEATGNPQPAIFWRREGSQNLLFSrQPPQSSSRFSVSQTGDLTITNVQ 380 

Qy 61 RSDAGYYICQALTVAGSILAKAQLEVTDVLTDRPPPIILQGPANQTLAVDGTALLKCKAT 120 

III MM I Mill: II llllll: l!l||:| III llhlllll I II 

Db 381 RSDVGyyiCQTLNVAGSIITKAYLEVTDVIADRPPPVIRQGPVNQTVAVDGTLILSCVAT 440 

Qy 121 GDPLPVISWLKEGFTFPGRDPRATIQEQGTLQIKNLRISDTGTYTCVATSSSGEASWSAV 180 

I II I I 1:1 :| I M III: :: III III |:: lllhlll 

Db 441 GSPAPTILWRKDGVLVSTQDSRIKQLESGVLQIRYAKLGDTGRYTCTASTPSGEATWSAI 500 

Qy 181 LDVTESGATIS-KNYDLSDLPGPPSKPQVTDVTKNSVTLSWQPGTPGTLPASAyilEAF 236 

::! I I : : I : :| 1 1 1 1 : 1 1 1 1 : 1 1 : || 1 1 1 1 1 ::!lllll 

Db 501 IEVQEFGVPVQPPRPTDPNLIPSAPSKPEVTDVSKNTVTLSWQPNLNSGATPTSYIIEAF 560 

Qy 239 SQSVSNSWQTVANHVKTTLYTVRGLRPNTIYLFMVRAIN 277 

I : :llll : I :lll : ::||:|l lllhlll I 

Db 561 SHASGSSWQTAAENVKTETFAIKGLKPNAIYLFLVRAANAYGISDPSQISDPVKTQDVPP 620 

Qy 278 277 

Db 621 T SQGVDHKQVQRELGNWLHLHNPT ILSSSSVEVHWTVDQQSQ Y IQG YKI LYRPSGASHG 680 

Qy 278 - pk 279 
Db 681 ESFM,VFEVRTPTRNSVVIPDLRRGVNYEIRARPFFNEFQGADSEIRFARTLEEAPSAPP 740 

Qy 280 VSVT 283 

III 

Db 741 RSVTVSKNDGNGTAILVTWQPPPEDTQNGMVQEYKVWCLGNETKyHINKTVDGSTFSWI 800 

Qy 284 283 

Db 801 PSLVPGIRYSVEVAASTGAGPGVKSEPQFIQLDSHGHPVSPEDQVSLAQQISDWRQPAF 860 

Qy 284 283 

Db 861 IAGIGAACWIILMVFSIWLyRHRKRRNGLTSTYAGIRKVPSFTFTPTVTYQRGGEAVSSG 920 

Qy 284 283 

Db 921 GRPGLLNISEPATQPWLADTWPNTGNNHNDCSINCCTAGNGNSDSNLTTYSRPADCIAljy 980 

Qy 284 283 

Db 981 NNQLDNKQTNLMLPESTVYGDVDLSNKINEMKTFNSPNLKDGRFVNPSGQPTPYATTQLI 1040 

• 284 QXKPQ 288 
• I II: 
Db 1041 QANLSNNMNNGA^pSSEKHWKPPGQQKPEVAPIQYNIMEQNKLNKDYRANDTIPPTIPYN 1100 

Qy 289 RNNGSTWANVPLPPPPVQPLPGTE 312 

I I II:: lllll I I : 
Db 1101 QSYDQNTGGSYNSSDRGSSTSGSQGHRKGARTPRAPRQGGMNWADL-LPPPPAHPPPHSN 1159 

Qy 313 LEHYAVEQQENGYDSDSWCPPLPVQTYLHQGLEDEL-EEDDDRVPTPPVRGVASSP-AIS 370 

II: I: II : II I II I III ll:|:| lllllll III |:| 
Db 1160 SEEYNMSVDES-YDQEMPCPVPPAPMYLQQ— DELQEEEDERGPTPPVRGAASSPAAVS 1215 

Qy 371 FGQQSTATLTPSPREEMQPMLQASP 395 

: llllllllll:||:IHII I 
Db 1216 YSHQSTATLTPSPQEELQPMLQDCPEDLGHMPHPPDRRRQPVSPPPPPRPISPPHTYGYI 1275 

Qy 396 - 395 

Db 1276 SGPLVSDMDTDAPEEEEDEADMEVAKMQTRRLLLRGLEQTPASSVGDLESSVTGSMINGW 1335 

Qy 396 '- XFTS 399 

I : 

Db 1336 GSASEEDNISSGRSSVSSSDGSFFTDADFAQAVAAAAEYAGLKVARRQMQDAAGRRHFHA 1395 

Qy 400 SQRPRPTSPFSTDSNTSAALSQSQRPRPTKKHKGG 434 

II llllll lllll II : I II :||: I 
Db 1396 SQCPRPTSPVSTDSNMSAVVIQKARPAKRQKHQPG 1430 



i 



RESULT 2
T14160
transmembrane receptor protein Robo1 - rat
C;Species: Rattus norvegicus (Norway rat)
C;Date: 20-Sep-1999 #sequence_revision 20-Sep-1999 #text_change 20-Sep-1999
C;Accession: T14160
R;Kidd, T.; Brose, K.; Mitchell, K.J.; Fetter, R.D.; Tessier-Lavigne, M.; Goodman, C.S.
Cell 92, 205-215, 1998
A;Title: Roundabout controls axon crossing of the CNS midline and defines a novel subfam
A;Reference number: Z17897; MUID:98117249
A;Accession: T14160
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: mRNA
A;Residues: 1-1651 <KID>
A;Cross-references: EMBL:AF041082; NID:g2811215; PID:g2811216; PIDN:AAC39960.1
C;Function:
A;Description: appears to function as the gatekeeper controlling midline crossing
C;Keywords: transmembrane protein

Query Match 40.0%; Score 911; DB 2; Length 1651;
Best Local Similarity 23.0%; Pred. No. 1.8e-46;
Matches 256; Conservative 57; Mismatches 117; Indels 684; Gaps 12;

Qy 1 QIVAQGRTVTFPCETRGNPQPAVFWQKEGSQNLLFPNQPQQPNSRCSVSPTGDLTITNIQ 60 

1:11 llllll II l!llll:ll::immi II I :ll III lllll:ll:l 
Db 360 QWALGRTVTFQCEATGNPQPAIFWRREGSQNLLFSYQPPQSSSRFSVSQTGDLTVTNVQ 419 

Qy 61 RSDAGYYICQALTVAGSILAKAQLEVTDVLTDRPPPIILQGPANQTLAVDGTALLKCKAT 120 

III llllll,') llllh II llllll: 11111:1 III lll:llill I I II 
Db 420 RSDVGYYICQTLNVAGSIITKAYLEVTDVIADRPPPVIRQGPVNQTVAVDGTLTLSCVAT 479 

Qy 121 GDPLPVISWLKEGFTPPGRDPRATIQEQGTLQIRNLRISDTGTYTCVATSSSGEASWSAV 180 

11:111 1:1 :M II III: :: III III h: lllhlll 
Db 480 GSPVPTILWRKDGVLVSIQDSRIRQLESGVLQIRYAKLGDTGRYTCTASIPSGEATWSAY ' 539 

Qy 181 LDVTESGATIS-KNYDLSDLPGPPSKPQVTDVTKNSVTLSWQPGTPGTLPASAYIIEAF 238 

::| I : : I : :! 1 1 1 1 : 1 1 1 1 : 1 1 : 1 1 1 III ::l!llll 
Db 540 IEVQEFGVPVQPPRPTDPNLIPSAPSKPEVTDVSKNTVTLLWQPNLNSGATPTSYIIEAF 599 

239 SQSVSNSWQTVANHVKITLYTVRGLRPNTIYLFMVRAIN 277 

I : : I M I ■ I :||| : ::||:|| lllhlll I 
600 SHASGSSWQTVAENVRTETFAIRGLRPNAIYLFLVRAANAYGISDPSQISDPVRTQDVPP 659 

278 '• 277 

660 TIQGVDHKQVQRELGNWLHLHNPTILSSSSVEVHWTVDQQSQYIQGYKILYRPSGASHG 719 



720 ESEWLVFEVRTPTKNSWIPDLRKGVNYEIKARPFFNEFQGADSEIKFAKILEERPSAPP 779 

280 VSVTQXR 286 

III I 

780 RSVTVSKNDGHGTAILVTWQPPPEDTQNGMVQEYKVWCLGNETRYHINRTVDGSTFSWI 839 
287 - PQ 288 



Db 840 PFLVPGIRYSVEVAASTGAGPGVKSEPQFIQLDSHGNPVSPEDQVSLAQQISDWKQPAF 8 



Qy 289 

Db 900 IAG IGAACWI I LMVFS IWLYRHRKKRNGLS ST YAG I RKVPS FT FT PT VT YQRGGEAVS SG 959 



--RNNG-- 

II 



Qy 293 

Db 960 GRPGLLNISEPATQPWLADTWPNTGNSHNDCSINCCTASNGNSDSNLTTYSRPADCIANY 1019 



--STWAN-- 

II I 



Qy 298 , 297 

Db 1020 NNQLDNKQTNIJMLPESTVYGDVDLSNKINEMKTFNSPNLKDGRFVNPSGQPTPYATTQLI 1079 

Qy 298 - VP- 299 

Db 1080 QANLINNMNNGGGDSSEKHWRPPGQQKQEVAPIQYNIMEQNKLNKDYRANDTILPTIPYN 1139 

Qy 300 LPPPPVQPLPGTEL 313 

lllll I I : 

Db 1140 HSYDQNTGGSYNSSDRGSSTSGSQGHKKGARTPKAPRQGGMNWADLLPPPPAHPPPHSNS 1199 

Qy 314 EHYAVEQQENGYDSDSWCPPLPVQTYLHQGLEDELEEDD-DRVPTPPVRGVASSP-AISF 371 

I I:: |: : ll : II I : II I lllll:: :l lllllll Mil l:|: 
Db 1200 EEYSMSVDES i YDQEMPCPVPPARMYLQQ---DELEEEEAERGPTPPVRGAASSPAAVSY 1255 

Qy 372 GQQSTATLTPSPREEMQPMLQASP 395 

lllllllllhlhlllll I 

Db 1256 SHQSTATLTPSPQEELQPMLQDCPEDLGHMPHPPDRRRQPVSPPPPPRPISPPHTYGYIS 1315 

Qy 396 ; 395 

Db 1316 GPLVSDMDTD; } PEEEEDEADMEVAKMQTRRLLLRGLEQTPASSVGDLESSVTGSMINGWG 1375 

Qy 396 : XFTSS 400 

I :| 

Db 1376 SASEEDNISSGRSSVSSSDGSFFTDADFAQAVAAAAEYAGLKVARRQMQDAAGRRHFHAS 1435 



i Best Available Copy 

Mon Jan 22 13:04:47 2001 us-09-540-245a-19.rpr Page 3 



401 QRPRPTSPFSTDSNTSAALSQSQRPRPTKKHKGG 434 

I llllll lllll III: I II :ll: I 
1436 QCPRPTSPVSTDSNMSAAVIQKARPTKKQKHQPG 1469 



RESULT 3
T14316
rig-1 protein - mouse
C;Species: Mus musculus (house mouse)
C;Date: 20-Sep-1999 #sequence_revision 20-Sep-1999 #text_change 20-Sep-1999
C;Accession: T14316
R;Yuan, S.S.F.; Cox, L.A.; Dasika, G.K.; Lee, E.Y.H.P.
submitted to the EMBL Data Library, April 1998
A;Reference number: Z17975
A;Accession: T14316
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: mRNA
A;Residues: 1-1344 <YUA>
A;Cross-references: EMBL:AF060570; NID:g4206385; PID:g4206386; PIDN:AAD11628.1

Query Match 33.1%; Score 755.5; DB 2; Length 1344;
Best Local Similarity 39.6%; Pred. No. 2.4e-37;
Matches 176; Conservative 55; Mismatches 141; Indels 73; Gaps

Qy 1 QIVAQGRTVTFPCETKGNPQPAVFWOKEGSQNILFPNQPQQPNSRCSVSPIGDLTITNIQ 60 

I II I 1:1 lllllll 11:11111111 lllhli II I UNI II :: 
Db 334 QTVAPGANVSFQCETKGNPPPAIFWQKEGSQVLLFPSQ^LQPMGRLLVSPRGQLNITEVK 393 

Oy 61 RSDAGYYICQALTVAGSIIjARAQLEVTDVLTDRPPPIi|qGPANOTLAVDGTALLKCKAT 120 

I ll|:||l::||l!lllll II: I 1 1 1 ill 1 1 1 1 1 1 1 : : I |: 
Db 394 IGDGGYYVCQAVSVAGSILAKALLEIKGASIDGLPPIISQGPMQTLVLGSSVWLPCRVI 453 

Qy 121 GDPLPVISWLKEGFTFPGRDPRATIQEQGTLQIKNLRISDTGTYTCVATSSSGEASWSAV 180 

hi I I I I: I I : : : III I ::: I I hill II 1 1 1 : 1 : : 
Db 454 GNPQPNIQWKKDERWLQGDDSQFNLMDNGTLHIASIQEMDMGFYSCVAKSSIGEATWNSW 513 

Qy 181 LDVTES-GATISKNYDLSDLPGPPSKPQVTDVTKNSVTaSWQPGTPGTLPASAYIIEAFS 239 

I ! II: h llllhl Ihll 1 1 : 1 j: ! : ! h:hlMM 
Db 514 LRKOEDWGASPGPATGPSNPPGPPSOPIVTEVTANSIT|TWRPNPQSGATATSYVIEAFS 573 

Oy 240 QSVSNSWQTVANHVKTTLYTVRGLRPNTIYLFMVRAIN- - - -PKVSVTQXKPQKNNGSTW 295 

h hhlll: I: lh 1 1 : H 1 1 1 1 ! : 1 1 1 : : I | : | 
Db 574 QAAGNTWRTVADGVQLET YTISGLQPNTIYLFLVRAVGAKGLSEPSPVSEPVQTQDSS - ' 631 

fc> 296 ANVPLPPPPVQPLPGTE * LEHYAVEQQE -\ N 323 

W i i i i i ii ii 

Db 632 — LSRPAEDPWKGQRGLAEVAV]WQEPTVLGPRTLQVSWTVDGPV<tt.VQGFRVSWRIA 687 



Db 



324 GYDSDSW CPP LPVQT YLHQGLEDE — LEEDDDRVPT 357 

I I II II : II :H I : h 

688 GLDQGSWTMLDLQSPHKQSTVLRGLPPGAQIQIKVQVQGQEGLGAESPFVTRSIPEEAPS 747 



Oy 358 PPVRGVASSPAISFGQQSTATLTPS 382 

I =11 h: I :::| I 
Db | 748 GPPQGV — AVALGGDRNSSVTVS 768 



RESULT 4
T42405
sax-3 protein - Caenorhabditis elegans
C;Species: Caenorhabditis elegans
C;Date: 03-Dec-1999 #sequence_revision 03-Dec-1999 #text_change 21-Jul-2000
C;Accession: T42405
R;Zallen, J.A.; Yi, B.A.; Bargmann, C.I.
Cell 92, 217-227, 1998
A;Title: The conserved immunoglobulin superfamily member SAX-3/Robo directs multiple a
A;Reference number: Z22160; MUID:98117250
A;Accession: T42405
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: mRNA
A;Residues: 1-1273 <ZAL>
A;Cross-references: EMBL:AF041053; NID:g2804779; PIDN:AAC38848.1; PID:g2804780
C;Genetics:
A;Note: sax-3
C;Function:
A;Description: sax-3 function is required at the time of axon guidance

Query Match 25.4%; Score 579; DB 2; Length 1273;
Best Local Similarity 40.8%; Pred. No. 6.8e-27;
Matches 124; Conservative 48; Mismatches 114; Indels 18; Gaps 6;
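
The Best Local Similarity figure appears to be the number of identical positions divided by the total number of alignment columns (Matches + Conservative + Mismatches + Indels). This relation is inferred from the printed statistics of several results in this report and is offered as a sketch, not as the program's documented definition; the function name is hypothetical.

```python
def best_local_similarity(matches, conservative, mismatches, indels):
    """Percentage of alignment columns that are exact matches.

    Assumption: the alignment length is the sum of the four printed counts.
    """
    columns = matches + conservative + mismatches + indels
    return 100.0 * matches / columns


# (Matches, Conservative, Mismatches, Indels, printed %) from this report.
examples = [
    (124, 48, 114, 18, 40.8),    # RESULT 4
    (256, 57, 117, 684, 23.0),   # RESULT 2
    (83, 38, 81, 12, 38.8),      # RESULT 5
]
for m, c, x, g, printed in examples:
    value = best_local_similarity(m, c, x, g)
    print(f"computed {value:.1f}%  printed {printed:.1f}%")
```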

Qy 1 QIVAQGRTVTFPCETKGNPQPAVFWQKEGSQNLLFPNQPQQPNSRCSVSPTGDLTITNIQ 60 

I I M I'M llllll III hlllh : I Hill III : 
Db 326 QSVPAGGTATFECTLVGQPSPAYFWSKEGQQDLLFPSY-VSADGRTKVSPTGTLTIEEVR 384 

Qy 61 RSDAGYYICQALTVAGSILAKAQLEVTDVLTD RPPPIILQGPANQTLAVDGTALL 115 

: I I hi : III hll 1 : 1 1 Ml I I Ml I Ml 

Db 385 QVDEGAYVCAGMNSAGSSLSKAALKVTTKAVTGNTPAKPPPTIEHGHQNQTLMVGSSAIL 444 

Qy 116 KCKATGDPLPVISWLKEGFTFPGRDPRATIQEQGTLQIKNLRISDTGTYTCVATSSSGEA 175 

hhl I MMM II: hi I M III IMI : lh 
Db 445 PCQASGKPTPGISWLRDGLPIDITDSRISQHSTGSLHIADLKKPDTGVYTCIAKNEDGES 504 

Qy 176 SWSAVLDVTE-SGATISKNYDLSDLPGPPSKPQVTDVTKNSVTLSWQ-PGTPGTLPASA 232 

Ml II • ■ I I : I IM h:| : :|| lllllll: 
Db 505 TWSASLTVEDHTSNAQFVRMPDPSNFPSSPTQPIIVNVTDTEVELHWNAPSTSGAGPITG 564 

Qy 233 YIIEAFSQSVSNSWQTVANHVKTTLYTVRGLRPNTIYLFMVRAIN PKVS—VT 283 

III: :| : :| : ::| :| I ::||:|: hhHI I I II II 
Db 565 YI IQY YS PDLGQTWFN IPDYVASTEYRIKGLKPSHS YMFVIRAENEKG IGT PSVSSALVT 624 

Qy 284 QXKP 287 

II 

Db 625 TSKP 628 
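
Alignments such as the one above are local alignments scored with separate gap-opening and gap-extension penalties (Gapop 10.0, Gapext 0.5 in the header). The following is a minimal sketch of a Smith-Waterman score with affine gaps (the Gotoh recurrence). It is not the search program's implementation: the match/mismatch values are toy numbers standing in for a substitution matrix, and the convention that a gap of length k costs gap_open + (k - 1) * gap_extend is an assumption.

```python
def sw_affine_score(a, b, match=5.0, mismatch=-4.0, gap_open=10.0, gap_extend=0.5):
    """Best local alignment score of strings a and b with affine gap costs.

    Assumed convention: a gap of length k costs gap_open + (k - 1) * gap_extend.
    match/mismatch are toy scores, not a real substitution matrix.
    """
    neg_inf = float("-inf")
    cols = len(b)
    h_prev = [0.0] * (cols + 1)      # H values for the previous row
    f_prev = [neg_inf] * (cols + 1)  # best score ending in a vertical gap
    best = 0.0
    for i in range(1, len(a) + 1):
        h_cur = [0.0] * (cols + 1)
        f_cur = [neg_inf] * (cols + 1)
        e = neg_inf                  # best score ending in a horizontal gap
        for j in range(1, cols + 1):
            # Open a new gap from H, or extend the gap already running.
            e = max(h_cur[j - 1] - gap_open, e - gap_extend)
            f_cur[j] = max(h_prev[j] - gap_open, f_prev[j] - gap_extend)
            s = match if a[i - 1] == b[j - 1] else mismatch
            # Local alignment: never let a cell drop below zero.
            h_cur[j] = max(0.0, h_prev[j - 1] + s, e, f_cur[j])
            best = max(best, h_cur[j])
        h_prev, f_prev = h_cur, f_cur
    return best


# Nine matching residues around a one-residue gap: 9 * 5 - 10 = 35.
print(sw_affine_score("ACDEFGHIKL", "ACDEFHIKL"))
```

Only two rows of the matrix are kept, which is enough to obtain the score; recovering the alignment itself would require storing traceback information.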



RESULT 5
T29548
hypothetical protein ZK377.2 - Caenorhabditis elegans
C;Species: Caenorhabditis elegans
C;Date: 15-Oct-1999 #sequence_revision 15-Oct-1999 #text_change 18-Feb-2000
C;Accession: T29548
R;Nhan, M.; Hawkins, J.
submitted to the EMBL Data Library, February 1997
A;Description: The sequence of C. elegans cosmid ZK377.
A;Reference number: Z20639
A;Accession: T29548
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: DNA
A;Residues: 1-874 <NHA>
A;Cross-references: EMBL:U88183; PIDN:AAB52657.1; GSPDB:GN00028; CESP:ZK377.2
A;Experimental source: strain Bristol N2; clone ZK377
C;Genetics:
A;Gene: CESP:ZK377.2
A;Map position: X
A;Introns: 91/2; 356/1; 452/1; 701/3; 746/3; 850/1

Query Match 17.0%; Score 388; DB 2; Length 874;
Best Local Similarity 38.8%; Pred. No. 9.4e-16;
Matches 83; Conservative 38; Mismatches 81; Indels 12; Gaps

Qy 86 VTDVLTDRPPPIILQGPANQTLAVDGTALLKCKATGDPLPVISWLKEGFTFPGRDPRATI 145 

II Ml I I IM I Ml hhl I I lllh:l I I : 
Db 20 VTGNTPAKPPPTIEHGHQNQTLMVGSSAILPCQASGKPTPGISWLRDGLPIDITDSRISQ 79 

Qy 146 QEQGTLQIKNLRISDTGTYTCVATSSSGEASWSAVLDVTE--SGATISKNYDLSDLPGPP 203 

hi I M III IMI : 1 1 : : 1 11 I I : II : I |: I I 
Db 80 HSTGSLHIADLKKPDTGVYTCIAKNEDGESTWSASLTVEDHTSNAQFVRMPDPSNFPSSP 139 

Qy 204 SKPQVTDVTK.NSVTLSWQ - PGTPGTLPASAY 1 1 EAFSQSVSNSWQTVANHVKTTLYTVRG 262 

::| : :H ' I I I III I : 111: :| : :| : ::| :| I ::| 
Db 140 TQPIIVNVTDTEVELHWNAPSTSGAGPITGYIIQYYSPDLGQTWFNIPDYVASTEYRIKG 199 
263 LRPNTIYLFMVRAIN PKVS— VTQIXP 287 

1:1: !:!::!! I I II I! II 
200 LKPSHSYMFVTRAENEKGIGTPSVSSALVTTSKP 233 



RESULT 6
T13924
sdk protein - fruit fly (Drosophila melanogaster)
C;Species: Drosophila melanogaster
C;Date: 20-Sep-1999 #sequence_revision 20-Sep-1999 #text_change 20-Sep-1999
C;Accession: T13924
R;Nguyen, D.N.; Liu, Y.; Litsky, M.L.; Reinke, R.
submitted to the EMBL Data Library, February 1997
A;Description: Sidekick, a member of the immunoglobulin superfamily, is required for pat
A;Reference number: Z17809
A;Accession: T13924
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: mRNA
A;Residues: 1-2222 <NGU>
A;Cross-references: EMBL:U88578; NID:g4099554; PID:g4099555; PIDN:AAD09632.1
C;Genetics:
A;Gene: sdk

Query Match 14.5%; Score 331.5; DB 2; Length 2222;
Best Local Similarity 27.3%; Pred. No. 6.6e-12;
Matches 107; Conservative 58; Mismatches 138; Indels 89; Gaps 15;

Qy 6 GRTVTFPCETKGNPQPAVFWQKEGSQNLLFPNQPQQPNSRCSVSPTGDLTITNIQRSDAG 65 

I: II hi I : I : I :|| : :lll |:||: 
Db 469 GKDATISCRAVGSPNPNITW IYNETQLVDISSRVQILESGDLLISNIRSVDAP 521 



Qy .66 YYICQALTVAGSILAKAQLEVTDVLTDRPPPIILQGPANQTLAVDGTALLKCKATGDP-L 124 

Ill III: 1:1 I I I I |:| I : |: : II 1 : 1 1 : II : 
Db 522 LYICVRANEAGSVKAEAYLSVL-VRTQ I IQPPVDTTVLLGLTATLQCKVSSDPSV 575 

Qy 125 PV-ISWLKEG-FTFPGRDPRATIQEQGTLQIKNLRISDTGTYTCVATSSSGEASWSAVL 181 

I I I :ll I I :| I hi: -\ II hi II II I : :| I 
Db 576 PYNIDWYREGQSSTPISNSQRIGVQADGQLEIQAVRASDVGSYACWTSPGGNETRAARL 635 

Qy 182 DVTESGATISKNYDLSDLPGPPSKPQV— TDVTKNSVTLSWQPGTPGTLPASAYII— 235 

I I II III :| : : I: :ll II I I I Ml 

Db 636 SVIE LPFPPSNVKVERLPEPQQASINVSWTPGFDGNSPISKFIIQRR 682 

Qy 236 EAFSQSVSN- - - SWQTVANHVKT " • TLYTVRGLRPNT IYLFMVRAIN 277 

I I I : :1 I :: : |: hi I I |:| 

Db 683 EVSELEKFVGPVPDPLLNWITELSNVSADQRWILLENLRAATVYQFRVSAVtlRVGEGSPS 742 

I 

278 PKVSVTQXKPQKNNGSTWANVPLPP : PPVQPLPGTELEHYAVE 319 

:| : : ::| II II:: :: I : 

743 EPSNWELPQEAHSG PPVGFVGSARSMSEIITQWQPPLEEHRNGQILGYILR 794 

Qy 320 QQENGYDSDSWCPPLPVQTYLHQGLEDELEED 351 

: II:: I :| : :| : : 

Db 795 YRLFGYNNVPWS YQNITNEAQRN 817 



RESULT 7
S50893
protein-tyrosine-phosphatase (EC 3.1.3.48), receptor type sigma precursor - mouse
C;Species: Mus musculus (house mouse)
C;Date: 01-Aug-1995 #sequence_revision 01-Sep-1995 #text_change 21-Jan-2000
C;Accession: S50893; S40281
R;Wagner, J.; Boerboom, D.; Tremblay, M.L.
Eur. J. Biochem. 226, 773-782, 1994
A;Title: Molecular cloning and tissue-specific RNA processing of a murine receptor-type
A;Reference number: S50893; MUID:95112841
A;Accession: S50893
A;Status: preliminary
A;Molecule type: mRNA
A;Residues: 1-1907 <WAG>
A;Cross-references: EMBL:X82288; NID:g587483; PIDN:CAA57732.1; PID:g587484
R;Hendriks, W.; Brugman, C.; Zeeuwen, P.; Schepens, J.; Wieringa, B.
submitted to the EMBL Data Library, June 1993
A;Description: Assessment of the expression levels of murine protein-tyrosine phospha
A;Reference number: S40280
A;Accession: S40281
A;Molecule type: mRNA
A;Residues: 1441-1501,'E',1503-1546 <HEN>
A;Cross-references: EMBL:Z23050; NID:g438137; PIDN:CAA80585.1; PID:g438138
C;Superfamily: leukocyte antigen-related protein; fibronectin type III repeat homolog
ogy
C;Keywords: glycoprotein; phosphoprotein; phosphoric monoester hydrolase; transmembra
F;149-209/Domain: immunoglobulin homology <IMM1>
F;246-300/Domain: immunoglobulin homology <IMM2>
F;413-506/Domain: fibronectin type III repeat homology <3FR>
F;1288-1907/Domain: leukocyte common antigen cytosolic domain homology <LAC>
F;1375-1596/Domain: protein-tyrosine-phosphatase homology <PTP1>
F;1664-1887/Domain: protein-tyrosine-phosphatase homology <PTP2>
F;1548/Active site: Cys (phosphocysteine intermediate) #status predicted
F;1554/Binding site: substrate phosphate (Arg) #status predicted
F;1839/Active site: Cys (phosphocysteine intermediate) #status predicted
F;1845/Binding site: substrate phosphate (Arg) #status predicted

Query Match 14.3%; Score 326; DB 2; Length 1907;
Best Local Similarity 24.1%; Pred. No. Ue-ll;
Matches    ; Conservative 68; Mismatches 189; Indels 152; Gaps 24;


Qy 


1 


Db 


144 


Qy 


61 


Db 


199 


Qy 


118 


Db 


254 


Qy 


178 


Db 


306 


Qy 


238 


Db 


355 


Qy 


297 


Db 


411 


Qy 


335 


Db 


458 


Qy 


368 


Db 


518 


Qy 


390 


Db 


578 



:| : II hi III I : I |: 



I I I I :| I I : : 
"FLPVDPSASNGRIKQLRSGALQIESSE 1 



hill? II I I : I I I II I 



1:1 hh: I I :: : h hi! || I 



I I: I I: : II I I lh I : I : : I I I I I hi 
- -VIEAVAQIT- - - VKSLPKAPGTPWT ENI ATS ITVTWDSGNPD • ■ PVS YYVI EY 354 



hi :lh 



I I 



II h: II lh I I hi |: 



• • PVQP-- -LPGTELEHYAVEQQENGYDSDSWCPPL 334 

11:1 : I : :l :l I 
EEPVEPNGLIRGYRV-YYTME PEH 457 



-ELEEDD" 

I II: 



■-AISF- 

I : 



--DRVPTPPVRGVASSP- 367 

I : :|l I 



-GQQSTATLTPSPR— EEMQP-- 

I:: I I: |:::| 



■-XFTS" 

II: 



■ - SQRPRPTSP ■ • -FSTDSNTSAALSQSQRPRPTKKHKG 4 3 3 

: : :!::! I I I: III I : II 



RESULT 8
I50212
protein-tyrosine-phosphatase (EC 3.1.3.48) - chicken
C;Species: Gallus gallus (chicken)
C;Date: 13-Sep-1996 #sequence_revision 13-Sep-1996 #text_change 21-Jan-2000
C;Accession: I50212
R;Stoker, A.W.
Mech. Dev. 46, 201-217, 1994
A;Title: Isoforms of a novel cell adhesion molecule-like protein tyrosine phosphatase
A;Reference number: I50212; MUID:95001563
A;Accession: I50212
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: mRNA
A;Residues: 1-1499 <STO>
A;Cross-references: GB:L32780; NID:g485746; PIDN:AAA64460.1; PID:g485747
C;Genetics:
A;Gene: CRYPalpha1
C;Superfamily: leukocyte antigen-related protein; fibronectin type III repeat homology;
ogy
C;Keywords: phosphoprotein; phosphoric monoester hydrolase; tyrosine-specific phosphatas
F;148-208/Domain: immunoglobulin homology <IMM1>
F;245-299/Domain: immunoglobulin homology <IMM2>
F;317-399/Domain: fibronectin type III repeat homology <3FR>
F;881-1499/Domain: leukocyte common antigen cytosolic domain homology <LAC>
F;1257-1479/Domain: protein-tyrosine-phosphatase homology <PTP2>
F;1141/Active site: Cys (phosphocysteine intermediate) #status predicted
F;1147/Binding site: substrate phosphate (Arg) #status predicted
F;1432/Active site: Cys (phosphocysteine intermediate) #status predicted
F;1438/Binding site: substrate phosphate (Arg) #status predicted

Query Match 14.3%; Score 325; DB 2; Length 1499;
Best Local Similarity 25.1%; Pred. No. 1e-11;



Matches 


Qy 


. 1 


Db 


143 


Qy 


61 


Db 


198 


Qy 


118 


Db 


253 


Qy 


178 


Db 


305 


Oy 


238 


Db 


354 


I 


297 




410 


oy 


335 


Db 


457 


oy 


370 


Db 


517 



::| = II I I 1,11 I : I h 



:| I I I I II 



11:11: 



I I I I :| I I : : 
- -FLPVDPSTSNGRIKQLRSGGLQIESSE 197 



11:11 



I II I : : I : I 
-APRFSIL-PVSHEIMPGGNVNITC 252 



I := : I: Mill II I 
• -GRNVLELTDVKDSANYTCVAMSSLG- - - ■ 



ILSDLPGPPSKPQVTDVTKNSVTLSWQPGTPGTLPASAYIIEA 237 
II I I II: I 1:1:: II II 1 : 1 1 



■ -VIEAVAQII - ■ ■ -VKSLPKAPGTPWTETTATSITITWDSGNPD- ■ PVSYYVIEY 353 



: II I:: II II: I IN |: I I : 



P PVQP- - -LPGTELEHYAVEQQENGYDSDSWCPPL 334 

I < Ihl :M:!: | 
•RNVQGRMLSSTTMIIQWEEPVEPNGQIRGYRV-YYTME PDQ 456 



IT: 



■-ELEEDD-- 

1 II: 



--DRVPTPPVRGVASSPAI 369 

I : :N I 



- -PSPREEMQPMLQASPXFTSSQRPR- - 

11:1: ::: I I 



■-PTSPFSTD 412 

II: I: : 



RESULT 9
I58148
protein-tyrosine-phosphatase (EC 3.1.3.48) 2B, splice form LAR - rat
N;Alternate names: leukocyte common antigen-related phosphatase
C;Species: Rattus norvegicus (Norway rat)
C;Date: 26-Jul-1996 #sequence_revision 26-Jul-1996 #text_change 20-Jun-2000
C;Accession: I58148; S46218
R;Walton, K.M.; Martell, K.J.; Kwak, S.P.; Dixon, J.E.; Largent, B.L.
Neuron 11, 387-400, 1993
A;Title: A novel receptor-type protein tyrosine phosphatase is expressed during neurogen
A;Reference number: I58148; MUID:93357030
A;Accession: I58148
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: mRNA
A;Residues: 1-1501 <WAL>
A;Cross-references: GB:L19933; NID:g310242; PIDN:AAA42309.1; PID:g310243
A;Note: in Genbank entry RATTYRPHOS, release 113.0, the source is designated as f.-. ..
R;Zhang, W.R.; Hashimoto, N.; Ahmad, F.; Ding, W.; Goldstein, B.J.
Biochem. J. 302, 39-47, 1994
A;Title: Molecular cloning and expression of a unique receptor-like protein-tyrosine-
A;Reference number: S46216; MUID:94347119
A;Accession: S46218
A;Status: translation not shown
A;Molecule type: mRNA
A;Residues: 1-1501 <ZHA>
A;Cross-references: EMBL:L12329; NID:g294573; PIDN:AAC37657.1; PID:g294574
C;Superfamily: leukocyte antigen-related protein; fibronectin type III repeat homolog
ogy
C;Keywords: alternative splicing; phosphoprotein; phosphoric monoester hydrolase; tyr
F;47-109/Domain: immunoglobulin homology <IMM1>
F;149-209/Domain: immunoglobulin homology <IMM2>
F;246-300/Domain: immunoglobulin homology <IMM3>
F;413-506/Domain: fibronectin type III repeat homology <3FR>
F;882-1501/Domain: leukocyte common antigen cytosolic domain homology <LAC>
F;969-1190/Domain: protein-tyrosine-phosphatase homology <PTP1>
F;1258-1481/Domain: protein-tyrosine-phosphatase homology <PTP2>
F;1142/Active site: Cys (phosphocysteine intermediate) #link PTP1 #status predicted
F;1148/Binding site: substrate phosphate (Arg) #link PTP1 #status predicted
F;1433/Active site: Cys (phosphocysteine intermediate) #link PTP2 #status predicted
F;1439/Binding site: substrate phosphate (Arg) #link PTP2 #status predicted

Query Match 14.2%; Score 323.5; DB 2; Length 1501;
Best Local Similarity 25.4%; Pred. No. 1.2e-11;



Matches 


Qy 


1 


Db 


144 


Qy 


61 


Db 


199 


Qy 


118 


Db 


254 


Qy 


178 


Db 


306 


Qy 


238 


Db 


355 


Qy 


297 


Db 


411 


Qy 


335 


Db 


458 


Qy 


358 


Db 


518 


Qy 


416 


Db 


576 



:| : IN' I III I : I I: 



I I I I :| II : : 
"FLPVDPSASNGRIKQLRSGALQIESSE 198 



:| I I I I. II I I : I I I II I 



I I 1:1 : : I: 



I: Mil II 



■ 305 



I 1: II: : II I I II: I |:|::| II II Ml 
■-VIEAVAQIT — VKSLPKAPGTPWTENTATSITVTWDSGNPD-- PVSYYVIEY 354 



:!,: : II |:: II II: i I |:| Ml 



I I 



-PVQP - - -LPGTELEHYAVEQQENGYDSDSWCPPL 334 

Ihl : I : :| :| I 
SEPVEPNGLIRGYRV-YYTME PEH 457 



-ELEEDD" 

I II: 



■-DRVPTPPVRGVASSP- 367 
I : :ll I 



I I I 



■ -PTSPFSTDS — NT 415 

II: I : II 



RESULT 10
S46217
protein-tyrosine-phosphatase (EC 3.1.3.48) type sigma precursor -
N;Alternate names: leukocyte common antigen-related phosphatase
C;Species: Rattus norvegicus (Norway rat)
C;Date: 07-May-1995 #sequence_revision 03-Nov-1995 #text_change 23-Jul-1999
C;Accession: S46217; S51174; A49104
R;Zhang, W.R.; Hashimoto, N.; Ahmad, F.; Ding, W.; Goldstein, B.J.
Biochem. J. 302, 39-47, 1994
A;Title: Molecular cloning and expression of a unique receptor-like protein-tyrosine-pho
A;Reference number: S46216; MUID:94347119
A;Accession: S46217
A;Status: nucleic acid sequence not shown
A;Molecule type: mRNA
A;Residues: 1-1863 <ZHA>
A;Cross-references: EMBL:L11587
R;Goldstein, B.J.
submitted to the EMBL Data Library, February 1993
A;Reference number: S51174
A;Accession: S51174
A;Molecule type: mRNA
A;Residues: 1-1788,'G',1790-1863 <GOL>
A;Cross-references: EMBL:L11587; NID:g205134; PIDN:AAC37656.1; PID:g205135
R;Yan, H.; Grossman, A.; Wang, H.; D'Eustachio, P.; Mossie, K.; Musacchio, J.M.; Silveni
J. Biol. Chem. 268, 24880-2488^, 1993
A;Title: A novel receptor tyrosine phosphatase-sigma that is highly expressed in the ne
A;Reference number: A49104; MUID:94043351
A;Accession: A49104
A;Status: preliminary; not compared with conceptual translation
A;Molecule type: nucleic acid
A;Residues: 1-596,'R',598-603,
'I',967-1788,'V',1790-1863 <YAN>
A;Experimental source: brain
A;Note: sequence extracted from NCBI backbone (NCBIP:139669)
C;Superfamily: leukocyte antigen-related protein; fibronectin type III repeat homology;
ogy
C;Keywords: alternative splicing; glycoprotein; phosphoprotein; phosphoric monoester hyd
F;1-26/Domain: signal sequence #status predicted <SIG>
F;27-1863/Product: protein-tyrosine-phosphatase #status predicted <MAT>
F;149-209/Domain: immunoglobulin homology <IMM1>
F;246-300/Domain: immunoglobulin homology <IMM2>
F;318-400/Domain: fibronectin type III repeat homology <FN3A>
F;413-499/Domain: fibronectin type III repeat homology <FN3B>
F;511-592/Domain: fibronectin type III repeat homology <FN3C>
F;1244-1863/Domain: leukocyte common antigen cytosolic domain homology <LAC>
F;1331-1552/Domain: protein-tyrosine-phosphatase homology <PTP1>
F;1504/Active site: Cys (phosphocysteine intermediate) #status predicted
F;1510/Binding site: substrate phosphate (Arg) #status predicted
F;1795/Active site: Cys (phosphocysteine intermediate) #status predicted
F;1801/Binding site: substrate phosphate (Arg) #status predicted

Query Match 14.2%; Score 323.5; DB 2; Length 1863;
Best Local Similarity 25.4%; Pred. No. 1.6e-11;
Matches 124; Conservative 62; Mismatches 190; Indels 113; Gaps


I 


i 


QIVAQGRTVTFPCETKGNPQPAVFWQKEGSQNLLFPNQPQQPNSRCSVSPTGDLTITNIQ 
:;| : II 1 1 III 1 : 1 1: 1 1 1 1 :| 1 1 : : 
KWERTRTATMLCAASGNPDPEITWFKD FLPVDPSASNGRIKQLRSGALQIESSE 


60 


Db 


144 


198 


Qy 


61 


RSDAGYYICQALTVAG- - S ILMAQLEVTDVLTDRPPPI ILCjGPANQTLAVDGTALLKC 
:| 1 1 1 1 II 1 1 : 1 1 1 II 1 : : 1 : 1 
ETDQGKYECVATNSAGVRYSSPANLYVRVRRV--APRFSIL--PMSHEIMPGGNVNITC 


117 


Db 


199 


253 


Qy 


118 


KATGDPLPVISWLKEGFTFPGRDPRATIQEQGTLQIKNLRISDTGTYTCVATSSSGEASW 
1 1 1:1 : 1:: 1 1 :: : |: Mill II 1 
VAVGSPMPYVKWMQGAEDLTPEDDMPV- - - -GRNVLELTDVKDSANYTCVAMSSLG- ■ - - 


177 


Db 


254 


305 


Qy 


178 


SAVLDVTESGATISKNYDLSDLPGPPSKPQVTDVTKNSVTLSWQPGTPGTLPASAYIIEA 

11:11: : II 1 1 lh 1 hh:| II II hi 
VIEAVAQIT - - - -VKSLPKAPGTPWTENTATSITVTTOSGNPD-PVSYYVIEY 


237 


Db 


306 


354 


Qy 


238 


FSQSVSNSWQTVANHVKTTLYTVRGLRPNTIYLFMVRAINPKVSVTQXKPQKNNGS-TWA 

hi :| : : II I:: II lh 1 1 hi h 1 1 : 1 
KSKSQDGPYQ-IREDITTTRYSIGGLSPNSEYEIWVSAVN— SIGQGPPSESWTRTGE 


296 


Db 


355 


410 


Qy 


297 


NVPLPPP PVQP - - - LPGTELEHY AVEQQENGYDSDSWCPPL 

1 1 1: : 1 : :| :| 1 


334 



Db 


411 


Qy 


335 


Db 


458 


Qy 


368 


Db 


518 


Qy 


416 


Db 


576 



411 QAPASAPRNVQARMLSATTMIVQWEEPVEPNGLI RGYRV- YYTME - ■ 



-DRVPTPPVRGVASSP-- 367 

I : :|| I 



■ -AISFGQQSTATLTPSPREEMQPMLQASPXFTSSQRPR- - 

I : : !'■ :lhl ::: I II 



--PTSPFSTDS — NT 415 

lh I : II 



RESULT 11
C54689
protein-tyrosine-phosphatase (EC 3.1.3.48), receptor type delta, splice form B precur
N;Alternate names: MPTP delta type B/C
N;Contains: protein-tyrosine-phosphatase, receptor type delta, splice form C
C;Species: Mus musculus (house mouse)
C;Date: 25-Apr-1995 #sequence_revision 19-May-1995 #text_change 12-Feb-1999
C;Accession: C54689; B54689
R;Mizuno, K.; Hasegawa, K.; Katagiri, T.; Ogimoto, M.; Ichikawa, T.; Yakura, H.
Mol. Cell. Biol. 13, 5513-5523, 1993
A;Title: MPTP delta, a putative murine homolog of HPTP delta, is expressed in special
A;Reference number: A54689; MUID:93360986
A;Accession: C54689
A;Status: preliminary
A;Molecule type: mRNA
A;Residues: 1-1894 <MIZ>
A;Experimental source: brain; splice form B
A;Note: sequence inconsistent with nucleotide translation
A;Note: sequence extracted from NCBI backbone (NCBIN:137486, NCBIP:137487)
A;Accession: B54689
A;Status: preliminary
A;Molecule type: mRNA
A;Residues: 1-352,'H',354-535,'S',537-601,1002-1894 <MI2>
A;Experimental source: brain; splice form C
A;Note: sequence inconsistent with nucleotide translation
A;Note: sequence extracted from NCBI backbone (NCBIN:136527, NCBIP:136530)
C;Superfamily: leukocyte antigen-related protein; fibronectin type III repeat homolog
ogy
C;Keywords: alternative splicing; glycoprotein; phosphoprotein; phosphoric monoester
F;45-107/Domain: immunoglobulin homology <IMM1>
F;245-299/Domain: immunoglobulin homology <IMM2>
F;317-399/Domain: fibronectin type III repeat homology <FN3A>
F;1278-1894/Domain: leukocyte common antigen cytosolic domain homology <LAC>
F;1652-1874/Domain: protein-tyrosine-phosphatase homology <PTP2>
F;1536/Active site: Cys (phosphocysteine intermediate) #status predicted
F;1542/Binding site: substrate phosphate (Arg) #status predicted
F;1826/Active site: Cys (phosphocysteine intermediate) #status predicted
F;1832/Binding site: substrate phosphate (Arg) #status predicted

Query Match 13.9%; Score 317; DB 2; Length 1894;
Best Local Similarity 23.4%; Pred. No. 4e-11;
Matches 128; Conservative 63; Mismatches 188; Indels 168; Gaps 22;


Qy 1 QIVAQGRTVTFPCETKGNPQPAVFWQKEGSQNLLFPNQPQQPNSRCS---VSPTGDLTIT 57

 ::| : II 1 1 III 1 : 1 h 1 II III
Db 142 KWERTRTATMLCAASGNPDPEITWFKD FLPVDTSNNNGRIKQLRSESIGALQIE 196

Qy 58 NIQRSDAGYYICQALTVAGS-ILAKAQLEVTDVLTDRPPPIILQGPANQTLAVDGTALLK 116

 : III M 1 lh 1 1 1 1 III 1 1 : h :
Db 197 QSEESDQGKYECVATNSAGTRYSAPANLYVR RVPPRFSIPPTNHEIMPGGSVNIT 251

Qy 117 CKATGDPLPVISWLKEGFTFPGRDPRATIQEQGTLQIKNLRISDTGTYTCVATSSSGEAS 176

 1 1 1 hi : h 1 : h = ::l INN h 1
Db 252 CVAVGSPMPYVKWMLGAEDLTPEDDMPI--GRNVLELNDVR--QSANYTCVAMSTLG--- 304


Qy 


177 


WS AVLDVTESGAT ISKNYDLSDLPGPPSK PQVTDVT KNSVTLS WQ PGT PGTLPASAYIIE 
1 1: |: : II II 1 lh 1 hlh! 1 II 1 1 Mh 


236 
Db


305 


Qy 


237 


Db 


353 


Qy 


278 


Db 


412 


Qy 

Db 


298 
472 


Qy 


330 


Db 


531 
388 


i 

Db 


568 


Qy 


427 


Db 


628 



• -VIEAIAQIT- - - -VKALPKPPGTPWTESTATSITLTWDSGNPG- ■ PVSYYIIQ 352 



II hi II I : I I I hi 



: l| II 



■-STWAN" 

: I 



-VPLPPPPVQPL- 

II I: I 



472 TT IGNLVPQKTYSVKVLAFT 3 IGDGPLSSDIQVITQT(^7PGQPLNFKAEPESETS I - LLS 530 
PVQTYLHQGLEDELEEDDDRVPTPPVIGVASSPAISFGQQSTATLTPSPREEM 



: I I: 
"EliTORDGDQ-- 



■ PGTELEHYAVEQQENGYDSDS 329 

If I I : I I 



387 

I:: h I 
•GEEQRITIEPGTSYRL 567 



tfASPXFTSS- - *QRP ■ RPTSPFSTDSNTSAALSQSQRPR 425 
L II I:i hi 1 I 1 : :l :: I :| 
fPQ 



QGLKPNSLYYFRLSATSPQG JGASTAEISfl ITMQKPSf PQDISCTSPSSTSILVSWQPP 627 



I :| I 



RESULT 12
TDHULK

leukocyte antigen-related protein precursor - human
N;Alternate names: leukocyte common antigen homolog
N;Contains: protein-tyrosine-phosphatase (EC 3.1.3.48)
C;Species: Homo sapiens (man)

C;Date: 31-Dec-1991 #sequence_revision 31-Dec-1991 #text_change 22-Jun-1999
C;Accession: S03841; JL0051
R;Streuli, M.; Krueger, N.X.; Hall, L.R.; Schlossman, S.F.; Saito, H.
J. Exp. Med. 168, 1523-1530, 1988 
A;Title: A new member of the immunoglobulin superfamily that has a cytoplasmic region he
A;Reference number: JL0051; MUID:89035978
A;Accession: S03841

A;Status: nucleic acid sequence not shown
A;Molecule type: mRNA
A;Residues: 1-1897 <STR>

A;Cross-references: EMBL:Y00815; NID:g34266; PIDN:CAA68754.1; PID:g34267

C;Genetics:
A;Gene: GDB:PTPRF; LAR
A;Cross-references: GDB:120138; OMIM:179590
A;Map position: 1p34-1p34
C;Superfamily: leukocyte antigen-related protein; fibronectin type III repeat homology;

ogy 

C;Keywords: glycoprotein; phosphoprotein; phosphoric monoester hydrolase; transmembrane

F;1-16/Domain: signal sequence #status predicted <SIG>

F;17-1897/Product: leukocyte antigen-related protein #status predicted <MAT>

F;17-1250/Domain: extracellular #status predicted <EXT>

F;37-99/Domain: immunoglobulin homology <IMM1>

F;139-199/Domain: immunoglobulin homology <IMM2>

F;236-290/Domain: immunoglobulin homology <IMM3>

F;308-390/Domain: fibronectin type III repeat homology <FN3A>

F;403-489/Domain: fibronectin type III repeat homology <FN3B>

F;501-583/Domain: fibronectin type III repeat homology <FN3C>

F;596-685/Domain: fibronectin type III repeat homology <FN3D>

F;698-798/Domain: fibronectin type III repeat homology #status atypical <FN3E>

F;810-893/Domain: fibronectin type III repeat homology <FN3F>

F;905-989/Domain: fibronectin type III repeat homology <FN3G>

F;1001-1078/Domain: fibronectin type III repeat homology <FN3H>

F;1251-1274/Domain: transmembrane #status predicted <TMM>

F;1275-1897/Domain: intracellular #status predicted <INT>

F;1285-1897/Domain: leukocyte common antigen cytosolic domain homology <LAC>

F;1365-1586/Domain: protein-tyrosine-phosphatase homology <PTP1>

F;1654-1877/Domain: protein-tyrosine-phosphatase homology <PTP2>

F;44-97,146-197,243-288/Disulfide bonds: #status predicted

F;107,240,285,711,956/Binding site: carbohydrate (Asn) (covalent) #status predicted

F;1538/Active site: Cys (phosphocysteine intermediate) #status predicted

F;1544/Binding site: substrate phosphate (Arg) #status predicted
F;1829/Active site: Cys (phosphocysteine intermediate) #status predicted
F;1835/Binding site: substrate phosphate (Arg) #status predicted



Query Match 13.5%; Score 308.5; DB 1; Length 1897;

Best Local Similarity 28.7%; Pred. No. 1.3e-10;



Matches 


Db 


134 


Qy 


61 


Db 


189 


Qy 


120 


Db 


246 


Qy 


173 


Db 


295 


Qy 


233 


Db 


340 


Qy 


292 


Db 


396 


Qy 


334 


Db 


456 



■ Mil l 1111:1 I: 



•I I I I :| I I : : 
• -FLPVDPATSNGRIKQLRSGALQIESSE 188 



II I II I lh I I II 



I I 



I hi 



--KEGFTFPGRDPRATIQEQGTLQIKNLRISDTGTYTCVATSSS 172 

II lh h: I : : Hill II 

[/TKEDEMPVGRN VLELSN- -WRSANYTCVAISSL 294 



V h I :: 
•-MIEATAQVT-' 



: II II lh I lllhl I : I : 
■ -VKALPKPPIDLWTETTATSVTLTWDSG- -NSEPVTY 339 



:| I : III h: II I : I I I hi h : I 



I I II II 



:: II 



--YDSDSWCPP-- 

I II II 



■ - LPVQT YLHQGLEDELEEDDDRVPTPPVR ■ - -GVASSPA 368 

II II : I I II h II : II 



RESULT 13 
A56178 

protein-tyrosine-phosphatase (EC 3.1.3.48), receptor type delta precursor - human 
N;Alternate names: protein-tyrosine-phosphatase BPTP-2
C;Species: Homo sapiens (man)

C;Date: 03-Oct-1995 #sequence_revision 03-Oct-1995 #text_change 21-Jan-2000
C;Accession: A56178; S12052; B44929

R;Pulido, R.; Krueger, N.X.; Serra-Pages, C.; Saito, H.; Streuli, M.
J. Biol. Chem. 270, 6722-6728, 1995 

A;Title: Molecular characterization of the human transmembrane protein-tyrosine phosp

ase delta isoforms.

A;Reference number: A56178; MUID:95204468

A;Accession: A56178

A;Status: preliminary

A;Molecule type: mRNA

A;Residues: 1-1912 <PUL>

A;Cross-references: GB:L38929; NID:g755652; PIDN:AAC41749.1; PID:g755653
R;Krueger, N.X.; Streuli, M.; Saito, H.
EMBO J. 9, 3241-3252, 1990

A;Title: Structural diversity and evolution of human receptor-like protein tyrosine p

A;Reference number: S12049; MUID:91006018

A;Accession: S12052

A;Status: preliminary

A;Molecule type: mRNA

A;Residues: 390-1912 <KRU>

A;Cross-references: GB:X54133; NID:g35789; PIDN:CAA38068.1; PID:g35790
A;Note: the sequence from Fig. 5B is inconsistent with that from Fig. 5A in having 56
R;Adachi, M.; Sekiya, M.; Arimura, Y.; Takekawa, M.; Itoh, F.; Hinoda, Y.; Imai, K.;
Cancer Res. 52, 737-740, 1992

A;Title: Protein-tyrosine phosphatase expression in pre-B cell nalm-6.

A;Reference number: A44929; MUID:92119537

A;Accession: B44929

A;Molecule type: mRNA

A;Residues: 1756-1804,'C',1806-1845 <ADA>

A;Cross-references: GB:S78086; NID:g243545; PIDN:AAB21147.1; PID:g243546
A;Experimental source: pre-B cell NALM-6

A;Note: sequence extracted from NCBI backbone (NCBIN:78086, NCBIP:78087)
A;Note: the authors did not report the entire codon for residue 90
C;Genetics:
A;Gene: GDB:PTPRD

A;Cross-references: GDB:131384; OMIM:601598
A;Map position: 9p24-9p24

C;Superfamily: leukocyte antigen-related protein; fibronectin type III repeat homology;
ogy

C;Keywords: glycoprotein; phosphoprotein; phosphoric monoester hydrolase; transmembrane
F;38-100/Domain: immunoglobulin homology <IMM1>
F;140-209/Domain: immunoglobulin homology <IMM2>
F;250-304/Domain: immunoglobulin homology <IMM3>
F;711-811/Domain: fibronectin type III repeat homology <3FR>
F;1293-1912/Domain: leukocyte common antigen cytosolic domain homology <LAC>
F;1669-1892/Domain: protein-tyrosine-phosphatase homology <PTP2>
F;1553/Active site: Cys (phosphocysteine intermediate) #status predicted
F;1559/Binding site: substrate phosphate (Arg) #status predicted
F;1844/Active site: Cys (phosphocysteine intermediate) #status predicted
F;1850/Binding site: substrate phosphate (Arg) #status predicted



Query Match 13.5%; Score 307.5; DB 2; Length 1912;

Best Local Similarity 24.1%; Pred. No. 1.5e-10;

Matches 130; Conservative 64; Mismatches 205; Indels 141; Gaps



Qy 


1 


Db 


135 


Qy 


52 


Db 


190 


Qy 


110 


Db 


250 


Qy 


170 


Db 


306 


Qy 


230 


Db 


351 


Qy 


278 


Db 

1 


410 
298 


Db 


470 


Qy 


323 


Db 


530 


Qy 


374 



:| : III I III I : I h 



— SVSPT— 51 
1:11, I: I 
■ -FLPTOTSNNNGRIKQLRSESIGGTPIR 189 



III : II I I ll| II: I I I I ::| III i I : 
GALQIEQSEESIJQGKYECVfflNSAGTRrSAPMYVRELREVRRVPPRFSIPPTNHEIMP 



I: : I I I |:| : if 



I I: I I: 
-VIEAIAQIT-- 



11= I Ull:l I I I 
IfiTESTATSIT LTWDSGNPE- -P 350 



I III: 




I II hi II I 



--VSVTQXKPQKNNG-*- STWAN- 297 

t : I :|:: II f : I 

' ~^STT ILVQWKEPEE PNGQIQGy RVYYTMDPTQHVNNWMKH 469 



-VPLPPPPVQPL- 

II I: I 



II II I : I :| :: |: I 



■ - PGTELEHYAVEQQE 322 

III I : I 



■-VRGVASSPAISF--GQ 373 

::|: = I 



:| I II :|: I 
588 RSPQGLGASTAEISARTMQSKPS ' " 



I I I :| :: I :| I :|| 
• -APPQDISCTSPSSTSILVSWQPPPVERQNG 640 



RESULT 14 
S46216 

leukocyte antigen-related protein precursor - rat
N;Alternate names: leukocyte common antigen homolog
N;Contains: protein-tyrosine-phosphatase (EC 3.1.3.48)
C;Species: Rattus norvegicus (Norway rat)

C;Date: 20-Feb-1995 #sequence_revision 20-Feb-1995 #text_change 23-Jul-1999
C;Accession: S46216; S23252; A41032; A33154
R;Zhang, W.R.; Hashimoto, N.; Ahmad, F.; Ding, W.; Goldstein, B.J.
Biochem. J. 302, 39-47, 1994 



A;Title: Molecular cloning and expression of a unique receptor-like protein-tyrosine-

A;Reference number: S46216; MUID:94347119

A;Accession: S46216

A;Status: nucleic acid sequence not shown

A;Molecule type: mRNA

A;Residues: 1-1898 <ZHA>

A;Cross-references: EMBL:L11586; NID:g205132; PIDN:AAC37655.1; PID:g205133
R;Hashimoto, N.; Zhang, W.R.; Goldstein, B.J.
Biochem. J. 284, 569-576, 1992

A;Title: Insulin receptor and epidermal growth factor receptor dephosphorylation by t

A;Reference number: S23126; MUID:92287069

A;Accession: S23252

A;Status: nucleic acid sequence not shown

A;Molecule type: mRNA

A;Residues: 1361-1604,1649-1898 <HAS>

R;Pot, D.A.; Woodford, T.A.; Remboutsika, E.; Haun, R.S.; Dixon, J.E.
J. Biol. Chem. 266, 19688-19696, 1991

A;Title: Cloning, bacterial expression, purification, and characterization of the cyt
A;Reference number: A41032; MUID:92011772
A;Accession: A41032
A;Molecule type: mRNA

A;Residues: 1035-1072,'S',1074-1433,'T',1435-1638,'N',1640-1642,'HT',1645-1898 <POT>

A;Cross-references: GB:M60103; NID:g205130; PIDN:AAA41510.1; PID:g205131

R;Pot, D.A.; Woodford, T.A.; Remboutsika, E.; Haun, R.S.; Dixon, J.E.

submitted to the Protein Sequence Database, December 1990

A;Reference number: A33154

A;Accession: A33154

A;Molecule type: mRNA

A;Residues: 1035-1072,'S',1074-1433,'T',1435-1638,'N',1640-1642,'HT',1645-1898 <PO2>
C;Comment: Only the first of the two domains homologous with protein-tyrosine-phospha
C;Superfamily: leukocyte antigen-related protein; fibronectin type III repeat homolog
ogy 

C;Keywords: duplication; glycoprotein; phosphoprotein; phosphoric monoester hydrolase

F;1-27/Domain: (or 1-26) signal sequence #status predicted <SIG>

F;28-1898/Product: (or 27-1898) leukocyte antigen-related protein #status predicted

F;28-1251/Domain: (or 27-1251) extracellular #status predicted <EXT>

F;47-109/Domain: immunoglobulin homology <IMM1>

F;149-209/Domain: immunoglobulin homology <IMM2>

F;246-300/Domain: immunoglobulin homology <IMM3>

F;318-400/Domain: fibronectin type III repeat homology <FN3A>

F;413-499/Domain: fibronectin type III repeat homology <FN3B>

F;511-593/Domain: fibronectin type III repeat homology <FN3C>

F;606-695/Domain: fibronectin type III repeat homology <FN3D>

F;708-799/Domain: fibronectin type III repeat homology <FN3E>

F;811-895/Domain: fibronectin type III repeat homology <FN3F>

F;906-990/Domain: fibronectin type III repeat homology <FN3G>

F;1002-1079/Domain: fibronectin type III repeat homology <FN3H>

F;1252-1275/Domain: (or 1259-1275) transmembrane #status predicted <TMM>

F;1276-1898/Domain: intracellular #status predicted <INT>

F;1286-1898/Domain: leukocyte common antigen cytosolic domain homology <LAC>

F;1366-1587/Domain: protein-tyrosine-phosphatase homology <PTP1>

F;1655-1878/Domain: protein-tyrosine-phosphatase homology <PTP2>

F;54-107,156-207,253-298/Disulfide bonds: #status predicted

F;117,250,295,721,957/Binding site: carbohydrate (Asn) (covalent) #status predicted

F;1539/Active site: Cys (phosphocysteine intermediate) #status predicted

F;1545/Binding site: substrate phosphate (Arg) #status predicted

F;1830/Active site: Cys (phosphocysteine intermediate) #status predicted

F;1836/Binding site: substrate phosphate (Arg) #status predicted



Query Match 13.4%; Score 305.5; DB 2; Length 1898;
Best Local Similarity 25.0%; Pred. No. 1.9e-10;
Matches 134; Conservative 58; Mismatches 200; Indels 143; Gaps 24;


Qy 1 QIVAQGRTVTFPCETKGNPQPAVFWQKEGSQNLLFPNQPQQPNSRCSVSPTGDLTITNIQ 60

 ::' : M 1 1 III 1 : 1 1: II :| 1 1 : :
Db 144 KWEKARTATMLCAAGGNPDPEISWFKD FLPVDPASSNGRIKQLRSGALQIESSE 198

Qy 61 RSDAGYYICQALTVAGS-ILAKAQLEVTDVLTDRPPPIILQGPANQTLAVDGTALLKCKA 119

 II 1 1 1 1 II: 1 1 1 1 II I::| : 1 1 1 1
Db 199 ESDQGKYECVATNSAGTRYSAPANLYVR---VRRVAPRFSIPPSSQEVMPGGNVNLTCVA 255

Qy 120 TGDPLPVISWL KEGFTFPGRDPRATIQEQGTLQIKNLRISDTGTYTCVATSSS 172
I hi : I: I! !h
256 VGAPMPYVKWMMGAEELTKEDEMPVGRN" 



I:: I : : lllll II 
- -VLELSN- -VMRSANYTCVAISSL 304 



Qy 173 GEASWSAVLDVTESGATISKNYDLSDLPGPPSKPQVTDVTKNSVTLSWQPGTPGTLPASA 232 

I :':!:: : II II II: I lllhl I III 

Db 305 G MIEATAQVT— -VKALPRPPIDLWTETTATSVTLTWDSG--NTEPVSF 349 

Qy 233 YIIEAFSQSVSNSWQTVANHVKTTLYTVRGLRPNTIYLFMVRAINPKVSVTQXKPQKN-N 291

I I: : :| I : I :| |:: II I : I I I 1=1 I: : I : 
Db 350 YGIQYRAAGTDGPFQEV-DGVASTRYSIGGLSPFSEYAFRVLAVN— SIGRGPPSEAVF 405 

Qy 292 GSTWANVPLPPP • PYQP • • • LPGTELEHYAVEQQENG YDSDS W— 330 

I I II 111 I I : :: II I II I 

Db 406 ARTGEQAPSSPPRRVQARMLSASTMLVQWEPPEEPNGLVRGYRVYYTPDSRRPLSAWHKH 465 

Qy 331 CPPLP - VQTYLHQGLEDE LEEDD 352 

II I : i II: II 
Db 466 NTDAGLLTTVGSLLPGITYSLRVLAFTAVGDGPPSPTIQVKTQQGVPAQPADFQAKAESD 525



Qy 353 DRVP---TPPVRGVASSPAISF GQQSTATLTPSPR--EEMQP- 389

I: II : : : III I h |:::| 
Db 526 TRIQLSWLLPPQERIIKYELVYWAAEDEGQQHKVTFDPTSSnLEDLRPDILYHFQLAAR 585 

Qy 390 MLQASPXFTSSQRPRPTSPFSTDSNTSAALSQSQRPRPTKKHKG 433 

: : I I I: : II I I : I II I 
Db 586 SDLGVGVFTPTVEACTAQSTPSAPPQKVTCVSTGSTT - - -VRVSWVPPPADSRNG 637 



RESULT 15 

T43027
neural cell adhesion molecule L1 - goldfish
N;Alternate names: E587 antigen
C;Species: Carassius auratus (goldfish)

C;Date: 11-Jan-2000 #sequence_revision 11-Jan-2000 #text_change 04-Mar-2000
C; Accession: T43027 

R;Giordano, S.; Laessing, U.; Lottspeich, F.; Stuermer, C.A.O. 
submitted to the EMBL Data Library, April 1996 
A;Description: Molecular cloning of goldfish E587 antigen, a cell adhesion molecule expi
A;Reference number: Z22294
A;Accession: T43027

A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: mRNA
A;Residues: 1-1232 <GIO>

A;Cross-references: EMBL:U55211; NID:g1305526; PID:g1305527; PIDN:AAA99159.1
C;Superfamily: neural cell adhesion molecule L1; fibronectin type III repeat homology;
C;Keywords: cell adhesion; membrane protein

Query Match 13.0%; Score 297.5; DB 2; Length 1232;

Best Local Similarity 24.9%; Pred. No. 3.4e-10; 

Matches 111; Conservative 67; Mismatches 184; Indels 83; Gaps 17; 

Qy 5 QGRTVTFPCETKGNPQPAVFWQKEGSQNLLFPNQPQQPNSRCSVSPTGDLTITNIQRSDA 64 

:|:| I I |:||| I II I I I:: I : |:| |::: |: 

Db 409 EGQTVLLQCRTFGSPQPKVDWQITNS GPALANAKMSQTSDGNLQISDVSEEDS 461 

Qy 65 GYYICQALTVAGSILAKAQLEVTDVLTDRPPPI-ILQGPANQTLAVDGTALLKCKATGDP 123 

llllll:: : | | : :|:| |:|:|: | 

Db 462 SMYTCSVSTSNMSISAELWLNRTKIVDPPQDLRVLRG DDAVLQCRYTVDH 512 

Qy 124 L---PVISWLKE-GFTFPGRDPRATIQEQGTLQIKNLRISDTGTYTCVATSSSGEASWS 178

: I I I I: I I : I I : I : I :::: hi hi I 
Db 513 MLKQPTIQWKKDKHKITSSANDDKYTESPDGSLKITDVQMEDSGIYSC EIS 563 

Qy 179 AVLD-VTESGATISKNYDLSDLPGPPSKPQVTDVTKNSVTLSWQPGTPGTLPASAYIIEA 237 

II h ■ : I III :::: : llllll II I I hll 

Db 564 TKLDSVSATGSIV VLDKPGSPHSLELSEKKERSVTLSWMPGAENNSPISEYVIER 618 

Qy 238 FSQS--VSNSWQTVANHVKTTLYTVRGLRPNTIYLFMVRAINPKVSVTQXKPQKNNGSTW 295

: I: : : I : I : I I 1 1 : 1 : : : : I : II 
Db 619 KEKQNPGRGHWEEYRRVPQDITHLEIHLQPYSTYHFRVRGVN-GIGMSEPSPPSESYST- 676 



Qy 296 ANVPLPPPPVQP----LPGTELEHYAVEQQENGYDSDSWCPPLPVQT YLH 341



I I :H : III ::| II I : I 
Db 677 ---PAAKPDMHPENVTSVSTDSNSLVITWQELE QRQFNG PGFKYKIYWR 722

Qy 342 QGLEDELEEDDDRVPTPPVRGVASSPAISFGQQSTATLTPSPREEMQ PMLQA 393 

I : Mill: hll:: | |:| 

Db 723 QEGDSHWMESSASNPPFIVEGPGTFIPFQIKVQAVNELGAGPEPDAEIGYSGEDLP-LEA 781 

Qy 394 SPXFTSSQRPRPT SPFSTDS 413 

I: :':| II II I 
Db 782 PSSVAVSELNKTTVLVKWSPVSTKS 806 



Search completed: January 22, 2001, 12:27:28 
Job time: 2125 sec 
GenCore version 4.5
Copyright (c) 1993 - 2000 Compugen Ltd.

OM protein - protein search, using sw model

Run on: January 22, 2001, 12:29:42 ; Search time 162.41 Seconds
 (without alignments)
 86.298 Million cell updates/sec

Title: US-09-540-245A-19
Perfect score: 2280
Sequence: 1 QIVAQGRTVTFPCETKGNPQ TSAALSQSQRPRPTKKHKGG 434

Scoring table: BLOSUM62
 Gapop 10.0 , Gapext 0.5

Searched: 88757 seqs, 32294092 residues

Total number of hits satisfying chosen parameters:

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
 Maximum Match 100%
 Listing first 45 summaries

Database : SwissProt_39:*



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution.

SUMMARIES

Result         Query
   No.  Score  Match  Length  DB  ID           Description

     1  309.5   13.6     837   1  NCM2_HUMAN   O15394 homo sapien
     2  308.5   13.5    1897   1  PTPF_HUMAN   P10586 homo sapien
     3  307.5   13.5    1912   1  PTPD_HUMAN   P23468 homo sapien
     4  295.5   13.0    2029   1  LAR_DROME    P16621 drosophila
     5  292.5   12.8     837   1  NCM2_MOUSE   O35136 mus musculu
     6  284.5   12.5    1377   1  NEO1_RAT     P97603 rattus norv
     7    282   12.4    1443   1  NEO1_CHICK   Q90610 gallus gall
     8    279   12.2    1493   1  NEO1_MOUSE   P97798 mus musculu
     9    268   11.8    1257   1  CAML_HUMAN   P32004 homo sapien
    10  267.5   11.7    1461   1  NEO1_HUMAN   Q92859 homo sapien
    11  265.5   11.6    1260   1  CAML_MOUSE   P11627 mus musculu
    12  261.5   11.5    1259   1  CAML_RAT     Q05695 rattus norv
    13    259   11.4    1447   1  DCC_MOUSE    P70211 mus musculu
    14    258   11.3    1239   1  NRG_DROME    P20241 drosophila
    15    256   11.2    1040   1  AXO1_HUMAN   Q02246 homo sapien
    16  251.5   11.0    1036   1  AXO1_CHICK   P28685 gallus gall
    17    251   11.0     761   1  NCA2_HUMAN   P13592 homo sapien
    18  250.5   11.0    1284   1  NRCA_CHICK   P35331 gallus gall
    19    249   10.9    1447   1  DCC_HUMAN    P43146 homo sapien
    20    248   10.9     725   1  NCA2_MOUSE   P13594 mus musculu
    21  247.5   10.9    1018   1  CONT_HUMAN   Q12860 homo sapien
    22  246.5   10.8    1115   1  NCA1_MOUSE   P13595 mus musculu
    23    245   10.7    1040   1  AXO1_RAT     P22063 rattus norv
    24    244   10.7    1010   1  CONT_CHICK   P14781 gallus gall
    25    243   10.7    2012   1  DSCA_HUMAN   O60469 homo sapien
    26  241.5   10.6    1020   1  CONT_MOUSE   P12960 mus musculu
    27  238.5   10.5     853   1  NCA1_BOVIN   P31836 bos taurus
    28  238.5   10.5     858   1  NCA1_RAT     P13596 rattus norv
    29  236.5   10.4     848   1  NCA1_HUMAN   P13591 homo sapien
    30  235.5   10.3    1906   1  KMLS_CHICK   P11799 gallus gall
    31    223    9.8    1266   1  NGCA_CHICK   Q03696 gallus gall
    32  222.5    9.8     898   1  FAS2_SCHAM   P22648 schistocerc
    33  222.5    9.8    1142   1  MYPF_HUMAN   Q14324 homo sapien
    34    222    9.7    1913   1  KMLS_HUMAN   Q15746 homo sapien
    35    220    9.6    1051   1  PTK7_CHICK   Q91048 gallus gall
    36    218    9.6     811   1  FS22_DROME   P34083 drosophila
    37    218    9.6     873   1  FS21_DROME   P34082 drosophila
    38    215    9.4    1091   1  NCA1_CHICK   P13590 gallus gall
    39    215    9.4    4393   1  PGBM_HUMAN   P98160 homo sapien
    40  214.5    9.4    1070   1  PTK7_HUMAN   Q13308 homo sapien
    41  213.5    9.4     819   1  FGR1_CHICK   P21804 gallus gall
    42  210.5    9.2     822   1  FGR1_HUMAN   P11362 homo sapien
    43  210.5    9.2     822   1  FGR1_MOUSE   P16092 mus musculu
    44  210.5    9.2     822   1  FGR1_RAT     Q04589 rattus norv
    45  210.5    9.2    3707   1  PGBM_MOUSE   Q05793 mus musculu



RESULT 1
NCM2_HUMAN
ID NCM2_HUMAN STANDARD; PRT; 837 AA.
AC O15394;
DT 15-JUL-1998 (Rel. 36, Created)
DT 15-JUL-1998 (Rel. 36, Last sequence update)
DT 15-JUL-1998 (Rel. 36, Last annotation update)
DE NEURAL CELL ADHESION MOLECULE 2 PRECURSOR (N-CAM 2).
GN NCAM2 OR NCAM21.
OS Homo sapiens (Human).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
RN [1]
RP SEQUENCE FROM N.A.
RC TISSUE=BRAIN;
RX MEDLINE=97369930; PubMed=9226371;
RA Paoloni-Giacobino A., Chen H., Antonarakis S.E.;
RT "Cloning of a novel human neural cell adhesion molecule gene (NCAM2)
RT that maps to chromosome region 21q21 and is potentially involved in
RT Down syndrome.";
RL Genomics 43:43-51(1997).
CC -!- FUNCTION: MAY PLAY IMPORTANT ROLES IN SELECTIVE FASCICULATION AND
CC     ZONE-TO-ZONE PROJECTION OF THE PRIMARY OLFACTORY AXONS.
CC -!- SUBCELLULAR LOCATION: TYPE I MEMBRANE PROTEIN.
CC -!- TISSUE SPECIFICITY: EXPRESSED MOST STRONGLY IN ADULT AND FETAL
CC     BRAIN.
CC -!- SIMILARITY: CONTAINS 5 IMMUNOGLOBULIN-LIKE C2-TYPE DOMAINS.
CC -!- SIMILARITY: CONTAINS 2 FIBRONECTIN TYPE III-LIKE DOMAINS.
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).

1; -. 



U75330; 

DR MIM; 602040; -.

DR INTERPRO; IPR001777; -.

DR INTERPRO; IPR003006; -.

DR PFAM; PF00041; fn3; 2.

DR PFAM; PF00047; ig; 5.

KW Cell adhesion; Transmembrane; Glycoprotein; Repeat;

KW Immunoglobulin domain; Signal.



FT SIGNAL        1     19   POTENTIAL.
FT CHAIN        20    837   NEURAL CELL ADHESION MOLECULE 2.
FT DOMAIN       20    697   EXTRACELLULAR (POTENTIAL).
FT TRANSMEM    698    718   POTENTIAL.
FT DOMAIN      719    837   CYTOPLASMIC (POTENTIAL).
FT DOMAIN       35    100   IG-LIKE C2-TYPE DOMAIN.
FT DOMAIN      129    193   IG-LIKE C2-TYPE DOMAIN.
FT DOMAIN      225    288   IG-LIKE C2-TYPE DOMAIN.
FT DOMAIN      315    387   IG-LIKE C2-TYPE DOMAIN.
FT DOMAIN      415    481   IG-LIKE C2-TYPE DOMAIN.
FT DOMAIN      482    581   FIBRONECTIN TYPE-III.
FT DOMAIN      594    678   FIBRONECTIN TYPE-III.
FT DISULFID     42     93   PROBABLE.
FT DISULFID    136    186   PROBABLE.
FT DISULFID    232    281   PROBABLE.
FT DISULFID    322    380   PROBABLE.
FT DISULFID    422    475   PROBABLE.
FT CARBOHYD    177    177   N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD    219    219   N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD    309    309   N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD    406    406   N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD    419    419   N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD    445    445   N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD    474    474   N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD    562    562   N-LINKED (GLCNAC...) (POTENTIAL).
SQ SEQUENCE 837 AA; 92932 MW; C3D034106C5741C1 CRC64;



Query Match 13.6%; Score 309.5; DB 1; Length 837;
Best Local Similarity 22.8%; Pred. No. 1.1e-11;
Matches 106; Conservative 69; Mismatches 139; Indels 151; Gaps 17;

Qy 5 QGRTVTFPCETKGNPQPAVFWQKEGSQNLLFPNQP---QQPNSRCSVSPTGDLTITNIQR 61

 : ill I |:|:||: I: h h II: II
Db 224 RGEEMTFSCRASGSPEPAISWFRNG-KLIEENEKYILKGSNT---------ELTVRNIIN 273



Qy 62 SDAGYYICQALTVAGSILAKAQLEVTDVLTDRPPPIILQGPANQTLAVDGTALLKCKATG 121 

II I 1:1:1 II :M: ||:| |:| :| III 

Db 274 SDGGPYVCRATNKAGEDEKQAFLQVF VQPHI IQ - LKNETTYENGQVTLVCDAEG 326 

Qy 122 DPLPVISWLK- -EGFTFPGRD- -PRATIQEQG TLQIKNLRISDTGTYTCVATSSS 172 

:hl hi : :llll I I: i| :| ||::::| :| I II I 
Db 327 EPIPEITWKRAVDGFTFTEGDKSPDGRIEVKGQHGSSSLHIKDVKLSGSGRYDCEAASRI 386 

Qy 173 GEASWSAVLDVTESGATIS-j -. 191 

I I II: : II 

Db 387 GGHQKSMYLDIEYAPKFISNQTIYYSWEGNPINISCDVKSNPPASIHWRRDKLVLPAKNT 446 

Qy 192 — KNYD j J LSDLPG PPSKPQVTD 210 

I I I I:|:| I :: : 

Db 447 TNLKTYSTGRKMILEIAPTSDNDFGRYNCTATNHIGTRFQEYILALADVPSSPYGVKIIE 506 

Qy 211 VTKNSVTLSW-QPGTPGTLPASAYIIEAFSQSVSNSWQTVANHVKTTLYTVRGLRPNTIY 269



:|: :l : I :| 



I :: 



S. 



I I: I :| I: 



I III I 



Db 507 LSQTTAKVSFNKPDSHGGVPIHHYQVDV-KEVASEIWKIVRSHGVQTMWLNNLEPNTTY 565 

Qy 270 LFMVRAIN PKVSVTQXKPQKNNGSTWANVPLPPPPV— QPLPGTELEHYAVE 319 

I 1:1 I: : I I : I II II I : : 

Db 566 EIRVAAVNGKGQGDYSKIEIFQTLPVRE PSPPSIHGQPSSGKSFKLSITK 615 



Qy 320 QQENGYDSDSWCPPLPVQTYL HQGLEDELEEDDDRV 355

 1:1 1:1: I II ::: : I :

Db 616 QDDGG APILEYIVKYRSKDKEDQWLEKKVQGNKDHI 651



RESULT 2
PTPF_HUMAN
ID PTPF_HUMAN STANDARD; PRT; 1897 AA.
AC P10586;
DT 01-JUL-1989 (Rel. 11, Created)
DT 01-JUL-1989 (Rel. 11, Last sequence update)
DT 01-OCT-2000 (Rel. 40, Last annotation update)
DE LAR PROTEIN PRECURSOR (LEUKOCYTE ANTIGEN RELATED) (EC 3.1.3.48).
GN PTPRF OR LAR.
OS Homo sapiens (Human).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
RN [1]
RP SEQUENCE FROM N.A.
RC TISSUE=TONSIL;
RX MEDLINE=89035978; PubMed=2972792;
RA Streuli M., Krueger N.X., Hall L.R., Schlossman S.F., Saito H.;
RT "A new member of the immunoglobulin superfamily that has a
RT cytoplasmic region homologous to the leukocyte common antigen.";
RL J. Exp. Med. 168:1523-1530(1988).
RN [2]
RP MUTAGENESIS.
RX MEDLINE=90046860; PubMed=2554325;
RA Streuli M., Krueger N.X., Tsai A.Y.M., Saito H.;
RT "A family of receptor-linked protein tyrosine phosphatases in humans
RT and Drosophila.";
RL Proc. Natl. Acad. Sci. U.S.A. 86:8698-8702(1989).
RN [3]
RP MUTAGENESIS.
RX MEDLINE=90316093; PubMed=1695146;
RA Streuli M., Krueger N.X., Thai T., Tang M., Saito H.;
RT "Distinct functional roles of the two intracellular phosphatase like
RT domains of the receptor-linked protein tyrosine phosphatases LCA and
RT LAR.";
RL EMBO J. 9:2399-2407(1990).
CC -!- FUNCTION: IT IS POSSIBLE THAT DLAR IS A CELL ADHESION RECEPTOR.
CC     IT POSSESSES AN INTRINSIC PROTEIN TYROSINE PHOSPHATASE ACTIVITY
CC     (PTPASE).
CC -!- FUNCTION: THE FIRST PTPASE DOMAIN HAS ENZYMATIC ACTIVITY, WHILE
CC     THE SECOND ONE SEEMS TO AFFECT THE SUBSTRATE SPECIFICITY OF THE
CC     FIRST ONE.
CC -!- CATALYTIC ACTIVITY: PROTEIN TYROSINE PHOSPHATE + H(2)O =
CC     PROTEIN TYROSINE + ORTHOPHOSPHATE.
CC -!- SUBCELLULAR LOCATION: TYPE I MEMBRANE PROTEIN.
CC -!- SIMILARITY: CONTAINS 3 IMMUNOGLOBULIN-LIKE C2-TYPE DOMAINS.
CC -!- SIMILARITY: CONTAINS 8 FIBRONECTIN TYPE III-LIKE DOMAINS.
CC -!- SIMILARITY: CONTAINS 2 PROTEIN-TYROSINE PHOSPHATASE DOMAINS.
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
DR EMBL; Y00815; CAA68754.1; -.
DR PIR; S03841; TDHULK.
DR HSSP; P18052; 1YFO.
DR MIM; 179590; -.
DR INTERPRO; IPR000242; -.
DR INTERPRO; IPR000387; -.
DR INTERPRO; IPR001777; -.
DR INTERPRO; IPR003006; -.
DR PFAM; PF00102; Y_phosphatase; 2.
DR PFAM; PF00041; fn3; 7.
DR PFAM; PF00047; ig; 3.
DR PRINTS; PR00014; FNTYPEIII.
DR PRINTS; PR00700; PRTYPHPHTASE.
DR PROSITE; PS00383; TYR_PHOSPHATASE_1; 2.
DR PROSITE; PS50056; TYR_PHOSPHATASE_2; 2.
DR PROSITE; PS50055; TYR_PHOSPHATASE_PTP; 2.
KW Hydrolase; Receptor; Glycoprotein; Signal; Transmembrane;
KW Cell adhesion; Immunoglobulin domain; Duplication.
FT SIGNAL        1     16   POTENTIAL.
FT CHAIN        17   1897   LAR PROTEIN.
FT DOMAIN       17   1250   EXTRACELLULAR (POTENTIAL).
FT TRANSMEM   1251   1274   POTENTIAL.
FT DOMAIN     1275   1897   CYTOPLASMIC (POTENTIAL).
FT DOMAIN     1360   1606   PROTEIN-TYROSINE PHOSPHATASE.
FT DOMAIN     1649   1897   PROTEIN-TYROSINE PHOSPHATASE.
FT ACT_SITE   1538   1538   BY SIMILARITY.
FT ACT_SITE   1829   1829   BY SIMILARITY.
FT MUTAGEN    1538   1538   C->S: LOSS OF ACTIVITY.
FT CARBOHYD    107    107   N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD    240    240   N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD    285    285   N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD    711    711   N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD    956    956   N-LINKED (GLCNAC...) (POTENTIAL).
SQ SEQUENCE 1897 AA; 211844 MW; 439850F1D5C031FF CRC64;
Query Match 13.5%; Score 308.5; DB 1; Length 1897;

Best Local Similarity 28.7%; Pred. No. 3.4e-ll; 

Matches 118; Conservative 47; Mismatches 165; Indels 81; Gaps 17; 

Qy 1 QIVAQGRTVTFPCETKGNPQPAVFWQKEGSQNLLFPNQPQQPNSRCSVSPTGDLTITNIQ 60

::| : I! M III I : I I: I I I I :| I I : : 
Db 134 KWEKARTATMLCAAGGNPDPEISWFKD FLPVDPATSNGRIKQLRSGALQIESSE 188 

Qy 61 RSDAGYYICQALTVAGS-ILAKAQLEVTDVLTDRPPPIILQGPANQTLAVDGTALLKCKA 119

I! I I I I II: MM II Ml : hill 
Db 189 ESDQGKYECVATNSAGTRYSAPANLYVR- - -VRRVAPRFSIPPSSQEVMPGGSVNLTCVA 245 

Qy 120 TGDPLPVISWL KEGFTFPGRDPRATIQEQGTLQIKNLRISDTGTYTCVATSSS 172 

I 1:1 : I: II II: I:: I : : Mil II 

Db 246 VGAPMPYVKWMMGAEELTKEDEMPVGRN VLELSN- -WRSANYTCVAISSL 294 

Qy 173 GEASWSAVLDVTESGATISKNYDLSDLPGPPSKPQVTDVTKNSVTLSWQPGTPGTLPASA 232 

• I : I: M: : II II II: I lllhl I : I : 

Db 295 G MIEATAQVT---VKALPKPPIDLWTETTATSVTLTWDSG--NSEPVTY 339

Qy 233 YIIEAFSQSVSNSWQTVANHVKTTLYTVRGLRPNTIYLFMVRAINPKVSVTQXKPQKN-N 291 

II:: :| M MM:: II M I I 11:1 |: : i : 
Db 340 YGIQYRAAGTEGPFQEV-DGVATTRYSIGGLSPFSEYAFRVLAVN--SIGRGPPSEAVR 395

Qy 292 GSTWANVPLPPP-PVQP—LPGTELEHYAVEQQENG YDSDSWCPP 333 

I I II II II : :: II I II II 

Db 396 ARTGEQAPSSPPRRVQARMLSASTMLVQWEPPEEPNGEVRGYRVYYTPDSRRPPNAWHKH 455 



Qy 334 LPVQT YLHQGLEDELEEDDDRVPTPPVR---GVASSPA 368
 MM I II I: II : II
Db 456 NTDAGLLTTVGSLLPGITYSLRVLAFTAVGDGPPSPTIQVKTQQGVPAQPA 506

RESULT 3
PTPD_HUMAN
ID PTPD_HUMAN STANDARD; PRT; 1912 AA.
AC P23468;
DT 01-NOV-1991 (Rel. 20, Created)
DT 01-OCT-1996 (Rel. 34, Last sequence update)
DT 01-OCT-2000 (Rel. 40, Last annotation update)
DE PROTEIN-TYROSINE PHOSPHATASE DELTA PRECURSOR (EC 3.1.3.48) (R-PTP-
DE DELTA).
GN PTPRD.
OS Homo sapiens (Human).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
RN [1]
RP SEQUENCE FROM N.A.
RX MEDLINE=95204468; PubMed=7896816;
RA Pulido R., Krueger N.X., Serra-Pages C., Saito H., Streuli M.;
RT "Molecular characterization of the human transmembrane
RT protein-tyrosine phosphatase delta. Evidence for tissue-specific
RT expression of alternative human transmembrane protein-tyrosine
RT phosphatase delta isoforms.";
RL J. Biol. Chem. 270:6722-6728(1995).
RN [2]
RP SEQUENCE OF 390-1912 FROM N.A.
RC TISSUE=PLACENTA;
RX MEDLINE=91006018; PubMed=2170109;
RA Krueger N.X., Streuli M., Saito H.;
RT "Structural diversity and evolution of human receptor-like protein
RT tyrosine phosphatases.";
RL EMBO J. 9:3241-3252(1990).
CC -!- CATALYTIC ACTIVITY: PROTEIN TYROSINE PHOSPHATE + H(2)O =
CC     PROTEIN TYROSINE + ORTHOPHOSPHATE.
CC -!- SUBCELLULAR LOCATION: TYPE I MEMBRANE PROTEIN.
CC -!- ALTERNATIVE PRODUCTS: DIFFERENT ISOFORMS ARE FOUND IN DIFFERENT
CC     TISSUES DUE TO ALTERNATIVE SPLICING.
CC -!- PTM: A CLEAVAGE OCCURS THAT SEPARATES THE EXTRACELLULAR DOMAIN
CC     FROM THE TRANSMEMBRANE SEGMENT.
CC -!- SIMILARITY: CONTAINS 3 IMMUNOGLOBULIN-LIKE C2-TYPE DOMAINS.
CC -!- SIMILARITY: CONTAINS 8 FIBRONECTIN TYPE III-LIKE DOMAINS.
CC -!- SIMILARITY: CONTAINS 2 PROTEIN-TYROSINE PHOSPHATASE DOMAINS.



CC -
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC -
DR EMBL; L38929; AAC41749.1; -.
DR EMBL; X54133; CAA38068.1; -.
DR PIR; S12052; S12052.
DR HSSP; P18052; 1YFO.
DR MIM; 601598; -.
DR INTERPRO; IPR000242; -.
DR INTERPRO; IPR000387; -.
DR INTERPRO; IPR001777; -.
DR INTERPRO; IPR003006; -.
DR PFAM; PF00102; Y_phosphatase; 2.
DR PFAM; PF00041; fn3;
DR PFAM; PF00047; ig; 3.
DR PRINTS; PR00014; FNTYPEIII.
DR PRINTS; PR00700; PRTYPHPHTASE.
DR PROSITE; PS00383; TYR_PHOSPHATASE_1; 2.
DR PROSITE; PS50056; TYR_PHOSPHATASE_2; 2.
DR PROSITE; PS50055; TYR_PHOSPHATASE_PTP; 2.
KW Hydrolase; Receptor; Glycoprotein; Signal; Transmembrane; Duplication;
KW Immunoglobulin domain; Alternative splicing.



FT   SIGNAL        1     20       POTENTIAL.
FT   CHAIN        21   1912       PROTEIN-TYROSINE PHOSPHATASE DELTA.
FT   DOMAIN       21   1265       EXTRACELLULAR (POTENTIAL).
FT   TRANSMEM   1266   1290       POTENTIAL.
FT   DOMAIN     1291   1912       CYTOPLASMIC (POTENTIAL).
FT   DOMAIN       23    115       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      118    225       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      232    318       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      320    414       FIBRONECTIN TYPE-III.
FT   DOMAIN      417    513       FIBRONECTIN TYPE-III.
FT   DOMAIN      516    606       FIBRONECTIN TYPE-III.
FT   DOMAIN      609    708       FIBRONECTIN TYPE-III.
FT   DOMAIN      711    822       FIBRONECTIN TYPE-III.
FT   DOMAIN      825    916       FIBRONECTIN TYPE-III.
FT   DOMAIN      918   1017       FIBRONECTIN TYPE-III.
FT   DOMAIN     1020   1137       FIBRONECTIN TYPE-III.
FT   DOMAIN     1375   1618       PROTEIN-TYROSINE PHOSPHATASE.
FT   DOMAIN     1619   1912       PROTEIN-TYROSINE PHOSPHATASE.
FT   ACT_SITE   1553   1553       BY SIMILARITY.
FT   ACT_SITE   1844   1844       BY SIMILARITY.
FT   SITE       1175   1178       CLEAVAGE (POTENTIAL).
FT   CARBOHYD    254    254       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    299    299       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    724    724       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    832    832       N-LINKED (GLCNAC...) (POTENTIAL).
FT   VARSPLIC    181    189       MISSING (IN KIDNEY ISOFORM).
FT   VARSPLIC    226    229       MISSING (IN KIDNEY ISOFORM).
FT   VARSPLIC    775    783       MISSING (IN KIDNEY ISOFORM).
FT   VARSPLIC    609   1137       MISSING (IN FETAL BRAIN ISOFORM).
FT   MUTAGEN    1178   1178       R->A: 2.5-FOLD REDUCTION IN CLEAVAGE.
SQ   SEQUENCE   1912 AA;  214759 MW;  3AE8CBCD32182E26 CRC64;



Query Match 13.5%; Score 307.5; DB 1; Length 1912;
Best Local Similarity 24.1%; Pred. No. 4e-11;

Matches 130; Conservative 64; Mismatches 205; Indels 141; Gaps 22; 



Qy 1 QIVAQGRTVTFPCETKGNPQPAVFWQKEGSQNLLFPNQPQQPNSRC SVSPT--- 51

vr. ,::| : II I; I III M I I: I II I: I
Db 135 KWERTRTATMLCAASGNPDPEITWFKD FLPVDTSNNNGRIKQLRSESIGGTPIR



Qy 52 GDLTITNIQRSDAGYYICQALTVAGS-ILAKAQLEVTDVL-TDRPPPIILQGPANQTLAV 109

III : 111 I I I II: MIM: I I : 

Db 190 GALQIEQSEESDQGKYECVATNSAGTRYSAPANLYVRELREVRRVPPRFSIPPTNHEIMP 249 



Mon Jan 22 13:04:48 2001 



us-09-540-245a-19.rsp 



Page 4 



i 



Qy 110 DGTALLKCKATGDPLPVISWLKEGFTFPGRDPRATIQEQGTLQIKNLRISDTGTYTCVAT 169
I: : I I I 1:1 : I: I : I:: ::| : Mil!
Db 250 GGSVNITCVAVGSPMPYVKWMLGAEDLTPEDDMPI--GRNVLELKDVR--QSANYTCVAM 305

Qy 170 SSSGEASWSAVLDVTESGATISKNYDLSDLPGPPSKPQVTDVTKNSVTLSWQPGTPGTLP 229
I: I I h I I: : II II I lh I MM I I I
Db 306 STLG VIEAIAQIT--- VKALPKPPGTPWTESTATSITLTWDSGNPE-P 350

Qy 230 ASAYIIEAFSQSVSNSWQTVANHVKTTLYTVRGLRPNTIYLFMVRAIN 277
I III: :: : : : : I 1 1 I : I 1 1 I : I I I M
Db 351 VS YYI IQHKPKNSEELYKEI -D^VATTRYSVAGLSPYSDYEFRVVAVNNIGRGPPSEPVL 409

Qy 278 PK VSVTQXKPQKNNG STWAN- 297
I: : I :|:: II : I
Db 410 TQTSEQAPSSAPRDVQAMSSTTILVQWKEPEEPNGQIQGYRVYYTMDPTQHVNNWMKH 469

Qy 298 VPLPPPPVQPL PGTELEHYAVEQQE 322
II I: I IIIIM
Db 470 NVADSQITTIGNLVPQKTYSVKVLAFTSIGDGPLSSDIQVITQTGVPGQPLNFKAEPESE 529

Qy 323 NGYDSDSWCPPLPVQTYLHQGL--EDELEEDDDRVPTPP VRGVASSPAISF--GQ 373
II II I : I :l :: I: I ::|: : I
Db 530 TS I -LLSWTPPRS - DTIANTELVYKDGEHGEEQRITIEP6TS YRLQGLKPNSLYYFRIAA 587

Qy 374 QSTATLTPSPREEMQPMLQASPXFTSSQRPRPTSPFSTDSNTSAALSQSQRPRPTKKHKG 433
:| I I I :|: I I I I : I : : I : I I : I I
Db 588 RSPQGLGASTAEISARTMOSKPS APPQDISCTSPSSTSILVSWQPPPVEKQNG 640



RESULT 4 
LAR_DROME
ID LAR_DROME STANDARD; PRT; 2029 AA.
AC P16621;
DT 01-AUG-1990 (Rel. 15, Created)
DT 01-AUG-1990 (Rel. 15, Last sequence update)
DT 01-OCT-2000 (Rel. 40, Last annotation update)
DE PROTEIN-TYROSINE PHOSPHATASE DLAR PRECURSOR (EC 3.1.3.48) (PROTEIN-
DE TYROSINE-PHOSPHATE PHOSPHOHYDROLASE).
GN LAR.
OS Drosophila melanogaster (Fruit fly).
OC Eukaryota; Metazoa; Arthropoda; Tracheata; Hexapoda; Insecta;
OC Pterygota; Neoptera; Endopterygota; Diptera; Brachycera; Muscomorpha;
OC Ephydroidea; Drosophilidae; Drosophila.
RN [1]
RP SEQUENCE FROM N.A.
RX MEDLINE=90046860; PubMed=2554325;
RA Streuli M., Krueger N.X., Tsai A.Y.M., Saito H.;
RT "A family of receptor-linked protein tyrosine phosphatases in humans
RT and Drosophila.";
RL Proc. Natl. Acad. Sci. U.S.A. 86:8698-8702(1989).
RN [2]
RP SEQUENCE FROM N.A.
RC STRAIN=CANTON-S;
RX MEDLINE=96178473; PubMed=8598047;
RA Krueger N.X., van Vactor D., Wan H.I., Gelbart W.M., Goodman C.S.,
RA Saito H.;
RT "The transmembrane tyrosine phosphatase DLAR controls motor axon
RT guidance in Drosophila.";
RL Cell 84:611-622(1996).
CC -!- FUNCTION: IT IS POSSIBLE THAT DLAR IS A CELL ADHESION RECEPTOR.
CC IT POSSESSES AN INTRINSIC PROTEIN TYROSINE PHOSPHATASE ACTIVITY
CC (PTPASE). IT CONTROLS MOTOR AXON GUIDANCE.
CC -!- CATALYTIC ACTIVITY: PROTEIN TYROSINE PHOSPHATE + H(2)O =
CC PROTEIN TYROSINE + ORTHOPHOSPHATE.
CC -!- SUBCELLULAR LOCATION: TYPE I MEMBRANE PROTEIN.
CC -!- TISSUE SPECIFICITY: SELECTIVELY EXPRESSED IN A SUBSET OF AXONS AND
CC PIONEER NEURONS IN THE EMBRYO.
CC -!- SIMILARITY: CONTAINS 3 IMMUNOGLOBULIN-LIKE C2-TYPE DOMAINS.
CC -!- SIMILARITY: CONTAINS 16 FIBRONECTIN TYPE III-LIKE DOMAINS.
CC -!- SIMILARITY: CONTAINS 2 PROTEIN-TYROSINE PHOSPHATASE DOMAINS.

CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC -



DR EMBL; M27700; AAA28668.1; -.
DR EMBL; U36857; AAC47002.1; -.
DR EMBL; U36849; AAC47002.1; JOINED.
DR EMBL; U36850; AAC47002.1; JOINED.
DR EMBL; U36851; AAC47002.1; JOINED.
DR EMBL; U36852; AAC47002.1; JOINED.
DR EMBL; U36853; AAC47002.1; JOINED.
DR EMBL; U36854; AAC47002.1; JOINED.
DR EMBL; U36855; AAC47002.1; JOINED.
DR EMBL; U36856; AAC47002.1; JOINED.
DR PIR; A36182; TDFFLK.
DR HSSP; P28827; 1RPM.
DR FLYBASE; FBgn0000464; Lar.
DR INTERPRO; IPR000242; -.
DR INTERPRO; IPR000387; -.
DR INTERPRO; IPR001777; -.
DR INTERPRO; IPR003006; -.
DR PFAM; PF00102; Y_phosphatase; 2.
DR PFAM; PF00041; fn3; 9.
DR PFAM; PF00047; ig; 3.
DR PRINTS; PR00014; FNTYPEIII.
DR PRINTS; PR00700; PRTYPHPHTASE.
DR PROSITE; PS00383; TYR_PHOSPHATASE_1; 2.
DR PROSITE; PS50056; TYR_PHOSPHATASE_2; 2.
DR PROSITE; PS50055; TYR_PHOSPHATASE_PTP; 2.
KW Hydrolase; Receptor; Glycoprotein; Signal; Transmembrane;
KW Cell adhesion; Immunoglobulin domain; Duplication.



FT   SIGNAL        1     32
FT   CHAIN        33   2029       PROTEIN-TYROSINE PHOSPHATASE DLAR.
FT   DOMAIN       33   1377       EXTRACELLULAR (POTENTIAL).
FT   TRANSMEM   1378   1402       POTENTIAL.
FT   DOMAIN     1403   2029       CYTOPLASMIC (POTENTIAL).
FT   DOMAIN       50    118       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      154    216       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      249    308       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      320    417       FIBRONECTIN TYPE-III.
FT   DOMAIN      418    512       FIBRONECTIN TYPE-III.
FT   DOMAIN      513    607       FIBRONECTIN TYPE-III.
FT   DOMAIN      608    706       FIBRONECTIN TYPE-III.
FT   DOMAIN      707    809       FIBRONECTIN TYPE-III.
FT   DOMAIN      810    906       FIBRONECTIN TYPE-III.
FT   DOMAIN      907   1007       FIBRONECTIN TYPE-III.
FT   DOMAIN     1008   1102       FIBRONECTIN TYPE-III.
FT   DOMAIN     1103   1207       FIBRONECTIN TYPE-III.
FT   DOMAIN     1492   1738       PROTEIN-TYROSINE PHOSPHATASE.
FT   DOMAIN     1781   2029       PROTEIN-TYROSINE PHOSPHATASE.
FT   ACT_SITE   1670   1670       BY SIMILARITY.
FT   ACT_SITE   1961   1961       BY SIMILARITY.
FT   DISULFID     57    111       POTENTIAL.
FT   DISULFID    161    209       POTENTIAL.
FT   DISULFID    256    301       POTENTIAL.
FT   CARBOHYD    176    176       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    253    253       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    298    298       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    553    553       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    616    616       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    666    666       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    721    721       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    774    774       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    915    915       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    962    962       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD   1183   1183       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD   1304   1304       N-LINKED (GLCNAC...) (POTENTIAL).
SQ   SEQUENCE   2029 AA;  229027 MW;  536A0C794D3DC800 CRC64;

Query Match 13.0%; Score 295.5; DB 1; Length 2029;






Best Local Similarity 23.3%; Pred. No. 2.3e-10;

Matches 127; Conservative 66; Mismatches 180; Indels 173; Gaps 

Qy 1 QIVAQGRTVTFPCETKGNPQPAVFWQKEGSQNLLFPNQPQQPNSRCSV-SPTGDLTITNI 59 

III :| I :hl I:: hi I : II :| I ::| I 

Db 45 QGVRVGGVASFYCAARGDPPPSIVWRKNG KKVSGTQSRYTVLEQPGGISILRI 97 

Qy • 60 Q RSDAGYYICQALTVAGSILAKAQLEVTDVLTDRPP — PHLQGPANQTLAVDG 111 

: I II U I :: I I I : : I: I |:| III : : I 
Db 98 EPVRAGRDDAPY'ECVAENGVGDAVSADMLTIYE-GDKTPAGFPVIIQGPGTRVIEVGH 155 

Qy 112 TALLKCKATGDPLPVISWLKEGFTFPGRDPRATIQEQGTLQIKNLRISDTGTYTCVATSS 171 

I I: III hi I I hi =11 :::: I llhl I I I I III :| 

Db 156 IVLMTCKAIGNPTPSIYWIKNQTKVDMSNPRYSLKD-GFLQIENSREEDQGKYECVAENS 214 



I 



172 SG-EASWSAVLDV- TESGATISKNYDLS 197 

IIIMI I I : I ill 

215 MGTEHSKATNLYVKVRRVPPTFSRPPETISEVMLGSNLNLSCIAVGSPMPHVKWMKGSED 274 

198 DLPGPPSKPQVTDVT 212 

II I: l:::ll 

275 LTPENEMPIGRNVLQLINIQESANYTCIAASTLGQIDSVSWKVQSLPTAPTDVQISEVT 334 

213 KNSVTLSWQPGTPGTLPASAYI IEAFSQSVSNSWQTVANHVKTTLYTVRGLRPNT IYLFM 272 

II I I I I hi: : : I I II I I I II 

335 ATSVRLEWSYKGPEDL--QYYVIQYKPKNANQAFSEISG-JITMYYWRALSPYTEYEFY 391 



Qy 273 VRAINPKVSVTQXKPQKNNGSTWANVPLPPP- - PVQPLPG- TELEHYAVEQQENGYDSD- 328 

I hi hill I h:| I I 

Db 392 VIAVN NIGRGPPSAPATCTTGETKMESAPRNVQVRTLSSST 432 

Qy 329 — SWCPPLPVQTYLHQGLEDELEEDDDRVPTPPVRGV ASSPAISFGQQ — 374 

:| II III :: I h I 

Db 433 MVITWEPP ETPNGQVTGYKVYYTTNSNQPEASWNSQMVDN 472 

Qy 375 — STATLTP — SPREEMQPMLQASPXFT — SSQRPRPTSPF — STDSNTSAAL 419 

: : :H : I : : I I I :|: h I :|| :| 
Db 473 SELTTVSDVTPHAIYTVRVQAYTSMGAGPMSTPVQVKAQQGVPSQPSNFRATDIGETAVT 532 



I Qy 420 SQSQRP 425 

I :| 

Db 533 LQWTKP 538 



RESULT 5
NCM2_MOUSE
ID NCM2_MOUSE STANDARD; PRT; 837 AA.
AC O35136; O35962;
DT 15-JUL-1998 (Rel. 36, Created)
DT 15-JUL-1998 (Rel. 36, Last sequence update)
DT 15-JUL-1998 (Rel. 36, Last annotation update)
DE NEURAL CELL ADHESION MOLECULE 2 PRECURSOR (N-CAM 2) (RB-8 NEURAL CELL
DE ADHESION MOLECULE) (R4B12).
GN NCAM2 OR OCAM OR RNCAM.
OS Mus musculus (Mouse).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
RN [1]
RP SEQUENCE FROM N.A. (LONG AND SHORT FORMS).
RC STRAIN=BALB/C; TISSUE=OLFACTORY NEUROEPITHELIUM;
RX MEDLINE=97368238; PubMed=9221781;
RA Yoshihara Y., Kawasaki M., Tamada A., Fujita H., Hayashi H.,
RA Kagamiyama H., Mori K.;
RT "OCAM: A new member of the neural cell adhesion molecule family
RT related to zone-to-zone projection of olfactory and vomeronasal
RT axons.";
RL J. Neurosci. 17:5830-5842(1997).
RN [2]
RP SEQUENCE FROM N.A. (SHORT FORM).
RC STRAIN=C57BL/6J; TISSUE=OLFACTORY EPITHELIUM;
RX MEDLINE=97476194; PubMed=9334170;
RA Alenius M., Bohm S.;
RT "Identification of a novel neural cell adhesion molecule-related gene
RT with a potential role in selective axonal projection.";
RL J. Biol. Chem. 272:26083-26086(1997).


CC -!- FUNCTION: MAY PLAY IMPORTANT ROLES IN SELECTIVE FASCICULATION AND
CC ZONE-TO-ZONE PROJECTION OF THE PRIMARY OLFACTORY AXONS.
CC -!- SUBCELLULAR LOCATION: TYPE I MEMBRANE PROTEIN (LONG FORM) AND
CC ATTACHED TO THE MEMBRANE BY A GPI-ANCHOR (SHORT FORM).
CC -!- TISSUE SPECIFICITY: EXPRESSED IN SUBSETS OF BOTH OLFACTORY AND
CC VOMERONASAL NEURONS IN A ZONE-SPECIFIC MANNER.
CC -!- SIMILARITY: CONTAINS 5 IMMUNOGLOBULIN-LIKE C2-TYPE DOMAINS.
CC -!- SIMILARITY: CONTAINS 2 FIBRONECTIN TYPE III-LIKE DOMAINS.
CC -
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC -


DR EMBL; AF001287; AAB69125.1; -.
DR EMBL; AF001286; AAB69124.1; -.
DR EMBL; AF016619; AAC53375.1; -.
DR MGD; MGM095738; OCAM.
DR INTERPRO; IPR001777; -.
DR INTERPRO; IPR003006; -.
DR PFAM; PF00041; fn3; 2.
DR PFAM; PF00047; ig; 5.
KW Cell adhesion; Transmembrane; Glycoprotein; Repeat;
KW Immunoglobulin domain; Signal; GPI-anchor; Alternative splicing.




FT   SIGNAL        1     19       POTENTIAL.
FT   CHAIN        20    837       NEURAL CELL ADHESION MOLECULE 2.
FT   DOMAIN       20    697       EXTRACELLULAR (POTENTIAL).
FT   TRANSMEM    698    718       POTENTIAL.
FT   DOMAIN      719    837       CYTOPLASMIC (POTENTIAL).
FT   DOMAIN       35    100       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      129    193       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      225    288       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      315    387       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      415    481       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      482    581       FIBRONECTIN TYPE-III.
FT   DOMAIN      594    678       FIBRONECTIN TYPE-III.
FT   DISULFID     42     93       PROBABLE.
FT   DISULFID    136    186       PROBABLE.
FT   DISULFID    232    281       PROBABLE.
FT   DISULFID    322    380       PROBABLE.
FT   DISULFID    422    475       PROBABLE.
FT   CARBOHYD    177    177       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    219    219       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    309    309       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    406    406       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    419    419       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    445    445       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    474    474       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    562    562       N-LINKED (GLCNAC...) (POTENTIAL).
FT   VARSPLIC    694    837       TLFNGLGLGAIIGLGVAALLLILWTDVSCFFIRQCGLLMC
FT                                ITRRMCGKKSGSSGKSKELEEGKAAYLKDGSKEPIVEMRTE
FT                                DERITNHEDGSPVNEPNETTPLTEPEKLPLKEENGKEVLNA
FT                                ETIEIKVSNDIIQSKEDDIKA -> NCCEANKGENGGQSWH
FT                                LNAVGFTFVITMSLSCLF (IN GPI-ANCHORED
FT                                ISOFORM).
SQ   SEQUENCE   837 AA;  93203 MW;  70473B053A2D65A5 CRC64;



Query Match 12.8%; Score 292.5; DB 1; Length 837;
Best Local Similarity 21.7%; Pred. No. 1.2e-10;
Matches 101; Conservative 74; Mismatches 139; Indels 151; Gaps 17;

Qy 5 QGRTVTFPCETKGNPQPAVFWQKEGSQNLLFPNQP— QQPNSRCSVSPTGDLTITNIQR 61 

:| :l l: ; hi I : I : I h h : h =lh II 

Db 224 RGEEMTLTCKASGSPDPT ISWFRNG - - KLIEENEKYILKGSNT ELTVRNIIN 273 

Qy 62 SDAGYYICQALTVAGSILAKAQLEVTDVLTDRPPPIILQGPANQTLAVDGTALLKCKATG 121 

I I hhl' II :| hi I III hi : :! I hi I 






Db 274 KDGGSYVCKATNKAGEDQKQAFLQVF VQPHILQ - LKNETTSENGHVTLVCEAEG 326 

Qy 122 DPLPVISWLK- -EGFTFPGRD- -PRATIQEQG TLQIKNLRISDTGTYTCVATSSS 172 

:hl hi : :| I II |: :| :| |:::::||:| I I I I 
Db 327 EPVPEITWRRAIDGVMFSEGDRSPDGRIEVKGQHGRSSLHIRDVKLSDSGRYDCEAASRI 386 

Qy 173 GEASWSAVLDVTESGAT IS KN- 193 

I I II: : :| II 
Db 387 GGHQRSMHLDIEYAPKFVSNQTMYYSWEGNPINISCDVTANPPASIHWRREKLLLPAKNT 446 

I QY 194 YDLSDLPGPPSKPQVTD 210 

I :|:|:| I : 

Db 447 THLKTHSVGRKMILEIAPTSDNDFGRYNCTATNRIGTRFQEYILEIADVPSSPHGVKIIE 506 

Qy 211 VTRNSVTLSW-QPGTPGTLPASAYIIEAFSQSVSNSWQTVAtlHVKTTLYTVRGLRPNTIY 269 

: :|: :| : I :| I :: : | :|: | :| |: : | ||| | 
Db 507 LSQTTAKISFNKPESHGGVPIHHYQVDV-KEVASEIWKIVRSHGVQTMWLSSLEPNTTY 565 

Qy *270 LFMVRAIN PKVSVTQXK PQKNNG STWANVPLPPPPV - - -QPLPGTELEHYAVE 319 

I hi * h : I I : I II II I : : 

Db 566 EIRVAAVNGRGQGDYSRIEIFQTLPVRE PSPPSIHGQPSSGKSFKISITK 615 

H 320 QQENG YDSDSWC PPLPVQTYL HQGLEDELEEDDDRV 355 

W I : I hi: I II ::: : I : 

Db 616 QDDGG APILEYIVKYRSKDKEDQWLEKKVQGNKDHI 651 



RESULT 6 
NEO1_RAT
ID NEO1_RAT STANDARD; PRT; 1377 AA.
AC P97603;
DT 01-OCT-2000 (Rel. 40, Created)
DT 01-OCT-2000 (Rel. 40, Last sequence update)
DT 01-OCT-2000 (Rel. 40, Last annotation update)
DE NEOGENIN PRECURSOR (FRAGMENT).
GN NEO1 OR NGN.
OS Rattus norvegicus (Rat).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Rattus.
RN [1]
RP SEQUENCE FROM N.A.
RC TISSUE=BRAIN;
RX MEDLINE=97015074; PubMed=8861902;
RA Keino-Masu K., Masu M., Hinck L., Leonardo E.D., Chan S.S.-Y.,
RA Culotti J.G., Tessier-Lavigne M.;
RT "Deleted in Colorectal Cancer (DCC) encodes a netrin receptor.";
RL Cell 87:175-185(1996).
CC -!- FUNCTION: MAY BE INVOLVED AS A REGULATORY PROTEIN IN THE
CC TRANSITION OF UNDIFFERENTIATED PROLIFERATING CELLS TO THEIR
CC DIFFERENTIATED STATE. MAY ALSO FUNCTION AS A CELL ADHESION
CC MOLECULE IN A BROAD SPECTRUM OF EMBRYONIC AND ADULT TISSUES.
CC -!- SUBCELLULAR LOCATION: TYPE I INTEGRAL MEMBRANE PROTEIN.
CC -!- SIMILARITY: CONTAINS 4 IMMUNOGLOBULIN-LIKE C2-TYPE DOMAINS.
CC -!- SIMILARITY: CONTAINS 6 FIBRONECTIN TYPE III-LIKE DOMAINS.
CC -!- SIMILARITY: BELONGS TO THE IMMUNOGLOBULIN SUPERFAMILY. STRONG, TO
CC TUMOR SUPPRESSOR PROTEIN DCC.
CC -
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC -
DR EMBL; U68726; AAB41100.1; -.
DR HSSP; P56276; 1TLK.
DR INTERPRO; IPR001777; -.
DR INTERPRO; IPR003006; -.
DR PFAM; PF00041; fn3; 6.
DR PFAM; PF00047; ig; 4.
KW Transmembrane; Immunoglobulin domain; Glycoprotein; Signal.
FT   NON_TER       1      1



FT   SIGNAL       <1      2       POTENTIAL.
FT   CHAIN         3   1377       NEOGENIN.
FT   DOMAIN        3   1074       EXTRACELLULAR (POTENTIAL).
FT   TRANSMEM   1075   1095       POTENTIAL.
FT   DOMAIN     1096   1377       CYTOPLASMIC (POTENTIAL).
FT   DOMAIN       36    105       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      135    197       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      232    296       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN                       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN                       FIBRONECTIN TYPE-III.
FT   DOMAIN      505    598       FIBRONECTIN TYPE-III.
FT   DOMAIN      599    698       FIBRONECTIN TYPE-III.
FT   DOMAIN      704    798       FIBRONECTIN TYPE-III.
FT   DOMAIN      819    919       FIBRONECTIN TYPE-III.
FT   DOMAIN      920   1021       FIBRONECTIN TYPE-III.
FT   DOMAIN     1087   1090       POLY-VAL.
FT   CARBOHYD     42     42       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    179    179       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    295    295       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    439    439       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    458    458       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    608    608       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    684    684       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    878    878       N-LINKED (GLCNAC...) (POTENTIAL).
SQ   SEQUENCE   1377 AA;  150637 MW;  E514ED8ABD1A63A9 CRC64;



Query Match 12.5%; Score 284.5; DB 1; Length 1377;

Best Local Similarity 26.3%; Pred. No. 6.5e-10; 

Matches 91; Conservative 55; Mismatches 153; Indels 47; Gaps 11 

Qy 6 GRTVTFPCETKGNPQPAVFWQKEGSQNLLFPNQPQQPNSRCSVSPTGDLTITNIQRSDAG 65 

h: II .llhll ::::1 : : I :: II I::: III 
Db 232 GQSAVLPCVASGLPAPVIRWMK - - NEDVL — DTESSGRLALLAGGSLEI SDVTEDDAG 285 

Qy 66 YYICQALTVAGSILAKAQLEVTDVLTDRPPPIILQGPANQTLAVDGTALLKCKATGDPLP 125

III' : I : I : I I : || |: III : :|: II I I 

Db 286 TYFCVADNGNKT IEAQAELTV QVPPEFLRQPANIYARESMDIVFECEVTGKPAP 339 

Qy 126 VISWLKEGFTFPGRDPRATIQEQGTLQIKNLRISDTGTYTCVATSSSGEASWSAVLDVTE 185 

: hi I I ::| lh I II I I hi : I I I I : I 
Db 340 TVKWVKNGDWIPSDYFKIVREH-NLQVLGLVKSDEGFYQCIAENDVGNAQAGAQLIILE 398 

Qy 186 SGATISKNYDLSDLPGPPSKPQVTDVTKNSVTLSWQPGTPGTLPASAYIIEAFSQSVSNS 245

: ' II I : h : hh II : I : : II : 
Db 399 HAPATT GPLPSAPRDWASLVSTRFIKLTWR--TPASDPHG — DNLTYSVFYT 447 

Qy 246 WQTVA-NHVKTT LYTVRGLRPNTIYLFMVRAINPKVSVTQXKPQKNNGSTWANV 298

: II hi h: I I I : I : I I II I I : : 

Db 448 REGVARERVENTSQPGEMQVTIQNLMPATVYIFKVMAQNKHGSGESSAPLRVETQPEVQL 507 

Qy 299 PLPPPPVQ-- PLPGT - ELEHYAVEQQENG YDSD 328 

III:: 1 . Ill l:::l : III: 

Db 508 PGPAPNIRAYATSPTSITVTWETPLSGNGEIQNYRLYYMERGTDKE 553 



RESULT 7
NEO1_CHICK
ID NEO1_CHICK STANDARD; PRT; 1443 AA.
AC Q90610;
DT 01-OCT-2000 (Rel. 40, Created)
DT 01-OCT-2000 (Rel. 40, Last sequence update)
DT 01-OCT-2000 (Rel. 40, Last annotation update)
DE NEOGENIN (FRAGMENT).
OS Gallus gallus (Chicken).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Archosauria; Aves; Neognathae; Galliformes; Phasianidae; Phasianinae;
OC Gallus.
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=WHITE LEGHORN; TISSUE=EMBRYONIC BRAIN;
RX MEDLINE=95105243; PubMed=7806578;
RA Vielmetter J., Roman J.M., Dreyer W.J.;






RT "Neogenin, an avian cell surface protein expressed during terminal
RT neuronal differentiation, is closely related to the human tumor
RT suppressor molecule deleted in colorectal cancer.";
RL J. Cell Biol. 127:2009-2020(1994).
CC -!- FUNCTION: MAY BE INVOLVED AS A REGULATORY PROTEIN IN THE
CC TRANSITION OF UNDIFFERENTIATED PROLIFERATING CELLS TO THEIR
CC DIFFERENTIATED STATE. MAY ALSO FUNCTION AS A CELL ADHESION
CC MOLECULE IN A BROAD SPECTRUM OF EMBRYONIC AND ADULT TISSUES.
CC -!- SUBCELLULAR LOCATION: TYPE I INTEGRAL MEMBRANE PROTEIN.
CC -!- DEVELOPMENTAL STAGE: IN RETINA, EXPRESSED ON GANGLION CELL FIBERS
CC AS SOON AS THEY BEGIN TO EXTEND THEIR AXONS.
CC -!- SIMILARITY: CONTAINS 4 IMMUNOGLOBULIN-LIKE C2-TYPE DOMAINS.
CC -!- SIMILARITY: CONTAINS 6 FIBRONECTIN TYPE III-LIKE DOMAINS.
CC -!- SIMILARITY: BELONGS TO THE IMMUNOGLOBULIN SUPERFAMILY. STRONG, TO
CC TUMOR SUPPRESSOR PROTEIN DCC.
CC -
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC -
DR EMBL; U07644; AAC59662.1; -.
DR HSSP; P80362; 1WTL.
DR INTERPRO; IPR001777; -.
DR INTERPRO; IPR003006; -.
DR PFAM; PF00041; fn3; 6.
DR PFAM; PF00047; ig; 4.
KW Transmembrane; Immunoglobulin domain; Glycoprotein.



FT   NON_TER       1      1
FT   DOMAIN       <1   1090       EXTRACELLULAR (POTENTIAL).
FT   TRANSMEM   1091   1111       POTENTIAL.
FT   DOMAIN     1112   1443       CYTOPLASMIC (POTENTIAL).
FT   DOMAIN       33    102       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      132    194       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      229    293       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      321    383       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      422    519       FIBRONECTIN TYPE-III.
FT   DOMAIN      522    615       FIBRONECTIN TYPE-III.
FT   DOMAIN      616    714       FIBRONECTIN TYPE-III.
FT   DOMAIN      720    814       FIBRONECTIN TYPE-III.
FT   DOMAIN      835    935       FIBRONECTIN TYPE-III.
FT   DOMAIN      936   1037       FIBRONECTIN TYPE-III.
FT   DISULFID     40     95       BY SIMILARITY.
FT   DISULFID    139    187       BY SIMILARITY.
FT   DISULFID    236    286       BY SIMILARITY.
FT   DISULFID    328    376       BY SIMILARITY.
FT   CARBOHYD     39     39       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    176    176       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    292    292       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    456    456       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    475    475       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    625    625       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    700    700       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    894    894       N-LINKED (GLCNAC...) (POTENTIAL).
SQ   SEQUENCE   1443 AA;  158050 MW;  558C6795579C0E26 CRC64;



Query Match 12.4%; Score 282; DB 1; Length 1443;

Best Local Similarity 24.0%; Pred. No. 9.7e-10; 

Matches 112; Conservative 63; Mismatches 208; Indels 84; Gaps 15; 

Qy 6 GRTVTFPCETKGNPQPAVFWQKEGSQNLLFPNQPQQPNSRCSVSPTGDLTITNIQRSDAG 65 

I: III I I I M I I : : : : I :: II |::: I I 
Db 229 GQNAVFPCVAGGFPTPYVRWTKNGEELI TEDSERFALRAGGSLLISDVTEEDVG 282 

Qy 66 YYICQALTVAGSILAKAQLEVTDVLTDRPPPIILQGPANQTLAVDGTALLKCKATGDPLP 125

III :| 1:1:1 I : II I: III • : :|: II I I 
Db 283 TYTCIADNENETIEAQAELAV QVPPEFLKRPANIYAHESMDIVFECEVTGKPTP 336 

Qy 126 VISWLKEGFTFPGRDPRATIQEQGTLQIKNLRISDTGTYTCVATSSSGEASWSAVLDVTE 185 



Db 


337 


Qy 


186 


Db 


396 


Qy 


231 


Db 


454 


Qy 


288 


Db 


514 


Qy 


332 


Db 


572 


Qy 


387 


Db 


620 



I ::| II: I II I I 1=1 : I I I I : 



- - SGAT ISKNYDLS - - 

i I : I I: 



■ -DLPGPPSKPQVTDVTKNSVTLSWQPGTPGTLPA 230 
III II:: 1:1: II : I 



I hhl I I I 



■ - PLPGT - ELEHYAVEQQENG YDSDSWC 331 
II I ::: : I I ||: 



PPVRGVA- - - SSPAISFGQQSTATLTPSPREE 386 

I II I :| II: I 

■ -FRWAYNKHGPGVSTQDVWRTLSDVP— 619 



1:1 
•-SAAP" 



: :: I :: :| I I I 
- -QNLTLEARNSKS IMLHWQPPPAGTHSG 650 



ID NEO1_MOUSE STANDARD; PRT; 1493 AA.
AC P97798;
DT 01-OCT-2000 (Rel. 40, Created)
DT 01-OCT-2000 (Rel. 40, Last sequence update)
DT 01-OCT-2000 (Rel. 40, Last annotation update)
DE NEOGENIN PRECURSOR.
GN NEO1 OR NGN.
OS Mus musculus (Mouse).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
RN [1]
RP SEQUENCE FROM N.A., AND ALTERNATIVE SPLICING.
RC TISSUE=BRAIN;
RX MEDLINE=97407661; PubMed=9264410;
RA Keeling S.L., Gad J.M., Cooper H.M.;
RT "Mouse neogenin, a DCC-like molecule, has four splice variants and is
RT expressed widely in the adult mouse and during embryogenesis.";
RL Oncogene 15:691-700(1997).
CC -!- FUNCTION: MAY BE INVOLVED AS A REGULATORY PROTEIN IN THE
CC TRANSITION OF UNDIFFERENTIATED PROLIFERATING CELLS TO THEIR
CC DIFFERENTIATED STATE. MAY ALSO FUNCTION AS A CELL ADHESION
CC MOLECULE IN A BROAD SPECTRUM OF EMBRYONIC AND ADULT TISSUES.
CC -!- SUBCELLULAR LOCATION: TYPE I INTEGRAL MEMBRANE PROTEIN.
CC -!- ALTERNATIVE PRODUCTS: AT LEAST 5 ISOFORMS; 1 (SHOWN HERE), 2, 3, 4
CC AND 5; ARE PRODUCED BY ALTERNATIVE SPLICING. THE EXPRESSION OF
CC ISOFORMS 3, 4 AND 5 ARE DEVELOPMENTALLY REGULATED.
CC -!- TISSUE SPECIFICITY: WIDELY EXPRESSED.
CC -!- DEVELOPMENTAL STAGE: EXPRESSED UBIQUITOUSLY THROUGHOUT THE MID TO
CC LATE STAGES OF GESTATION AND IN ADULT TISSUES. STRONG EXPRESSION
CC IS OBSERVED IN THE VENTRAL REGION OF THE VENTRICULAR ZONE OF THE
CC E15.5 MOUSE NEURAL TUBE, AS WELL AS IN THE VENTRICULAR ZONES OF
CC THE MESENCEPHALON AND RHOMBENCEPHALON. ISOFORMS 3 AND 4 ARE
CC EXPRESSED AT HIGHER LEVEL COMPARED TO OTHER ISOFORMS BETWEEN E11.5
CC AND E16.5.
CC -!- SIMILARITY: CONTAINS 4 IMMUNOGLOBULIN-LIKE C2-TYPE DOMAINS.
CC -!- SIMILARITY: CONTAINS 6 FIBRONECTIN TYPE III-LIKE DOMAINS.
CC -!- SIMILARITY: BELONGS TO THE IMMUNOGLOBULIN SUPERFAMILY. STRONG, TO
CC TUMOR SUPPRESSOR PROTEIN DCC.
CC -
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC -






DR EMBL; Y09535; CAA70727.1; -.
DR HSSP; P02751; 1TTG.
DR MGD; MGI:1097159; NEO1.
DR INTERPRO; IPR001777; -.
DR INTERPRO; IPR003006; -.
DR PFAM; PF00041; fn3; 6.
DR PFAM; PF00047; ig; 4.
DR PRINTS; PR00014; FNTYPEIII.
KW Transmembrane; Immunoglobulin domain; Glycoprotein; Signal;
KW Alternative splicing.



FT   SIGNAL        1     36       POTENTIAL.
FT   CHAIN        37   1493       NEOGENIN.
FT   DOMAIN       37   1136       EXTRACELLULAR (POTENTIAL).
FT   TRANSMEM   1137   1157       POTENTIAL.
FT   DOMAIN     1158   1493       CYTOPLASMIC (POTENTIAL).
FT   DOMAIN       78    147       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      177    239       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      274    338       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      366    428       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      467    564       FIBRONECTIN TYPE-III.
FT   DOMAIN      567    660       FIBRONECTIN TYPE-III.
FT   DOMAIN      661    760       FIBRONECTIN TYPE-III.
FT   DOMAIN      766    860       FIBRONECTIN TYPE-III.
FT   DOMAIN      881    981       FIBRONECTIN TYPE-III.
FT   DOMAIN      982   1083       FIBRONECTIN TYPE-III.
FT   DOMAIN     1149   1153       POLY-VAL.
FT   CARBOHYD     84     84       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    221    221       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    337    337       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    501    501       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    520    520       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    670    670       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    746    746       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    940    940       N-LINKED (GLCNAC...) (POTENTIAL).
FT   VARSPLIC    442    461       MISSING (IN ISOFORM 2).
FT   VARSPLIC    863    878       MISSING (IN ISOFORM 3).
FT   VARSPLIC   1086   1096       MISSING (IN ISOFORM 4).
FT   VARSPLIC   1279   1331       MISSING (IN ISOFORM 5).
SQ   SEQUENCE   1493 AA;  163159 MW;  441DE919D5E17C0E CRC64;



Query Match 12.2%; Score 279; DB 1; Length 1493;
Best Local Similarity 24.3%; Pred. No. 1.5e-09;
Matches 112; Conservative 66; Mismatches 191; Indels 92; Gaps 18; 

Qy 6 GRTVTFPCETKGNPQPAVFWQKEGSQNLLFPNQPQQPNSRCSVSPTGDLTITNIQRSDAG 65 

h: II I I I I I I :: : : I : II |::: III 

Db 274 GQSAVLPCWSGLPAPWRWMK--NEEVL----DTESSGRLVLLAGGCLEISDVTEDDAG 327

Qy 66 YYICQALTVAGSILAKAQLEVTDVLTDRPPPIILQGPANQTLAVDGTALLKCKATGDPLP 125
I I : I: :ll I II : II h III : :|: II I I

Db 328 TYFC--IADNGNKTVEAQAE--LTVQVPPGFLKQPANIYAHESMDIVFECEVTGKPTP 381

Qy 126 VISWLKEGFTFPGRDPRATIQEQGTLQIKNLRISDTGTYTCVATSSSGEASWSAVLDVTE 185 

: hi I I ::| 1 1 : I 1 1 I I I : I : I I I I : I 
Db 382 TVKWVKNGDWIPSDNFKIVKEH-NLQVLGLVKSDEGFYQCIAENDVGNAQAGAQLIILE 440

Qy 186 SGATISKNYDLSDLPGPPSKPQVTDVTKNSVTLSWQPGTPGTLPASAYIIEA---FSQSV 242

:h: II: :| I : : III ||:: ; I :: : 
Db ( 441 HDVAIPTLPPT-SLTSATTDHLA— -PATTGPLPSAPRDWASLVSTRFI 486 

Qy 243 SNSWQTVAN--HVKTTLY-------------------------TVRGLRPNTIYLFMVRA 275

MM I: I I |:: I I |:|:| I I 

Db 487 KLTWRTPASDPHGDNLTYSVFYTKEGTORERVENTSQPGEMQVTIQNLMPATVYIFKVMA 546 

Qy 276 INPKVSVTQXKPQKNNGSTWANVPLPPPPVQ---------------PLPGT-ELEHYAVE 319

II I : MM:: || | |:::| : 

Db 547 QNKHGSGESSAPLRVETQPEVQLPGPAPNIRAYATSPTSITVTWETPLSGNGEIQNYKLY 606 

Qy 320 QQENGYDSDSWCPPLPVQTYLHQGLEDELEEDDDRVPTPPVRGVA---SSPAISFGQQST 376

.Ml: : :| II: I I II I M : 

Db 607 YMEKGTDKEQDI -DVSSHSYTINGLKKYTEYS FRWAYNKHGPGVSTQDVAV 657 



Qy 377 ATLTPSPREEMQPM---LQASPXFTSSQRPRPTSPFSTDSN 414

II: I M : :: I M |- II I
Db 658 RTLSDVPSAAPQNLSLEVRNSKSIVIHWQP----PSSTTQN 694



RESULT 9 
CAML_HUMAN
ID CAML_HUMAN STANDARD; PRT; 1257 AA.
AC P32004;
DT 01-JUL-1993 (Rel. 26, Created)
DT 01-OCT-1996 (Rel. 34, Last sequence update)
DT 01-OCT-2000 (Rel. 40, Last annotation update)
DE NEURAL CELL ADHESION MOLECULE L1 PRECURSOR (N-CAM L1).
GN L1CAM OR CAML1 OR MIC5.
OS Homo sapiens (Human).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
RN [1]
RP SEQUENCE FROM N.A.
RX MEDLINE=92031698; PubMed=1932117;
RA Kobayashi M., Miura M., Asou H., Uyemura K.;
RT "Molecular cloning of cell adhesion molecule L1 from human nervous
RT tissue: a comparison of the primary sequences of L1 molecules of
RT different origin.";
RL Biochim. Biophys. Acta 1090:238-240(1991).



RN [2]

LA., Coutelle O., Drescher B.;

RL Submitted (APR-1994) to the EMBL/GenBank/DDBJ databases. 

RN [3] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=92329299; PubMed=1627459;
RA Reid R.A., Hemperly J.J.;
RT "Variants of human L1 cell adhesion molecule arise through alternate
RT splicing of RNA.";
RL J. Mol. Neurosci. 3:127-135(1992).
RN [4]
RP SEQUENCE OF 353-1176 FROM N.A.
RX MEDLINE=92020233; PubMed=1923824;
RA Rosenthal A., Mackinnon R.N., Jones D.S.C.;
RT "PCR walking from microdissection clone M54 identifies three exons
RT from the human gene for the neural cell adhesion molecule L1
RT (CAM-L1).";
RL Nucleic Acids Res. 19:5395-5401(1991).
RN [5]
RP SEQUENCE OF 332-371 FROM N.A.
RX MEDLINE=90353957; PubMed=2387585;
RA Djabali M., Mattei M.-G., Nguyen C., Roux D., Demengeot J.,
RA Denizot F., Moos M., Schachner M., Goridis C., Jordan B.R.;
RT "The gene encoding L1, a neural adhesion molecule of the
RT immunoglobulin family, is located on the X chromosome in mouse and
RT man.";
RL Genomics 7:587-593(1990).
RN [6]
RP SEQUENCE OF 1030-1257 FROM N.A.
RX MEDLINE=91132183; PubMed=1993895;
RA Harper J.R., Prince J.T., Healy P.A., Stuart J.K., Nauman S.J.,
RA Stallcup R.B.;
RT "Isolation and sequence of partial cDNA clones of human L1: homology
RT of human and rodent L1 in the cytoplasmic region.";
RL J. Neurochem. 56:797-804(1991).
RN [7]
RP SEQUENCE OF 20-36.
RX MEDLINE=88298876; PubMed=3136168;
RA Wolff J.M., Frank R., Mujoo K., Spiro R.C., Reisfeld R.A.,
RA Rathjen F.G.;
RT "A human brain glycoprotein related to the mouse cell adhesion
RT molecule L1.";
RL J. Biol. Chem. 263:11943-11947(1988).

RN [8] 

RP VARIANT HSAS TYR-264. 

RX MEDLINE=94004956; PubMed=8401576;
RA Jouet M., Rosenthal A., Macfarlane J., Kenwrick S., Donnai D.;
RT "A missense mutation confirms the L1 defect in X-linked hydrocephalus
RT (HSAS).";
RL Nat. Genet. 4:331-331(1993).
RN [9]
RP VARIANT HSAS/MASA LEU-1194.
RX MEDLINE=95187172; PubMed=7881431;
RA Fransen E., Schrander-Stumpel C., Vits L., Coucke P., van Camp G.,
RA Willems P.J.;
RT "X-linked hydrocephalus and MASA syndrome present in one family are
RT due to a single missense mutation in exon 28 of the L1CAM gene.";
RL Hum. Mol. Genet. 3:2255-2256(1994).

RN [10] 

RP VARIANTS HSAS GLN-184 AND ARG-452, AND VARIANT MASA GLN-210. 

RX MEDLINE=95004608; PubMed=7920659;
RA Jouet M., Rosenthal A., Armstrong G., Macfarlane J., Stevenson R.,
RA Paterson J., Metzenberg A., Ionasescu V., Temple L., Kenwrick S.;
RT "X-linked spastic paraplegia (SPG1), MASA syndrome and X-linked
RT hydrocephalus result from mutations in the L1 gene.";
RL Nat. Genet. 7:402-407(1994).
RN [11]
RP VARIANTS MASA GLN-210 AND ASN-598.
RX MEDLINE=95004609; PubMed=7920660;
RA Vits L., van Camp G., Coucke P., Fransen E., de Boulle K.,
RA Reyniers E., Korn B., Poustka A., Wilson G., Schrander-Stumpel C.,
RA Winter R.M., Schwartz C., Willems P.J.;
RT "MASA syndrome is due to mutations in the neural cell adhesion gene
RT L1CAM.";
RL Nat. Genet. 7:408-413(1994).
RN [12]
RP VARIANTS HSAS/MASA S-9; S-121; K-309; F-768; L-941 AND C-1070.
RX MEDLINE=95282776; PubMed=7762552;
RA Jouet M., Moncla A., Paterson J., McKeown C., Fryer A., Carpenter N.,
RA Holmberg E., Wadelius C., Kenwrick S.;
RT "New domains of neural cell-adhesion molecule L1 implicated in
RT X-linked hydrocephalus and MASA syndrome.";
RL Am. J. Hum. Genet. 56:1304-1314(1995).

RN [13]
RP VARIANTS HSAS/MASA Q-184; Q-210; Y-264; R-452; N-598 AND L-1194.
RX MEDLINE=96153146; PubMed=8556302;
RA Fransen E., Lemmon V., van Camp G., Vits L., Coucke P., Willems P.J.;
RT "CRASH syndrome: clinical spectrum of corpus callosum hypoplasia,
RT retardation, adducted thumbs, spastic paraparesis and hydrocephalus
RT due to mutations in one single gene, L1.";
RL Eur. J. Hum. Genet. 3:273-284(1995).
RN [14]
RP ERRATUM.
RA Fransen E., Lemmon V., van Camp G., Vits L., Coucke P., Willems P.J.;
RL Eur. J. Hum. Genet. 4:126-126(1996).
RN [15]
RP VARIANTS HSAS/MASA/SPG1 SER-179 AND ARG-370.
RX MEDLINE=96057511; PubMed=7562969;
RA Ruiz J.C., Cuppens H., Legius E., Fryns J.-P., Glover T., Marynen P.,
RA Cassiman J.-J.;
RT "Mutations in L1-CAM in two families with X linked complicated
RT spastic paraplegia, MASA syndrome, and HSAS.";
RL J. Med. Genet. 32:549-552(1995).
RN [16]
RP VARIANTS HSAS CYS-194 AND LEU-240.
RX MEDLINE=97083370; PubMed=8929944;
RA Gu S.-M., Orth U., Veske A., Enders H., Kluender K., Schloesser M.,
RA Engel W., Schwinger E., Gal A.;
RT "Five novel mutations in the L1CAM gene in families with X linked
RT hydrocephalus.";
RL J. Med. Genet. 33:103-106(1996).
RN [17]
RP VARIANTS HSAS Q-184; V-439--T-443 DEL; C-784 AND L-936-L-948 DEL.
RX MEDLINE=97338664; PubMed=9195224;
RA Macfarlane J.R., Du J.-S., Pepys M.E., Ramsden S., Donnai D.,
RA Charlton R., Garrett C., Tolmie J., Yates J.R.W., Berry C., Goudie D.,
RA Moncla A., Lunt P., Hodgson S., Jouet M., Kenwrick S.;
RT "Nine novel L1 CAM mutations in families with X-linked
RT hydrocephalus.";
RL Hum. Mutat. 9:512-518(1997).

RN [18] 

RP VARIANTS HSAS/MASA ASP-691; ARG-698 AND PRO-935. 

RX MEDLINE=98180721; PubMed=9521424;
RA Du Y.-Z., Srivastava A.K., Schwartz C.E.;
RT "Multiple exon screening using restriction endonuclease
RT fingerprinting (REF): detection of six novel mutations in the L1 cell
RT adhesion molecule (L1CAM) gene.";
RL Hum. Mutat. 11:222-230(1998).
RN [19]
RP VARIANT CRASH PRO-632.
RX MEDLINE=98112489; PubMed=9452110;
RA Vits L., Chitayat D., van Camp G., Holden J.J.A., Fransen E.,
RA Willems P.J.;
RT "Evidence for somatic and germline mosaicism in CRASH syndrome.";
RL Hum. Mutat. Suppl. 1:S284-S287(1998).
RN [20]
RP VARIANTS HSAS/MASA THR-219; ARG-335; CYS-386; CYS-473 AND LEU-1224.
RX MEDLINE=98415726; PubMed=9744477;
RA Saugier-Veber P., Martin C., le Meur N., Lyonnet S., Munnich A.,
RA David A., Henocq A., Heron D., Jonveaux P., Odent S., Manouvrier S.,
RA Moncla A., Morichon N., Philip N., Satge D., Tosi M., Frebourg T.;
RT "Identification of novel L1CAM mutations using fluorescence-assisted
RT mismatch analysis.";
RL Hum. Mutat. 12:259-266(1998).

CC -!- FUNCTION: CELL ADHESION MOLECULE WITH AN IMPORTANT ROLE IN THE
CC DEVELOPMENT OF THE NERVOUS SYSTEM. INVOLVED IN NEURON-NEURON
CC ADHESION, NEURITE FASCICULATION, OUTGROWTH OF NEURITES, ETC. BINDS
CC TO AXONIN ON NEURONS.
CC -!- SUBCELLULAR LOCATION: TYPE I MEMBRANE PROTEIN.
CC -!- ALTERNATIVE PRODUCTS: TWO ISOFORMS IN THE CYTOPLASMIC REGION ARE
CC PRODUCED BY DIFFERENTIAL SPLICING.
CC -!- DISEASE: DEFECTS IN L1CAM ARE THE CAUSE OF THREE X-LINKED
CC SYNDROMES. 1: HYDROCEPHALUS OWING TO STENOSIS OF THE AQUEDUCT OF
CC SYLVIUS (HSAS) CHARACTERIZED BY MENTAL RETARDATION AND ENLARGED
CC BRAIN VENTRICLES. 2: MASA SYNDROME WHICH IS CHARACTERIZED BY
CC MENTAL RETARDATION, APHASIA, SHUFFLING GAIT, AND ADDUCTED THUMBS.
CC HAS AN OVERLAPPING PROFILE OF CLINICAL SIGNS WITH HSAS, BUT WITH A
CC MILDER PRESENTATION AND A LONGER LIFE EXPECTANCY. 3: SPASTIC
CC PARAPLEGIA TYPE 1 (SPG1). COLLECTIVELY THESE SYNDROMES ARE ALSO
CC KNOWN AS CRASH SYNDROME, AN ACRONYM WHICH STANDS FOR CORPUS
CC CALLOSUM HYPOPLASIA, PSYCHOMOTOR RETARDATION, ADDUCTED THUMBS,
CC SPASTIC PARAPARESIS, AND HYDROCEPHALUS.
CC -!- DISEASE: DEFECTS IN L1CAM ARE THE CAUSE OF HIRSCHSPRUNG DISEASE
CC (HSCR).
CC -!- SIMILARITY: CONTAINS 6 IMMUNOGLOBULIN-LIKE C2-TYPE DOMAINS.
CC -!- SIMILARITY: CONTAINS 5 FIBRONECTIN TYPE III-LIKE DOMAINS.
CC -!- DATABASE: NAME=L1CAM; NOTE=L1CAM mutation Web Page;
CC WWW="http://hgins.uia.ac.be/dnalab/11".
CC
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).

DR EMBL; X59847; CAA42508.1;

DR EMBL; Z29373; CAA82564.1; 

DR EMBL; M74387; AAA59476.1; 

DR EMBL; X58775; CAA41576.1; 

Query Match 11.8%; Score 268; DB 1; Length 1257;

Best Local Similarity 23.8%; Pred. No. 5.7e-09; 

Matches 97; Conservative 46; Mismatches 143; Indels 122; Gaps 

Qy 6 GRTVTFPCETKGNPQPAVFWQKEGSQNLLFPNQPQQPNSRCSVSPTGDLTITNIQRSDAG 65

I I I: <:l II! I I: I I : : : : I I ::|:| II 

Db 347 GETARLDCQVQGRPQPEVTWRING IPVEELAKDQKYRIQ-RGALILSNVQPSDTM 400 

Qy 66 YYICQALTVAGSILAKAQLEVTDVLTDRPPPIILQGPANQTLAVDG-TALLKCKATGDPL 124 

hi } :H I : I : I II =11 I II I III I h 
Db 401 VTQCEARNRHGLLLANAYIYWQL PAKILTADNQTYMAVQGSTAYLLCKAFGAPV 455 
Qy 125 PVISWLKEGFTFPGRDPRATIQEQGTLQIKNLRISDTGTYTCVATSSSGEASWSAVLDV- 183

I : II I I :l I III |::|: :lll I I : I : : I I I 
Db 456 PSVQWLDEDGTTVLQDERFFPYMGTLGIRDLQANDTGRYFCIJ\ANDQNMIMAEjKVK 515 

Qy 184 TESGA 188 

I I 

Db 516 DATQITQGPRSTIEKKGSRVTFTCQASFDPSIflPSITWRGDGRDLOElGDSDKYFIEDGR 575 

Qy 189 TISKNYDLSD LPGPPSKPQVTD---VTKNSVTLSW 220

: : I II III : ::f :|:: I :|| 

Db 576 LVIHSLDYSDQGNYSCVASTELDWESRAQLLWGSPGPVPRLVLSDLHLLTQSQVRVSW 635 

Qy 221 QPGTPGTLPASAYIIEAFSQSVS-NSWQTV----ANHVKTTLYTVRGLRPNTIYLFMVRA 275

I I I || : :: I :: I 111 I I I I I I
Db 636 SPAEDHNAPIEKYDIEFEDKEMAPEKWYSLGKVPGNQTSTTL----KLSPYVHYTFRVTA 691

Qy 276 IN PKVSVTQXKPQKN NGSTWANVPLPPPPVQ 306

II I : |:|| I: |: : I j 
Db 692 INKYGPGEPSPVSETWTPEAAPEKNPVDVKGEGNETTNMVITWKPIl 739 

RESULT 10
NEO1_HUMAN
ID NEO1_HUMAN STANDARD; PRT; 1461 AA.
AC Q92859; O00340;
DT 01-OCT-2000 (Rel. 40, Created)
DT 01-OCT-2000 (Rel. 40, Last sequence update)
DT 01-OCT-2000 (Rel. 40, Last annotation update)
DE NEOGENIN PRECURSOR.
GN NEO1 OR NGN.
OS Homo sapiens (Human).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
RN [1]
RP SEQUENCE FROM N.A. (ISOFORMS 1 AND 2).
RC TISSUE=FETAL BRAIN;
RX MEDLINE=97236653; PubMed=9121761;
RA Meyerhardt J.A., Look A.T., Bigner S.H., Fearon E.R.;
RT "Identification and characterization of neogenin, a DCC-related
RT gene.";
RL Oncogene 14:1129-1136(1997).
RN [2]
RP SEQUENCE FROM N.A. (ISOFORMS 1 AND 2).
RC TISSUE=FETAL BRAIN;
RX MEDLINE=97312699; PubMed=9169140;
RA Vielmetter J., Chen X.-N., Miskevich F., Lade R.P., Yamakawa K.,
RA Korenberg J.R., Dreyer W.J.;
RT "Molecular characterization of human neogenin, a DCC-related protein,
RT and the mapping of its gene (NEO1) to chromosomal position 15q22.3-
RT q23.";
RL Genomics 41:414-421(1997).

CC -!- FUNCTION: MAY BE INVOLVED AS A REGULATORY PROTEIN IN THE 
CC TRANSITION OF UNDIFFERENTIATED PROLIFERATING CELLS TO THEIR 
CC DIFFERENTIATED STATE. MAY ALSO FUNCTION AS A CELL ADHESION 
CC MOLECULE IN A BROAD SPECTRUM OF EMBRYONIC AND ADULT TISSUES. 

CC -!- SUBCELLULAR LOCATION: TYPE I INTEGRAL MEMBRANE PROTEIN. 

CC -!- ALTERNATIVE PRODUCTS: AT LEAST 2 ISOFORMS; 1 (SHOWN HERE) AND 2;
CC ARE PRODUCED BY ALTERNATIVE SPLICING.
CC -!- TISSUE SPECIFICITY: WIDELY EXPRESSED AND ALSO IN CANCER CELL
CC LINES.
CC -!- SIMILARITY: CONTAINS 4 IMMUNOGLOBULIN-LIKE C2-TYPE DOMAINS.
CC -!- SIMILARITY: CONTAINS 6 FIBRONECTIN TYPE III-LIKE DOMAINS.
CC -!- SIMILARITY: BELONGS TO THE IMMUNOGLOBULIN SUPERFAMILY. STRONG, TO
CC TUMOR SUPPRESSOR PROTEIN DCC.
CC
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC
DR EMBL; U61262; AAB17263.1; -.
DR EMBL; U72391; AAC51287.1; -.
DR MIM; 601907; -.
DR HSSP; P02751; 1TTG.
DR INTERPRO; IPR001777; -.
DR INTERPRO; IPR003006; -.
DR PFAM; PF00041; fn3; 6.
DR PFAM; PF00047; ig; 4.
DR PRINTS; PR00014; FNTYPEIII.
KW Transmembrane; Immunoglobulin domain; Glycoprotein; Signal;
KW Alternative splicing.
FT   CONFLICT    168    168       G -> N (IN REF. 2).
SQ   SEQUENCE   1461 AA;  159958 MW;  7AAE897E69635A21 CRC64;



Query Match 11.7%; Score 267.5; DB 1; Length 1461;

Best Local Similarity 24.6%; Pred. No. 7.4e-09; 

Matches 86; Conservative 50; Mismatches 160; Indels 53; Gaps 

Qy 6 GRTVTFPCETKGNPQPAVFWQK EGSQNLLFPNQPQQPNSRCSVSPTGDLTITN 58 

1:111111:11 11:1: : III:: 

Db 263 GQDWLPCVASGLPTPTIKWMKNEEALDTESSERLV LLAGGSLEISD 309 

Qy 59 IQRSDAGYYICQALTVAGSILAKAQLEVTDVLTDRPPPIILQGPANQTLAVDGTALLKCK 118 

: III I 'I I :l 1:1:1 I : I Ml : :|: 

Db 310 VTEDDAGTYFCIADNGNETIEAQAELTV QAQPEFLKQPTNIYAHESMDIVFECE 363 

Qy 119 ATGDPLPVISWLKEGFTFPGRDPRATIQEQGTLQIKNLRISDTGTYTCVATSSSGEASWS 178 

INI: i:l I I ::| II: I Ml I hi : I I 
Db 364 VTGKPTPTTOIVKNGDMVIPSDYFKIVKEH-NLQVLGLVKSDEGFYQCIAENDVGNAQAG 422 

Qy 179 AVLDVTESGATISKNYDLSDLPGPPSKPQVTDVTKNSVTLSWQPGTPGTLPAS- • -AYII 235 

11:1 = III : I: : hh II : I I : 
Db 423 AQLIILEHAPATT GPLPSAPRDWASLVSTRFIKLTWR--TPASDPHGDNLTYSV 475 

Qy 236 EAFSQSVSNSWQTVANHVKTTLYTVRGLRPNTIYLFMVRAINPKVSVTQXKPQKNNGSTW 295 

: :: • :| |:: I I |:|:| I I I I I : 
Db 476 FYTKEGIARERVENTSHPGEMQVTIQNLMPATVYIFRVMAQNKHGSGESSAPLRVETQPE 535 

Qy 296 ANVPLPPPPVQ PLPGT ■ ELEHYAVEQQENG YDSD 328 

:| I I :: |: I |:::| : III: 

Db 536 VQLPGPAPNLRAYAASPTSITVTWETPVSGNGEIQNYKLYYMEKGTDKE 584 
RESULT 11
CAML_MOUSE
ID CAML_MOUSE STANDARD; PRT; 1260 AA.
AC P11627;
DT 01-OCT-1989 (Rel. 12, Created)
DT 01-OCT-1989 (Rel. 12, Last sequence update)
DT 01-OCT-2000 (Rel. 40, Last annotation update)
DE NEURAL CELL ADHESION MOLECULE L1 PRECURSOR (N-CAM L1).
GN L1CAM OR CAML1.
OS Mus musculus (Mouse).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
RN [1]
RP SEQUENCE FROM N.A., AND PARTIAL SEQUENCE.
RC TISSUE=BRAIN;
RX MEDLINE=88318924; PubMed=3412448;
RA Moos M., Tacke R., Scherer H., Teplow D., Frueh K., Schachner M.;
RT "Neural adhesion molecule L1 as a member of the immunoglobulin
RT superfamily with binding domains similar to fibronectin.";
RL Nature 334:701-703(1988).
CC -!- FUNCTION: CELL ADHESION MOLECULE WITH AN IMPORTANT ROLE IN THE
CC DEVELOPMENT OF THE NERVOUS SYSTEM. INVOLVED IN NEURON-NEURON
CC ADHESION, NEURITE FASCICULATION, OUTGROWTH OF NEURITES, ETC. BINDS
CC TO AXONIN ON NEURONS.
CC -!- SUBCELLULAR LOCATION: TYPE I MEMBRANE PROTEIN.
CC -!- ALTERNATIVE PRODUCTS: TWO ISOFORMS IN THE CYTOPLASMIC REGION ARE
CC PRODUCED BY DIFFERENTIAL SPLICING (BY SIMILARITY).
CC -!- SIMILARITY: CONTAINS 6 IMMUNOGLOBULIN-LIKE C2-TYPE DOMAINS.
CC -!- SIMILARITY: CONTAINS 5 FIBRONECTIN TYPE III-LIKE DOMAINS.
CC
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC
DR EMBL; X12875; CAA31368.1; -.
DR PIR; S05479; S05479.
DR HSSP; P20241; 1CFB.
DR MGD; MGI:96721; L1CAM.
DR INTERPRO; IPR001777; -.
DR INTERPRO; IPR003006; -.
DR PFAM; PF00041; fn3; 4.
DR PFAM; PF00047; ig; 6.
DR PRINTS; PR00014; FNTYPEIII.
KW Cell adhesion; Glycoprotein; Transmembrane; Repeat; Brain;
KW Immunoglobulin domain; Signal; Alternative splicing.
N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    293    293       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    432    432       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    478    478       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    489    489       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    504    504       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    587    587       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    670    670       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    725    725       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    776    776       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    824    824       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    848    848       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    875    875       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    968    968       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    978    978       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD   1022   1022       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD   1030   1030       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD   1073   1073       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD   1107   1107       N-LINKED (GLCNAC...) (POTENTIAL).
FT   VARSPLIC   1180   1183       MISSING (IN SHORT ISOFORM)
FT                                (BY SIMILARITY).
SQ   SEQUENCE   1260 AA;  140968 MW;  22BE57001CB2A538 CRC64;



Query Match 11.6%; Score 265.5; DB 1; Length 1260;
Best Local Similarity 21.3%; Pred. No. 8.1e-09;

Matches 122; Conservative 70; Mismatches 197; Indels 183; Gaps 19; 

Qy 6 GRTVTFPCETKGNPQPAVFWQKEGSQNLLFPNQPQQPNSRCSVSPTGDLTITNIQRSDAG 65 

I I I: :| III : I: I :: I: |: I I ::|:| :| 

Db 346 GET ARLDCQVQG RPQPEI T WRI NG -MSMET VNKDQK YR I E QGSLILSNVQPTDTM 399 

Qy 66 YYICQALTVAGSILAKAQLEVTDVLTDRPPPIILQGPANQT-LAVDG-TALLKCKATGDP 123 

hi 1:11 I : I : I I: : III :ll:| II MM I I 
Db 400 VTQCEARNQH3LLLANAYI YWQL — PARILTKD - - NQT YMAVEGSTAYLLCKAFGAP 453 

Qy 124 LPVISWLKEGFTFPGRDPRATIQEQGTLQIKNLRISDTGTYTCVATSSSGEASWSAVLDV 183 

=1 : II I I :| I III l::|: :||| III: : I I I 
Db 454 VPSVQWLDEE3TTVLQDERFFPYANGTLSIRDLQANDTGRYFCQAANDQNNVTILANLQV 513 

Qy 184 TES G 187 

I: I 
Db 514 KEATQITQGPRSAIEKKGARVTFTCQASFDPSLQASITWRGDGRDLQERGDSDKYFIEDG 573 

Qy 188 ATISKNYDLSD LPGPPSKPQVTD- - -VTKNSVTLS 219 

: :: I II III :::| : :: I II 

Db 574 KLVIQSLDYSDQGNYSCVASTELDEVESRAQLLWGSPGPVPHLELSDRHLLKQSQVHLS 633 

Qy 220 WQPGTPGTLPASAYIIEAFSQSVS-NSW QTVANHVKTTLYTVRGLRPNTIYLFMVR 274 

II hill::: : I III Mill 

Db 634 WSPAEDHNSPIEKYDIEFEDKEMAPEKWFSLGKVPGNQTSTTL — KLSPYVHYTFRVT 689 

Qy 275 AIN — -PKVSVTQXKPQKN NGSTWANVPLPPPPVQPLPGTELE - HY 316 

III : I : |:|| I: I: : h: : : I 

Db 690 AINKYGPGEP3PVSESWTPEAAPEKNPVDVRGEGNETNNMVITWKPLRWMDWNAPQIQY 749 

Qy 317 AVEQQENGYD SDSWCPPLPVQTYLHQGLEDELEEDDDRVPTPPVRGVASSPA 368 

I: : I • II : |:: ;: : : : | | 
Db 750 RVQWRPQGKQETWRKQTVSDPFLWSNTSTFVPYEIKVQAVNNQGKGPEP 799 

Qy 369 ISFGQQSTATLTPSPREEMQPMLQASPXFTSSQ--RPRPT 406 

I I ; : :: I I: I II III 
Db 800 QVTIGYSGEDYPQVSPELEDITIFNSSTVLVRWRPVDLAQVKGHLKGYPTYWWK 854 

Qy 407 - - -SPFSTDSNTSAALSQSQRP 425 

: I :||::|: II 
Db 855 GSQRKHSKRHIHKSHIWPANTTSAILSGLRP 886 



RESULT 12
CAML_RAT
ID CAML_RAT STANDARD; PRT; 1259 AA.
AC Q05695;
DT 01-FEB-1994 (Rel. 28, Created)
DT 01-OCT-1994 (Rel. 30, Last sequence update)
DT 01-OCT-2000 (Rel. 40, Last annotation update)
DE NEURAL CELL ADHESION MOLECULE L1 PRECURSOR (N-CAM L1).
GN L1CAM OR CAML1.






OS Rattus norvegicus (Rat).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Rattus.
RN [1]
RP SEQUENCE FROM N.A.
RX MEDLINE=91372414; PubMed=1894011;
RA Miura M., Kobayashi M., Asou H., Uyemura K.;
RT "Molecular cloning of cDNA encoding the rat neural cell adhesion
RT molecule L1. Two L1 isoforms in the cytoplasmic region are produced
RT by differential splicing.";
RL FEBS Lett. 289:91-95(1991).
CC -!- FUNCTION: CELL ADHESION MOLECULE WITH AN IMPORTANT ROLE IN THE
CC DEVELOPMENT OF THE NERVOUS SYSTEM. INVOLVED IN NEURON-NEURON
CC ADHESION, NEURITE FASCICULATION, OUTGROWTH OF NEURITES, ETC. BINDS
CC TO AXONIN ON NEURONS.
CC -!- SUBCELLULAR LOCATION: TYPE I MEMBRANE PROTEIN.
CC -!- ALTERNATIVE PRODUCTS: TWO ISOFORMS IN THE CYTOPLASMIC REGION ARE
CC PRODUCED BY DIFFERENTIAL SPLICING.
CC -!- TISSUE SPECIFICITY: THE SHORTER ISOFORM IS PREDOMINANTLY FOUND IN
CC THE BRAIN, WHILE THE LONGER ISOFORM IS FOUND IN THE PERIPHERAL
CC NERVOUS SYSTEM.
CC -!- SIMILARITY: CONTAINS 6 IMMUNOGLOBULIN-LIKE C2-TYPE DOMAINS.
CC -!- SIMILARITY: CONTAINS 5 FIBRONECTIN TYPE III-LIKE DOMAINS.
CC
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC
DR EMBL; X59149; CAA41860.1; -.
DR PIR; S17655; S17655.
DR HSSP; P20241; 1CFB.
DR INTERPRO; IPR001777; -.
DR INTERPRO; IPR003006; -.
DR PFAM; PF00041; fn3; 4.
DR PFAM; PF00047; ig; 6.
DR PRINTS; PR00014; FNTYPEIII.
KW Cell adhesion; Glycoprotein; Transmembrane; Repeat; Brain;
KW Immunoglobulin domain; Signal; Alternative splicing.
FT   SIGNAL        1     19       BY SIMILARITY.
FT   CHAIN        20   1259       NEURAL CELL ADHESION MOLECULE L1.
FT   DOMAIN       20   1122       EXTRACELLULAR (POTENTIAL).
FT   TRANSMEM   1123   1145       POTENTIAL.
FT   DOMAIN     1146   1259       CYTOPLASMIC (POTENTIAL).
FT   DOMAIN       50    120       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      150    215       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      256    318       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      346    410       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      440    503       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      531    599       IG-LIKE C2-TYPE DOMAIN.
FT   DOMAIN      827    896       FIBRONECTIN TYPE-III.
FT   DOMAIN      932    994       FIBRONECTIN TYPE-III.
FT   DOMAIN     1032   1093       FIBRONECTIN TYPE-III.
FT   SITE        553    555       CELL ATTACHMENT SITE (POTENTIAL).
FT   SITE        562    564       CELL ATTACHMENT SITE (POTENTIAL).
FT   CARBOHYD    100    100       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    202    202       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    246    246       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    293    293       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    432    432       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    489    489       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    504    504       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    670    670       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    725    725       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    776    776       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    824    824       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    848    848       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    875    875       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    968    968       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    978    978       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD   1021   1021       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD   1029   1029       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD   1072   1072       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD   1106   1106       N-LINKED (GLCNAC...) (POTENTIAL).
FT   VARSPLIC   1179   1182       MISSING (IN SHORT ISOFORM).
SQ   SEQUENCE   1259 AA;  140934 MW;  19681B022D8F24AB CRC64;

Query Match 11.5%; Score 261.5; DB 1; Length 1259;
Best Local Similarity 21.8%; Pred. No. 1.4e-08;
Matches 125; Conservative 65; Mismatches 198; Indels 185; Gaps

Qy 6 GRTVTFPCETKGNPQPAVFWQKEGSQNLLFPNQPQQPNSRCSVSPTGDLTITNIQRSDAG 65
1 I I: :l III 1 |: 1 :: |: |: II ::|:| II
Db 346 GETARLDCQVQGRPQPEVTWRING-MSIEKVNKDQKYRIE-----QGSLILSNVQPSDTM 399

Qy 66 YYICQALTVAGSILAKAQLEVTDVLTDRPPPIILQGPANQT-LAVDG-TALLKCKATGDP 123
1:1 1 :|| 1 : 1 : II:: III :lhl II 1 III 1 1
Db 400 VTQCEARNQHGLLLANAYIYWQL----PARILTKD--NQTYMAVEGSTAYLLCKAFGAP 453

Qy 124 LPVISWLKEGFTFPGRDPRATIQEQGTLQIKNLRISDTGTYTCVATSSSGEASWSAVLDV 183
:| : II M :| 1 II |::|: :lll III: : 1 1 1
Db 454 VPSVQWLDEEGTTVLQDERFFPYANGHLGIRDLQANDTGRYFCQAANDQNNVTILANLQV 513

Qy 184 TES--------------------------------------------------------G 187
1: > 1
Db 514 KEATQITQGPRSTIEKKGARVTFTCQASFDPSLQASITWRGDGRDLQERGDSDKYFIEDG 573

Qy 188 ATISKNYDLSD-------------------------LPGPPSKPQVTD---VTKNSVTLS 219
: 1: 1 II III :::l : :: 1 II
Db 574 QLVIKSLDYSDQGDYSCVASTELDEVESRAQLLWGSPGPVPHLELSDRHLLKQSQVHLS 633

Qy 220 WQPGTPGTLPASAYIIEAFSQSVS-NSW----QTVANHVKTTLYTVRGLRPNTIYLFMVR 274
II | | || : :: 1 : 1 III 1 1 1 1 1
Db 634 WSPAEDHNSPIEKYDIEFEDKEMAPEKWFSLGKVPGNQTSTTL----KLSPYVHYTFRVT 689

Qy 275 AIN PKVSVTQXKPQKN NGSTWANVPLPPPPVQPLPGTELE - HY 316
III 1 : |:|| |: |: : |:: : : 1
Db 690 AINKYGPGEPSPVSETWTPEAAPEKNPVDVRGEGNETNNMVITWKPLRWMDWNAPQIQY 749

Qy 317 AVEQQENGYDSDSWCPPLPVQTYLHQGLED ELEEDDDRVPTPPVRGVASSP 367
1: •■■ 1 1 :|: 1 : 1 : =1 :l 1
Db 750 RVQ --WRPLGKQETWKEQTVSDPFLWSNTSTFVPYEIKVQAVNNQGKGPEP 799

Qy 368 AISFGQQSTATLTPSPREEMQPMLQASPXFTSSQ---RPRPT 406
:: 1 :: | |: | || 1 II
Db 800 QVTIGYSG EDYPQVSPELEDITIFNSSTVLVRWRPVDLAQVKGHLRGYNVTYWW 853

Qy 407 SPFSTDSNTSAALSQSQRP 425
" 1 :||::|: II
Db 854 KGSQRKHSKRHVHKSHMWPANTTSAILSGLRP 886

RESULT 13
DCC_MOUSE
ID DCC_MOUSE STANDARD; PRT; 1447 AA.
AC P70211;
DT 01-NOV-1997 (Rel. 35, Created)
DT 01-NOV-1997 (Rel. 35, Last sequence update)
DT 30-MAY-2000 (Rel. 39, Last annotation update)
DE TUMOR SUPPRESSOR PROTEIN DCC PRECURSOR.
GN DCC.
OS Mus musculus (Mouse).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=BALB/C; TISSUE=BRAIN;
RX MEDLINE=96112625; PubMed=8570174;
RA Cooper H.M., Armes P., Britto J., Gad J., Wilks A.F.;
RT "Cloning of the mouse homologue of the deleted in colorectal cancer
RT gene (mDCC) and its expression in the developing mouse embryo.";
RL Oncogene 11:2243-2254(1995).



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RN [2] 

RP REVISIONS. 

RC STRAIN=BALB/C; TISSUE-BRAIN; 

RA Cooper H.Mi; 

RL Submitted (JUN-1996) to the EMBL/GenBank/DDBJ databases. 

CC -!- FUNCTION: IMPLICATED AS A TUMOR SUPPRESSOR GENE, 

CC -!• SUBCELLULAR LOCATION: TYPE I MEMBRANE PROTEIN, 

CC •!■ ALTERNATIVE PRODUCTS: TWO FORMS OP THE PROTEIN ARE PRODUCED FROM 

CC THE SAME GENE BY THE USE OF ALTERNATIVE INITIATION SITES. A THIRD 

CC FORM WHICH IS EXPRESSED ONLY IN THE EMBRYO IS PRODUCED BY 

CC ALTERNATIVE SPLICING. 

CC -I- TISSUE. SPECIFICITY: IN THE EMBRYO, EXPRESSED AT HIGH LEVELS IN THE 
CC DEVELOPING BRAIN AND NEURAL TUBE. IN ADULT, HIGHLY EXPRESSED IN 
CC BRAIN WITH VERY LOW LEVELS FOUND IN TESTIS, HEART AND THYMUS. 

CC -!- DEVELOPMENTAL STAGE: LOW LEVELS IN EARLY GESTATION. HIGHEST LEVELS 
CC EXPRESSED DURING MID GESTATION. LEVELS DECREASE IN LATE GESTATION 
CC AND REMAIN AT THIS LEVEL IN THE ADULT. 

CC -!- SIMILARITY: CONTAINS 4 IMMUNOGLOBULIN-LIKE C2-TYPE DOMAINS.
CC -!- SIMILARITY: CONTAINS 6 FIBRONECTIN TYPE III-LIKE DOMAINS.

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation - 

CC the European Bioinformatics Institute. There are no restrictions on its 

CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 

CC or send an email to license@isb-sib.ch).

DR EMBL; X85788; CA$9786.1; -. 

DR HSSP; P56276; 1TLK. 

DR MGD; MGI:94869; DCC.

DR INTERPRO; IPR001777; -.

DR INTERPRO; IPR003006; -.

DR PFAM; PF00041; fn3; 6.

DR PFAM; PF00047; ig; 4.

DR PRINTS; PR00014; FNTYPEIII.

KW Glycoprotein; Immunoglobulin domain; Transmembrane; Signal;

KW Anti-oncogene; Alternative initiation; Alternative splicing.



FT SIGNAL        1     25  POTENTIAL.
FT CHAIN        26   1447  TUMOR SUPPRESSOR PROTEIN DCC, LONG
FT                         ISOFORM.
FT CHAIN        85   1447  TUMOR SUPPRESSOR PROTEIN DCC, SHORT
FT                         ISOFORM.
FT INIT_MET     85     85  FOR SHORT ISOFORM.
FT DOMAIN       26   1097  EXTRACELLULAR (POTENTIAL).
FT TRANSMEM   1098   1122  POTENTIAL.
FT DOMAIN     1123   1447  CYTOPLASMIC (POTENTIAL).
FT DOMAIN       54    124  IG-LIKE C2-TYPE DOMAIN.
FT DOMAIN      154    219  IG-LIKE C2-TYPE DOMAIN.
FT DOMAIN      254    317  IG-LIKE C2-TYPE DOMAIN.
FT DOMAIN      345    407  IG-LIKE C2-TYPE DOMAIN.
FT DOMAIN      426    522  FIBRONECTIN TYPE-III.
FT DOMAIN      525    618  FIBRONECTIN TYPE-III.
FT DOMAIN      619    716  FIBRONECTIN TYPE-III.
FT DOMAIN      722    816  FIBRONECTIN TYPE-III.
FT DOMAIN      840    940  FIBRONECTIN TYPE-III.
FT DOMAIN      941   1042  FIBRONECTIN TYPE-III.
FT DISULFID     61    117  BY SIMILARITY.
FT DISULFID    161    212  BY SIMILARITY.
FT DISULFID    261    310  BY SIMILARITY.
FT DISULFID    352    400  BY SIMILARITY.
FT CARBOHYD     60     60  N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD     94     94  N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD    299    299  N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD    318    318  N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD    478    478  N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD    628    628  N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD    702    702  N-LINKED (GLCNAC...) (POTENTIAL).
FT VARSPLIC    819    838  MISSING (IN EMBRYONIC ISOFORM).
SQ SEQUENCE 1447 AA; 158298 MW; 0D1F1097C22D5B9F CRC64;


Query Match 11.4%; Score 259; DB 1; Length 1447;

Best Local Similarity 25.4%; Pred. No. 2.4e-08;

Matches 115; Conservative 44; Mismatches 160; Indels 134; Gaps


Qy 6 GRTVTFPCETKGNPQPAVFWQKEGSQNLLFPNQPQQPNSRCSVSPTGDLTITNIQRSDAG 65
1 II ll-l 1 1 : III 1:1 1 ill 1 |:| 1 : :| |:|
Db 154 GDTVLLKCEVIGEPMPTIHWQK-NQQDL----NPLPGDSRVWLPSGALQISRLQPGDSG 208

Qy 66 YYICQALTVAGSILAKAQLEVTDVLTDRPPPI ILQGPANQTLAVDG-TALLKCKA 119
III I II : II :hl 1 : II hi :|::| |:|:|
Db 209 VYRCSARNPA-SIRTGNEAEVR-ILSD--PGLHRQLYFLQRPSN-VIAIEGKDAVLECCV 263

Qy 120 TGDPLPVISWLKEGFTFPGRDPRATIQEQGTLQIKNLRISDTGTYTCVATSSSGEASWSA 179
: 1 1 :lh 1 : :: III: hllllll 1 : 1 II
Db 264 SGYPPPSFTWLRGEEVIQLRSKKYSLLGGSNLLISNVTDDDSGTYTCWTYKNENISASA 323

Qy 180 VLDV TESGAII SKNYDL 196
II : II : II !:
Db 324 ELTVLVPPWFLNHPSNLYAYESMDIEFECAVSGKPVPTVNWMKNGDWIPSDYFQIVGGS 383

Qy 197 - SDLPGPPSKPQVTDVTKN 214
1 II 1 I:
Db 384 NLRILGWKSDEGFYQCVAENEAGNAQSSAQLIVPKPAIPSSSILPSAPRDVLPVLVSSR 443

Qy 215 SVTLSWQPGTPGTLPASAY - I IEAFSQSVSNSWQTVANHVKTT LYTVRGLRPNTI 268
1 1 1 1 • 1 III 1 • I • I < 1 1 1 1 i • i
Db 444 FVRLSWRP PAEAKGNIQTFTVFFSREGDNRERALNTTQPGSLQLTVGNLKPEAM 497

Qy 269 YLFMVRAINPKVSVTQXKPQKNNGSTWANVPLPPPPVQPLPGTELEHYAVEQQENGYDSD 328
Mill : I : : : | :|| :||
Db 498 YTFRWAYN-r EWGPGESSQPIKVATQPELQVPGPVENLHAVSTSPTSI-LI 546

Qy 329 SWCPPL----PVQTY------LHQGLEDELEED 351
:| II III 1 : 1 1 :| 1
Db 547 TWEPPAYANGPVQGYRLFCTEVSTGKEQNIEVD 579





RESULT 14 
NRG_DROME 

ID NRG_DROME STANDARD; PRT; 1239 AA.

AC P20241; Q24414;

DT 01-FEB-1991 (Rel. 17, Created)

DT 01-FEB-1991 (Rel. 17, Last sequence update)

DT 01-OCT-2000 (Rel. 40, Last annotation update)

DE NEUROGLIAN PRECURSOR.

GN NRG.

OS Drosophila melanogaster (Fruit fly).

OC Eukaryota; Metazoa; Arthropoda; Tracheata; Hexapoda; Insecta; 

OC Pterygota; Neoptera; Endopterygota; Diptera; Brachycera; Muscomorpha; 

OC Ephydroidea; Drosophilidae; Drosophila.

RN [1]

RP SEQUENCE FROM N.A., AND SEQUENCE OF 24-41 AND 737-751.

RX MEDLINE=90030418; PubMed=2805067;

RA Bieber A.J., Snow P.M., Hortsch M., Patel N.H., Jacobs J.R.,

RA Traquina Z.R., Schilling J., Goodman C.S.;

RT "Drosophila neuroglian: a member of the immunoglobulin superfamily

RT with extensive homology to the vertebrate neural adhesion molecule

RT L1.";

RL Cell 59:447-460(1989). 

RN [2] 

RP SEQUENCE OF 1182-1239 FROM N.A.

RX MEDLINE=90262720; PubMed=1693086;

RA Hortsch M., Bieber A.J., Patel N.H., Goodman C.S.;

RT "Differential splicing generates a nervous system-specific form of

RT Drosophila neuroglian";

RL Neuron 4:697-709(1990). 

RN [3] 

RP X-RAY CRYSTALLOGRAPHY (2.0 ANGSTROMS) OF 610-814.

RX MEDLINE=94213741; PubMed=7512815;

RA Huber A.H., Wang Y.-M.E., Bieber A.J., Bjorkman P.J.;

RT "Crystal structure of tandem type III fibronectin domains from

RT Drosophila neuroglian at 2.0 A.";

RL Neuron 12:717-731(1994).

CC -!- FUNCTION: THIS PROTEIN MAY PLAY A ROLE IN NEURAL AND GLIAL CELL
CC ADHESION IN THE DEVELOPING DROSOPHILA EMBRYO.

CC -!- SUBCELLULAR LOCATION: TYPE I MEMBRANE PROTEIN. 

CC -!- ALTERNATIVE PRODUCTS: 2 ISOFORMS; A LONG FORM (SHOWN HERE) AND A 
CC SHORT FORM; ARE PRODUCED BY ALTERNATIVE SPLICING. 

CC -!- TISSUE SPECIFICITY: NEURONS AND GLIA IN THE DEVELOPING NERVOUS 
CC SYSTEM AND ON SOME OTHER NONNEURONAL TISSUES. 

CC -!- SIMILARITY: CONTAINS 6 IMMUNOGLOBULIN-LIKE C2-TYPE DOMAINS.

CC -!- SIMILARITY: CONTAINS 5 FIBRONECTIN TYPE III-LIKE DOMAINS.
CC

CC This SWISS-PROT entry is copyright. It is produced through a collaboration

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).

CC

DR EMBL; M28231; AAA28728.1; ALT_SEQ.

DR EMBL; X76243; CAA53822.1; -.

DR PIR; A32579; A32579.
DR PDB; 1CFB; 30-NOV-94.
DR FLYBASE; FBqnOOC'2963; Nrg.
DR INTERPRO; IPR001777; -.

DR INTERPRO; IPR003006; -.

DR PFAM; PF00041; fn3; 5.

DR PFAM; PF00047; ig; 6.

KW Cell adhesion; Glycoprotein; Transmembrane; Repeat; 3D-structure;
KW Immunoglobulin domain; Signal; Embryo; Alternative splicing.

FT SIGNAL 1 23 J 

FT CHAIN 24 1239 NEUROGLIAN.

24 1138 EXTRACELLULAR (POTENTIAL).

1139 1154 POTENTIAL.

1155 1239 CYTOPLASMIC (POTENTIAL).

53 123 IG-L1 " 

224 IG-LJ 

329 IG-LI 

422 I6-I 

512 IG-Lf 

606 IG-LM 

690 FIBRG 

792 FIBRG 

896 FIBR(p 
FIBRC 



FT DOMAIN 

FT TfcANSMEM 

FT DOMAIN 

FT DOMAIN 

FT DOMAIN 

FT DOMAIN 

FT DOMAIN 

FT DOMAIN 

FT DOMAIN 

FT DOMAIN 

FT DOMAIN 

FT DOMAIN 

FT DOMAIN 

FT DOMAIN 

FT DISULFID 

FT DISULFID 

FT CARBOHYD 

FT CARBOHYD 

FT CARBOHYD 
CARBOHYD 
CARBOHYD 
CARBOHYD 

FT CARBOHYD 



149 
262 
354 
447 
536 
629 
729 
832 
932 
1024 
59 
625 
182 
198 
411 
448 
652 
683 
821 



3 C2-TYPE DOMAIN 
3 C2-TYPE DOMAIN; 
3 C2-TYPE DOMAIN! 
3 C2-TYPE DOMAIN! 
3 C2-TYPE DOMAIN, 
3 C2-TYPE DOMAIN, 
HECTIN TYPE-IIL 
3CTIN TYPE-IIL 
SCTIN TYPE-IIL 
ECTIN TYPE-IIL 
FIBRONECTIN TYPE-IIL 
POTENTIAL. 



FT CARBOHYD 1125 1125 
FT CONFLICT 1234 1234 
FT CONFLICT 1237 1237 



N-LINKED (GLCNAC. 
N- LINKED (GLCNAC, 
N-LINKED (GLCNAC. 
N-LINKED (GLCNAC. 
N-LINKED (GLCNAC. 
N-LINKED (GLCNAC. 
N-LINKED (GLCNAC. 
N-LINKED (GLCNAC. 
T -> Y (IN REF. 2), 
L -> K (IN REF. 2). 



.)' POTENTIAL). 

.) (POTENTIAL). 

.) (POTENTIAL). 

.) (POTENTIAL). 

.) (POTENTIAL). 

.) (POTENTIAL). 

.) (POTENTIAL). 

.) (POTENTIAL). 



SQ SEQUENCE 1239 AA; 138284 MW; 49E12692D0DD027D CRC64; 



Query Match 11.3%; Score 258; DB 1; Length 1239; 

Best Local Similarity 21.6%; Pred. No. 2.2e-08; 

Matches 95; Conservative 51; Mismatches 177; Indels 116; Gaps 14;
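
The three statistics lines above can be cross-checked against one another. The following is a minimal Python sketch, assuming (this is inferred from the figures printed in this report, not from any GenCore documentation) that "Best Local Similarity" is the number of identical matches divided by the total number of alignment columns, i.e. matches + conservative + mismatches + indels. The function name is hypothetical.

```python
def best_local_similarity(matches, conservative, mismatches, indels):
    """Percent of alignment columns that are identical matches.

    Assumed definition: inferred from the numbers in this report,
    not taken from the search program's documentation.
    """
    columns = matches + conservative + mismatches + indels
    return round(100.0 * matches / columns, 1)


# Counts taken from RESULT 14 (NRG_DROME) above.
print(best_local_similarity(95, 51, 177, 116))
```

Under this assumption the value computed for RESULT 14 agrees with the 21.6% printed in the report.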

Qy 4 AQGRTVTFPCETKGNPQPAVFWQKEGSQNLLFPNQPQQPNSRCSVSPTGDLTITNIQRSD 63 

I: III Ihhl I I : II I :|: : I I: : I 
Db 351 AEDEEWFECRAAGVPEPKISWIHNGK PIEQSTPNPRRTVT-DNTIRIINLVKGD 404 

Qy 64 AGYYICQALTVAGSILAKAQLEVTDVLTDRPPPIILQGPANQTLAVDG-TALLKCKATGD 122 

I I I I I : II llhll III HI: I 

Db 405 TGNYGCNATNSLGYVYKDVYLNV QAEPPTISEAPA-AVSTVDGRNVTIKCRVNGS 458 

Qy 123 PLPVISWLKEGFTFPGRDPRATIQEQGTLQIKNLRISDTGTYTCVATSSSGEASWSAVLD 182 
I I:: II: I I :| I |:|::: II I III I : II I 



Db 459 PKPLVKWLRASNWLTG--GRYNVQANGDLEIQDVTFSDAGKYTCYAQNKFGEIQADGSLV 516 

Qy 183 VTE t 185 

I I 

Db 517 VKEHTRITQEPQNYEVAAGQSATFRCNEAHDDTLEIEIDWWKDGQSIDFEAQPRFVKTND 576 

Qy 186 SGATISKNYDLSD LPGPPSKPQVTDVT-KNSVTLSWQP 222 

: 11:1 :'! : |: |::| :| : : I: 

Db 577 NSLTIAKTMELDSGEYTCVARTRLDEATARANLIVQDVPNAPKLTGITCQADKAEIHWEQ 636 

Qy 223 GTPGTLPASAYIIEAFSQSVSNSWQTVANHVKTT-LYTVRGLRPNTIYLFMVRAINPKV 280 

III:: II I I : I: : I I I I I I I: 
Db 637 QGDNRSPILHYTIQFNTSFTPASWDAAYEKVPNTDSSFWQ-MSPWANYTFRVIAFN-KI 694 

Qy 281 SVTQXKPQKNNGSTWANVPLPPPP VQPLPGTELEHYAVEQQENGYDS 327 

: :': :| :|| I : I |:|| I I : 
Db 695 GASPPSAHSDSCTTQPDVPFKNPDNWGQGTEPNNLVISWTPMPEIEHNA PNFHYY 750 

Qy 328 DSWCPPLPVQTYLHQGLED 346 
! || :| : : : I 

Db 751 VSWKRDIPAAAWENNNIFD 769 



RESULT 15 
AXO1_HUMAN

ID AXO1_HUMAN STANDARD; PRT; 1040 AA.

AC Q02246;

DT 01-JUL-1993 (Rel. 26, Created)

DT 01-JUL-1993 (Rel. 26, Last sequence update)

DT 15-JUL-1999 (Rel. 38, Last annotation update)

DE AXONIN-1 PRECURSOR (AXONAL GLYCOPROTEIN TAG-1) (TRANSIENT AXONAL

DE GLYCOPROTEIN 1).

GN TAX1 OR TAG1.

OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

RN [1] 

RP SEQUENCE FROM N.A. 

RC TISSUE=BRAIN;

RX MEDLINE=93145965; PubMed=8425542;

RA Hasler T.H., Rader C., Stoeckli E.T., Zuellig R.A., Sonderegger P.;

RT "cDNA cloning, structural features, and eucaryotic expression of

RT human TAG-1/axonin-1.";

RL Eur. J. Biochem. 211:329-339(1993).

RN [2]

RP SEQUENCE FROM N.A.

RC TISSUE=BRAIN;

RX MEDLINE=94140354; PubMed=8307567;

RA Tsiotra C.P., Karagogeos D., Theodorakis K., Michaelidis M.T.,

RA Modi W.S., Furley J.A., Jessel M.T., Papamatheakis J.;

RT "Isolation of the cDNA and chromosomal localization of the gene

RT (TAX1) encoding the human axonal glycoprotein TAG-1.";

RL Genomics 18:562-567(1993). 

CC -!- FUNCTION: MAY PLAY A ROLE IN THE INITIAL GROWTH AND GUIDANCE OF 
CC AXONS. MAY BE INVOLVED IN CELL ADHESION. 

CC -!- SUBCELLULAR LOCATION: ATTACHED TO THE NEURONAL MEMBRANE BY A 
CC GPI-ANCHOR AND IS ALSO RELEASED FROM NEURONS. 

CC -!- SIMILARITY: CONTAINS 6 IMMUNOGLOBULIN-LIKE C2-TYPE DOMAINS.

CC -!- SIMILARITY: CONTAINS 4 FIBRONECTIN TYPE III-LIKE DOMAINS.
CC 



CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation - 

CC the European Bioinformatics Institute. There are no restrictions on its 

CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 

CC or send an email to license@isb-sib.ch).



DR EMBL; X68274; CAA48335.1; -.
DR EMBL; X67734; CAA47963.1; -.
DR PIR; S28830; S26830.
DR MIM; 190197; -.
DR INTERPRO; IPR001777; -.
DR INTERPRO; IPR003006; -.
DR PFAM; PF00041; fn3; 4.
DR PFAM; PF00047; ig; 6.
KW Immunoglobulin domain; Glycoprotein; Signal; GPI-anchor;
KW Cell adhesion; Repeat.




FT SIGNAL        1     28
FT CHAIN        29   1012  AXONIN-1.
FT PROPEP     1013   1040  REMOVED IN MATURE FORM (POTENTIAL).
FT DOMAIN       54    118  IG-LIKE C2-TYPE DOMAIN.
FT DOMAIN      148    216  IG-LIKE C2-TYPE DOMAIN.
FT DOMAIN      254    313  IG-LIKE C2-TYPE DOMAIN.
FT DOMAIN      341    402  IG-LIKE C2-TYPE DOMAIN.
FT DOMAIN      433    495  IG-LIKE C2-TYPE DOMAIN.
FT DOMAIN      523    594  IG-LIKE C2-TYPE DOMAIN.
FT DOMAIN      606    612  GLY/PRO-RICH.
FT DOMAIN      611    706  FIBRONECTIN TYPE-III.
FT DOMAIN      714    809  FIBRONECTIN TYPE-III.
FT DOMAIN      816    908  FIBRONECTIN TYPE-III.
FT DOMAIN      917   1003  FIBRONECTIN TYPE-III.
FT SITE        794    796  CELL ATTACHMENT SITE (BY SIMILARITY).
FT CARBOHYD     76     76  N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD    198    198  N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD    204    204  N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD    461    461  N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD    477    477  N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD    498    498  N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD    525    525  N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD    830    830  N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD    918    918  N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD    940    940  N-LINKED (GLCNAC...) (POTENTIAL).
FT LIPID      1012   1012  GPI-ANCHOR (POTENTIAL).
SQ SEQUENCE 1040 AA; 113393 MW; 254E78DD3C28EFB6 CRC64;



Job time: 1292 sec 



Query Match 11.2%; Score 256; DB 1; Length 1040;

Best Local Similarity 24.4%; Pred. No. 2.4e-08; 

Matches 108; Conservative 59; Mismatches 151; Indels 124; Gaps 

Qy 4 AQGRTVTFPCETKGNPQPAVFWQKEGSQNLLFPNQPQQPNSRCSVSPTGDLTITNIQRSD 63 

hi : Ih : h II I |:: |: :|| :|:| I I I II III 

Db 431 ARGGEILIPCQPRAAPKAWLWSK-GTEILV NSSRVTVTPDGTLIIRNISRSD 482 

Qy 64 AGYYICQALTVAGSILAKAQLEVTDVLTDRPPPIILQGPANQTLAVDGTALLKCKATGDP 123 

I I I I : : II I :h I II:::: hi h II 
Db 483 EGKYTC FAENFMGKA--NSTGILSVRDATKITLAPSSADINLGDNLTLQCHASHDP 536 

Qy 124 LP--VISWLKEGFTF----PGRDPRAT-IQEQ-GTLQIKNLRISDTGTYTCVATSSSGEA 175
:| : || I ::| I I I I :: | l||: 
Db 537 TMDLTFTWTLDDFPIDFDKPGGHYRRTNVKETIGDLTILNAQLRHGGKYTCMA 589 

Qy 176 SWSAVLDVTESGATISKNYDLSDLPGPPSKPQVTDVTKNSVTLSWQPGTPGTLPASAYII 235
hi Ih : Nil I h :: III I I : I : 
Db 590 -QTWDSASKEATVL— -VRGPPGPPGGVWRDIGDTTIQLSWSRGFDNHSPIAKYTL 643 

Qy 236 EAFSQSVSNSWQTV — AN- ■ -HVKTTLYTVRGLRPNTI YLFMVRAIN 277 

:| : : h I II : :l I IN I I I I I 
Db 644 QARTPP-AGKWKQVRTNPANIEGNAETA-QVLGLTPWMDYEFRVIASNILGTGEPSGPS 700 

Qy 278 PKVSVTQXKPQ KNNGST-WAN 297 

h : I : III I 

Db 701 SKIRTREAAPSVAPSGLSGGGGAPGELIVNWTPMSREYQNGDGFGYLLSFRRQGSTHWQT 760 

Qy 298 VPLPPPPVQPLPGTELEHYAVEQQENGYDSDSWCPPLP — VQTYLHQG LE 345 

Ml : ::: I ::| I I :::| :| | 

Db 761 A RVPGADAQYFV YSNESVRPYTPFEVKIRSYNRRGDGPESLTALV 805 

Qy 346 DELEEDDDRVPTPP-VRGVASS 366 

1 II: II :||:|| 
Db 806 YSAEEEPRVAPEKVWAKGVSSS 827 



Search completed: January 22, 2001, 12:29:51
GenCore version 4.5
Copyright (c) 1993 - 2000 Compugen Ltd.

OM protein - protein search, using sw model

January 22, 2001, 12:54:03 ; Search time 559.fi
(without alignments)
90.856 Million cell updates/sec

Title: US-09-540-245A-19
Perfect score: 2280
Sequence: 1 QIVAQGRTVTFPCETKGNPQ TSAALSQSQRPRPTKKHKGG 434

Scoring table: BLOSUM62
Gapop 10.0 , Gapext 0.5

Searched: 374700 seqs, 117207^15 residues

Total number of hits satisfying chosen parameters:

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
Maximum Match 100%
Listing first 45 summaries

Database : SPTREMBL_15:*
1: sp_archea:*
2: sp_bacteria:*
3: sp_fungi:*
4: sp_human:*
5: sp_invertebrate:*
6: sp_mammal:*
7: sp_mhc:*
8: sp_organelle:*
9: sp_phage:*
10: sp_plant:*
11: sp_rodent:*
12: sp_virus:*
13: sp_vertebrate:*
14: sp_unclassified:*



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution.



SUMMARIES

                 %
               Query
 No.   Score   Match  Length  DB  ID       Description
   1    1479    64.9     285   4  O43608   O43608 homo sapien
   2    1409    61.8    1060  11  Q9QZI3   Q9qzi3 rattus norv
   3     913    40.0    1651   4  Q9Y6N7   Q9y6n7 homo sapien
   4     911    40.0    1612  11  O89026   O89026 mus musculu
   5     911    40.0    1651  11  O55005   O55005 rattus norv
   6   755.5    33.1    1344  11  Q9Z2I4   Q9z2i4 mus musculu
   7     579    25.4    1273   5  O44928   O44928 caenorhabdi
   8     545    23.9    1395   5  O44924   O44924 drosophila
   9     545    23.9    1395   5  Q9W213   Q9w213 drosophila
  10     478    21.0     823   5  Q9VQ10   Q9vq10 drosophila
  11     417    18.3     859   5  Q9VPZ6   Q9vpz6 drosophila
  12     388    17.0     874   5  O01632   O01632 caenorhabdi
  13   336.5    14.8    2221   5  Q9U1M1   Q9u1m1 drosophila
  14   331.5    14.5    2222   5  O97394   O97394 drosophila
  15     330    14.5    1788  13  Q9IAJ0   Q9iaj0 xenopus lae
  16     330    14.5    1904  11  Q64699   Q64699 mus musculu
  17     325    14.3    1499  13  Q90815   Q90815 gallus gall
  18   323.5    14.2    1501  11  Q9QW00   Q9qw00 rattus sp.
  19   323.5    14.2    1863  11  Q64605   Q64605 rattus norv
  20   322.5    14.1    1948   4  Q13332   Q13332 homo sapien
  21   319.5    14.0     475   4  O75255   O75255 homo sapien
  22     316    13.9    1894  11  Q64487   Q64487 mus musculu
  23     314    13.8    1502   4  Q9UM81   Q9um81 homo sapien
  24   308.5    13.5    1896  13  Q9IAJ1   Q9iaj1 xenopus lae
  25   305.5    13.4    1898  11  Q64604   Q64604 r protein -t
  26   300.5    13.2    1598   4  Q9P214   Q9p214 homo sapien
  27   297.5    13.0    1232  13  Q90284   Q90284 carassius a
  28     297    13.0    1264   5  P91767   P91767 manduca sex
  29     297    13.0    1887  11  Q9QW67   Q9qw67 rattus sp.
  30   296.5    13.0    1026  11  Q62845   Q62845 rattus norv
  31                    1277      Q98902   Q98902 fugu rubrip
  32     288    12.6    2016   5  Q9V4J9   Q9v4j9 drosophila
  33     288    12.6    2016   5  Q9NBA1   Q9nba1 drosophila
  34   287.5    12.6    1180   4  O15051   O15051 homo sapien
  35   285.5    12.5    1299   4  Q92823   Q92823 homo sapien
  36   285.5    12.5    1496   4  Q92626   Q92626 homo sapien
  37   284.5    12.5    1377  11  P97603   P97603 rattus norv
  38   283.5    12.4    1099  11  P97527   P97527 rattus norv
  39     282    12.4    1443  13  Q90610   Q90610 gallus gall
  40   281.5    12.3    1236   4  Q9UHI3   Q9uhi3 homo sapien
  41   281.5    12.3    1308   4  Q9UHI4   Q9uhi4 homo sapien
  42     281    12.3     793  11  O70246   O70246 mus musculu
  43     279    12.2    1493  11  P97798   P97798 mus musculu
  44     278    12.2     920   4  Q9P232   Q9p232 homo sapien
  45     276    12.1    1028  11  Q62682   Q62682 rattus norv


RESULT 1
O43608
ID O43608 PRELIMINARY; PRT; 285 AA.
AC O43608;
DT 01-JUN-1998 (TrEMBLrel. 06, Created)
DT 01-JUN-1998 (TrEMBLrel. 06, Last sequence update)
DT 01-MAY-2000 (TrEMBLrel. 13, Last annotation update)
DE ROUNDABOUT 2 (FRAGMENT).
GN ROBO2.
OS Homo sapiens (Human).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
OX NCBI_TaxID=9606;
RN [1]
RP SEQUENCE FROM N.A.
RX MEDLINE=98117249; PubMed=9458045;
RA Kidd T., Brose K., Mitchell K.J., Fetter R.D., Tessier-Lavigne M.,
RA Goodman C.S., Tear G.;
RT "Roundabout controls axon crossing of the CNS midline and defines a
RT novel subfamily of evolutionarily conserved guidance receptors.";
DR EMBL; AF040991; AAC39576.1; -.
DR INTERPRO; IPR001777; -.
DR INTERPRO; IPR003006; -.
DR PFAM; PF00041; fn3; 1.
DR PFAM; PF00047; ig; 2.
FT NON_TER 1 1
FT NON_TER 285 285
SQ SEQUENCE 285 AA; 30606 MW; 05DF916A3DBA96C6 CRC64;

Query Match 64.9%; Score 1479; DB 4; Length 285;
Best Local Similarity 100.0%; Pred. No. 8.9e-104;
Matches 284; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy 1 QIVAQGRTVTFPCETKGNPQPAVFWQKEGSQNLLFPNQPQQPNSRCSVSPTGDLTITNIQ 60
     ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1 QIVAQGRTVTFPCETKGNPQPAVFWQKEGSQNLLFPNQPQQPNSRCSVSPTGDLTITNIQ 60

Qy 61 RSDAGYYICQALTVAGSILAKAQLEVTDVLTDRPPPIILQGPANQTLAVDGTALLKCKAT 120
      ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 61 RSDAGYYICQALTVAGSILAKAQLEVTDVLTDRPPPIILQGPANQTLAVDGTALLKCKAT 120

Qy 121 GDPLPVISWLKEGFTFPGRDPRATIQEQGTLQIKNLRISDTGTYTCVATSSSGEASWSAV 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 GDPLPVISWLKEGFTFPGRDPRATIQEQGTLQIKNLRISDTGTYTCVATSSSGEASWSAV 180

Qy 181 LDVTESGATISKNYDLSDLPGPPSKPQVTDVTKNSVTLSWQPGTPGTLPASAYIIEAFSQ 240
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 LDVTESGATISKNYDLSDLPGPPSKPQVTDVTKNSVTLSWQPGTPGTLPASAYIIEAFSQ 240

Qy 241 SVSNSWQTVANHVKTTLYTVRGLRPNTIYLFMVRAINPKVSVTQ 284
       ||||||||||||||||||||||||||||||||||||||||||||
Db 241 SVSNSWQTVANHVKTTLYTVRGLRPNTIYLFMVRAINPKVSVTQ 284



RESULT 2 
Q9QZI3 

ID Q9QZI3 PRELIMINARY; PRT; 1060 AA. 

AC Q9QZI3; 

DT 01-MAY-2000 (TrEMBLrel. 13, Created) 

DT 01-MAY-2000 (TrEMBLrel. 13, Last sequence update)

DT 01-OCT-2000 (TrEMBLrel. 15, Last annotation update)

DE REPULSIVE GUIDANCE RECEPTOR (FRAGMENT).
OS Rattus norvegicus (Rat).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Rattus.

OX NCBI_TaxID=10116;

RN [1] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=99200391; PubMed=10102268;

RA Brose K., Bland K.S., Wang K.H., Arnott D., Henzel W., Goodman C.S.,

RA Tessier-Lavigne M., Kidd T.;

RT "Slit proteins bind Robo receptors and have an evolutionarily

RT conserved role in repulsive axon guidance.";

RL Cell 96:795-806(1999). 

DR EMBL; AF182037; AAF04558.1; -. 

DR HSSP; P56276; 1TLK. 

DR INTERPRO; IPR001547; -. 

DR INTERPRO; IPR001777; -. 

DR INTERPRO; IPR003006; -. 

DR PFAM; PF00041; fn3; 3. 

DR PFAM; PF00047; ig; 5. 

DR PRINTS; PR00014; FNTYPEIII. 

DR PROSITE; PS00659; GLYCOSYL_HYDROL_F5; UNKNOWN_1.

FT NON_TER 1060 1060

SQ SEQUENCE 1060 AA; 116790 MW; C4BC8C11E8542DA4 CRC64;

Query Match 61.8%; Score 1409; DB 11; Length 1060;

Best Local Similarity 64.7%; Pred. No. 8.1e-98;
Matches 303; Conservative 28; Mismatches 81; Indels 56; Gaps 10;
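
The "Query Match" percentage in these result blocks appears to be the alignment score expressed as a fraction of the "Perfect score" given in the search header (2280 for this query). The sketch below is a hedged Python illustration of that relationship; the definition is inferred from the numbers in this report rather than from the search program's documentation, and the function and constant names are hypothetical.

```python
# "Perfect score" as printed in the header of this search (US-09-540-245A-19).
PERFECT_SCORE = 2280


def query_match(score, perfect_score=PERFECT_SCORE):
    """Score as a percentage of the perfect (self-match) score.

    Assumed definition, inferred from the report's own figures.
    """
    return round(100.0 * score / perfect_score, 1)


# Scores of RESULT 1, RESULT 2 and RESULT 3 from the summary table.
for score in (1479, 1409, 913):
    print(score, query_match(score))
```

With these inputs the computed percentages agree with the 64.9%, 61.8% and 40.0% reported for the first three results.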

Qy 1 QIVAQGRTVTFPCETKGNPQPAVFWQKEGSQNLLFPNQPQQPNSRCSVSPTGDLTITNIQ 60 

IIIIIIIIMIIIIIIIIIIIIIIIIIIMIIIIIIIIIIIIIIIIIIIIIIIIIIIIII 
Db 326 QIVAQGRTVTFPCETKGNPQPAVFWQKEGSQNLLFPNQPQQPNSRCSVSPTGDLTITNIQ 385 

Qy 61 RSDAGYYICQALTVAGSILAKAQLEVTDVLTDRPPPIILQGPANQTLAVDGTALLKCKAT 120 

1 1 1 M I [ I [ I ! I M i f I [ 1 1 1 F 1 1 1 1 J 1 1 1 1 i 1 1 1 1 1 1 F 1 1 1 IMIIIIIIIIMMII 
Db 386 RSDAGYYICQALTVAGSILAKAQLEVTDVLTDRPPPIILQGPINQTLAVDGTALLKCKAT 445 

Qy 121 GDPLPVISWLKEGFTFPGRDPRATIQEQGTLQIKNLRISDTGTYTCVATSSSGEASWSAV 180 

I llllllllimil llllllllhlllllllllllllllllllllllllll lllll 

Db 446 G-PLPVISWLKEGFTFLGRDPRATIQDQGTLQIKNLRISDTGTYTCVATSSSGETSWSAV 504 
Qy 181 LDVTESGATISKNYDLSDLPGPPSKPQVTDVTKNSVTLSWQPGTPGTLPASAYIIEAFSQ 240

iiiiiiiiiiiiiii :iiiiiiiiiiiiiiiiiiiiiiii mi illinium 

Db 505 LDVTESGATISKNYDTNDLPGPPSKPQVTDVTKNSVTLSWQTGTPGVLPASAYIIEAFSQ 564 

Qy 241 SVSNSWQTVANHVKTTLYTVRGLRPNTIYLFMVRAINPKVSVTQXKPQKNNGSTWANVPL 300

IIIIIIIIMIIIIIIIIIIIIIIIIIIIIIIIIIIII: :: I : I : 
Db 565 SVSNSWQTVANHVKTTLYTVRGLRPNTIYLFMVRAINPQ-GLSDPSPMSDPVRT---QDI 620 



Qy 301 PPP-- 



■ -PVQPLPGTELEHYAVEQQENGYDSDSWCPPLPVQT 338 



II llll: l::| I :| 

Db 621 SPPAQGVDHRQVQRELGDVTVRLHNPWLTPTTVQVTWTVDRQ PQFIQG 669 

Qy 339 Y LHQGLEDEL--EEDDDRVPTPPVRGVASSPAISFGQQSTATLTPSPR-EEMQPM 390 

I lh : I Mil | III 

Db 670 YRVMYRQTSGIjQASTVWQNLDARVPTE RSAVLVNLRKGVTYEIRVRPYFNEFQGM 724 

Qy 391 LQASPXFTSSQR PRPTSPFSTDSNTSAALSQSQRPRPTKKHKG 433

I ::: I : : : I : I : : I I I I I 
Db 725 DSESRTIRTTEEAPSAPPQSVTVLTVGSHNSTSISVSWDPPPADHQNG 772 



RESULT 3 
Q9Y6N7 

ID Q9Y6N7 PRELIMINARY; PRT; 1651 AA. 

AC Q9Y6N7; 

DT 01-NOV-1999 (TrEMBLrel. 12, Created)

DT 01-NOV-1999 (TrEMBLrel. 12, Last sequence update)

DT 01-OCT-2000 (TrEMBLrel. 15, Last annotation update)

DE ROUNDABOUT 1.

GN ROBO1.

OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.

OX NCBI_TaxID=9606;

RN [1]

RP SEQUENCE FROM N.A.

RX MEDLINE=98117249; PubMed=9458045;

RA Kidd T., Brose K., Mitchell K.J., Fetter R.D., Tessier-Lavigne M.,

RA Goodman C.S., Tear G.;

RT "Roundabout controls axon crossing of the CNS midline and defines a

RT novel subfamily of evolutionarily conserved guidance receptors.";

RL Cell 92:205-215(1998).

DR EMBL; AF040990; AAC39575.1; -.

DR HSSP; P56276; 1TLK.

DR INTERPRO; IPR001777; -.

DR INTERPRO; IPR003006; -.

DR PFAM; PF00041; fn3; 3.

DR PFAM; PF00047; ig; 5.

SQ SEQUENCE 1651 AA; 180928 MW; 9D98CD7CAB73074D CRC64;



Query Match 40.0%; Score 913; DB 4; Length 1651; 

Best Local Similarity 23.2%; Pred. No. 2.8e-60; 

Matches 258; Conservative 54; Mismatches 118; Indels 684; Gaps 12; 

Qy 1 QIVAQGRTVTFPCETKGNPQPAVFWQKEGSQNLLFPNQPQQPNSRCSVSPTGDLTITNIQ 60

1:11 MM II lll!l!:||::|llimi II I HI III IIIIIIIM 
Db 360 QWALGRTVTFQCEATGNPQPAIFWRREGSQNLLFSYQPPQSSSRFSVSQTGDLTITNVQ 419 

Qy 61 RSDAGYYICQALTVAGSILAKAQLEVTDVLTDRPPPIILQGPANQTLAVDGTALLKCKAT 120

ill linn i inn: n mim mini in imiiin n i n 

Db 420 RSDVGYYICQTLNVAGSIITRAYLEVTDVIADRPPPVIRQGPVNQTVAVDGTFVLSCVAT 479 

Qy 121 GDPLPVISWLKEGFTFPGRDPRATIQEQGTLQIKNLRISDTGTYTCVATSSSGEASWSAV 180

I hi I I hi :| I II III: :: III llhl:: 1 1 1 [ : ! 1 1 
Db 480 GSPVPTILWRKDGVLVSTQDSRIKQLENGVLQIRYAKLGDTGRYTCIASTPSGEATWSAY 539 

Qy 181 LDVTESGATIS--KNYDLSDLPGPPSKPQVTDVTKNSVTLSWQPGTPGTLPASAYIIEAF 238

:: I I : : I : :| II 1 1': 1 1 1 1 : :.l : 1 1 1 1 1 1 1 ::|lllll 
Db 540 IEVQEFGVPVQPPRPTDPNLIPSAPSRPEVTDVSRNTVTLSWQPNLNSGATPTSYIIEAF 599 

Qy 239 SQSVSNSWQTVANHVKTTLYTVRGLRPNTIYLFMVRAIN 277

I : nilill :||| ::l|:|| 111:11 I 
Db 600 SHASGSSWQTVAENVRTETSAIRGLRPNAIYLFLVRAANAYGISDPSQISDPVKTQDVLP 659 

Qy 278 - 277

Db 660 TSQGVDHRQVQRELGNAVLHLHNPTVLSSSSIEVHWTVDQQSQYIQGYRILYRPSGANHG 719

Qy 278 PK 279
I
Db 720 ESDWLVFEVRTPAKNSWIPDLRKGVNYEIKARPFFNEFQGADSEIKFAKTLEEAPSAPP 779

Qy 280 VSVTQXK 286
II I
Db 780 QGVTVSKNDGNGTAILVSKQPPPEDTQNGMVQEYKVWCLGNETRYHINKTVDGSTFSWI 839

Qy 287 PQ 288
II
Db 840 PFLVPGIRYSVEVAASTGAGSGVKSEPQFIQLDAHGNPVSPEDQVSLAQQISDWKQPAF 899

Qy 289 KNNG 292
I II
Db 900 IAGIGAACWIILMVFSIWLYRHRKKRNGLTSTYAGIRKVPSFTFTPTVTYQRGGEAVSSG 959

Qy 293 --STWAN-- 297
II I
Db 960 GRPGLLNISEPAAQPWLADTWPNTGNNHNDC IISCCTAGNGNSDSNLTTYSRPADCIANY 1019

Qy 298 - 297

Db 1020 NNQLDNRQTNLMLPESTVYGDVDLSNKINEISTFNSPNLKDGRFVNPSGQPTPYATTQLI 1079

Qy 298 - 297

Db 1080 3 OSKLSNNMNNGSGDSGEKHWRPWQQKQEVAPVQYNIVEQMLNKDYRANDTVPPT IPYN 1139

Qy 298 - :--VP LPPPPVQPLPGTEL 313

, I II Mill I I ;

Db 1140 QSYDQNTGGSYNSSDRGSSTSGSQGHKKGARTPPPKQGGMNWADLLPPPPAHPPPHSNS 1199

Qy 314 EHYAVEQQENGYDSDSWCPPLPVQTYLHQGLEDEL-EEDDDRVPTPPVRGVASSP-AISF 371

I I i I: II : II I : II I III 1 1 : 1 : 1 lllllll Mil I'h
Db 1200 EEYNESVDES-YDQEMPCPVPPARMYLQQ---DELEEEEDERGPTPPVRGAASSPAAVSY 1255

Qy 372 GQQSTATLTPSPREEMQPMLQASP-- 395

I M : l M 1 1 1 1 : 1 ! r 1 1 1 ! | |
Db 1256 SHQSTATLTPSPQEELQPMLQDCPEETGHMQBQPDRRRQPVSPPPPPRPISPPHTYGYIS 1315

Qy 396 - 395

Db 1316 GPLVSDMDIDAPEEEEDEADlfeVAKMQTRRLLLRGLEQTPASSVGDLESSVTGSMINGWG 1375

Qy 396 XFTSS 400
I :|
Db 1376 SASEEDNISSGRSSVSSSDGSIFFTDADFAQAVAAAAEYAGLKVARRQMQDAAGRRHFHAS 1435

Qy 401 QRPRPTSPFSTDSNTSAALSQSQRPRPTKKHKGG 434

I llllll Mill III: l! II II: I
Db 1436 QCPRPTSPVSTDSNMSAAVMQRTRPAKKLKHQPG 1469



RESULT 4
O89026
ID O89026 PRELIMINARY; PRT; 1612 AA.
AC O89026;
DT 01-NOV-1998 (TrEMBLrel. 08, Created)
DT 01-NOV-1998 (TrEMBLrel. 08, Last sequence update)
DT 01-OCT-2000 (TrEMBLrel. 15, Last annotation update)
DE DUTT1 PROTEIN.
GN ROBO1 OR DUTT1.
OS Mus musculus (Mouse).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
OX NCBI_TaxID=10090;
RN [1]
RP SEQUENCE FROM N.A.
RC TISSUE=BRAIN;
RA Wu M.C., Lowe N., Fordham R., Rabbitts P.;
RT "The mouse homologue of human DUTT1/H-robo1 gene: protein sequence and
RT chromosomal location.";
RL Submitted (JUL-1998) to the EMBL/GenBank/DDBJ databases.
DR EMBL; Y17793; CAA76850.1; -.
DR HSSP; P56276; 1TLK.
DR MGD; MGI:1274781; Robo1.
DR INTERPRO; IPR001777; -.
DR INTERPRO; IPR003006; -.
DR PFAM; PF00041; fn3; 3.
DR PFAM; PF00047; ig; 5.
SQ SEQUENCE 1612 AA; 176406 MW; 5F2988C544796B4B CRC64;

Query Match 40.0%; Score 911; DB 11; Length 1612;
Best Local Similarity 22.8%; Pred. No. 3.9e-60;
Matches 254; Conservative 55; Mismatches 120; Indels 686; Gaps 10;



Qy 1 QIVAQGRTVTFPCETKGNPQPAVFWQKEGSQNLLFPNQPQQPNSRCSVSPTGDLTITNIQ 60 

hii hum ii iiiiiimmiiiiiM 11 i mi mi miiiim 

Db 321 QWALGRTVTFQCEATGNPQPAIFWRREGSQNLLFSYQPPQSSSRFSVSQTGDLTITNVQ 380 

Qy 61 RSDAGYYICQALTVAGSILAKAQLEVTDVLTDRPPPIILQGPANQTLAVDGTALLKCKAT 120 

III lllllll Mill: M IMMI: MMIM Ml llhlllll M I II 
Db 381 RSDVGYYICQTLNVAGSIITKAYLEVTDVIADRPPPVIRQGPVNQTVAVDGTLILSCVAT 440 

Qy 121 GDPLPVISWLKEGFTFPGRDPRATIQEQGTLQIKNLRISDTGTYTCVATSSSGEASWSAV 180 

I II I I 1M Ml I I Ml: :: Ml III I:: MIIMII 
Db 441 GSPAPTILWRKDGVLVSTQDSRIKQLESGVLQIRYAKLGDTGRYTCTASTPSGEATWSAY 500 



Qy 181 

Db 501 

Qy 239 

Db 561 

Qy 278 

Db 621 

Qy 278 

Db 681 

Qy 280 

Db 741 

Qy 284 

Db 801 

Qy 284 

Db 851 

Qy 284 

Db 921 

Qy 284 

Db 981 



LDVTESGAT IS" KNYDLSDLPGPPSKPQVTDVTKNSVTLSWQPGTPGTLPASAY I IEAF 238 

|| : : I : :| 1 1 1 1 : 1 Ml M 1 : 1 1 III II :: 1 1 1 1 1 1 

IEVQEFGVPVQPPRPTDPNLIPSAPSKPEVTDVSKNTVTLSWQPNLNSGATPTSYIIEAF 560 

SQSVSNSWQTVANHVKTTLYTVRGLRPNTIYLFMVRAIN 277 

I : MMIM MM : ::MMI IIIIMM I 

SHASGSSWQTAAENVKTETFAIKGLKPNAIYLFLVRAANAYGISDPSQISDPVKTQDVPP 620 



TSQGVDHKQVQRELGNWLHLHNPTILSSSSVEVHWTVDQQSQYIQGYKILYRPSGASHG 



ESEWLVFEVRTPTKNSWIPDLRKGVNYEIKARPFFNEFQGADSEIKFAKTLEEAPSAPP 



VSVT 

Ill 

RSVTVSKNDGNGTAILVTWQPPPEDTQNGMVQEYKVWCLGNETKYHINKTVDGSTFSWI £ 



PSLVPGIRYSVEVAASTGAGPGVKSEPQFIQLDSHGNPVSPEDQVSLAQQISDWRQPAF 8 



IAGIGAACWIILMVFSIWLYRHRKKRNGLTSTYAGIRKVPSFTFTPTVTYQRGGEAVSSG 



GRPGLLNISEPATQPWLADTWPNTGNNHNDCSINCCTAGNGNSDSNLTTYSRPADCIANY 



NNQLDNKQTNLMLPESTVYGDVDLSNKINEMKTFNSPNLKDGRFVNPSGQPTPYATTQLI 



283 
920 
283 
980 
283 
1040 



Qy 284 -— 

Db 1041 QANLSNNMNNGAGDSSEKHWKPPGQQKPEVAPIQYNIMEQNKLNKDYRANDTIPPTIPYN 1100 



■-QXKPQ-- 

I II: 



Qy 289 '- KNNGSTWANVPLPPPPVQPLPGTE 312 

I I M:: HIM I I 
Db 1101 QSYDQNTGGSYNSSDRGSSTSGSQGHKKGARTPKAPKQGGMNWADL-LPPPPAHPPPHSN 1159 

Qy 313 LEHYAVEQQENGYDSDSWCPPLPVQTYLHQGLEDEL-EEDDDRVPTPPVRGVASSP-AIS 370

II: I: II : II I III III IIMM lllllll Mil IM 
Db 1160 SEEYNMSVDES-YDQEMPCPVPPAPMYLQQ— DELQEEEDERGPTPPVRGAASSPAAVS 1215 



Qy 371 FGQQSTATLTPSPREEMQPMLQASP 395

: IIIIIMIIIMIMIIII I 
Db 1216 YSHQSTATLTPSPQEELQPMLQDCPEDLGHMPHPPDRRRQPVSPPPPPRPISPPHTYGYI 1275

Qy 396 395

Db 1276 SGPLVSDMDTDAPEEEEDEADMEVAKMQTRRLLLRGLEQTPASSVGDLESSVTGSMINGW 1335 



-XFTS 399 

I 



DT 

1 



Qy 396
Db 1336 



Qy 400 SQRPRPTSPFSTDSNTSAALSQSQRPRPTRRHRGG 434 

I! Hill III II : | || :||: I 
Db 1396 SQCPRPTSPVSTDSNMSAAVIQKARPAKKQKHQPG 1430



RESULT 5 
O55005

ID O55005 PRELIMINARY; PRT; 1651 AA.

AC O55005;

DT 01-JUN-1998 (TrEMBLrel. 06, Created) 

DT 01-JUN-1998 (TrEMBLrel. 06, Last sequence update) 

DT 01-OCT-2000 (TrEMBLrel. 15, Last annotation update)

DE TRANSMEMBRANE RECEPTOR ROBO1.

OS Rattus norvegicus (Rat).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Rattus.

OX NCBI_TaxID=10116;

RN [1] 

RP SEQUENCE FROM N.A. 

RC TISSUE=SPINAL CORD;

RX MEDLINE=98117249; PubMed=9458045;

RA Kidd T., Brose K., Mitchell K.J., Fetter R.D., Tessier-Lavigne M.,

RA Goodman C.S., Tear G.;

RT "Roundabout controls axon crossing of the CNS midline and defines a 

RT novel subfamily of evolutionarily conserved guidance receptors."; 

RL Cell 92:205-215(1998). 

DR EMBL; AF041082; AAC39960.1; -. 

DR HSSP; P56276; 1TLK. 

DR INTERPRO; IPR001777; -. 

DR INTERPRO; IPR003006; -. 

DR PFAM; PF00041; fn3; 3. 

DR PFAM; PF00047; ig; 5. 

KW Transmembrane. 

SQ SEQUENCE 1651 AA; 180746 MW; FA2452DD46E186B7 CRC64; 



Query Match 40.0%; Score 911; DB 11; Length 1651; 

Best Local Similarity 23.0%; Pred. No. 4e-60; 

Matches 256; Conservative 57; Mismatches 117; Indels 684; Gaps 12; 

Qy 1 QIVAQGRTVTFPCETKGNPQPAVFWQKEGSQNLLFPNQPQQPNSRCSVSPTGDLTITNIQ 60
¥ |:|| HUH || lllll:!|::llllllll II I :|| III 1 1 ! 1 1 : 1 1 : 1 

Db 360 QWALGRTVTFQCEATGNPQPAIFWRREGSQNLLFSYQPPQSSSRFSVSQTGDLTVTNVQ 419 

Qy 61 RSDAGYYICQALTVAGSILAKAQLEVTDVLTDRPPPIILQGPANQTLAVDGTALLKCKAT 120 

iii nun i mil: ii mill: imm m mmm i 1 11 

Db 420 RSDVGYYICQTLNVAGSIITKAYLEVTDVIADRPPPVIRQGPVNQTVAVDGTLTLSCVAT 479 

Qy 121 GDPLPVISWLKEGFTFPGRDPRATIQEQGTLQIKNLRISDTGTYTCVATSSSGEASWSAV 180 

I |:| I I 1:1 :| I II III: :: III III I:: lllhlll 
Db 480 GSPVPTILWRRDGVLVSTQDSRIRQLESGVLQIRYAKLGDTGRYTCTASTPSGEATWSAY 539

Qy 181 LDVTESGATIS--KNYDLSDLPGPPSKPQVTDVTKNSVTLSWQPGTPGTLPASAYIIEAF 238

::| I I : : I : :| I III : 1 1 1 1 : 1 1 : 1 1 1 III -llllll 
Db 540 IEVQEFGVPVQPPRPTDPNLIPSAPSKPEVTDVSRNTVTLLWQPNLNSGATPTSYIIEAF 599 

Qy 239 SQSVSNSWQTVANHVKTTLYTVRGLRPNTIYLFMVRAIN 277 

i : mini mi : mim mimi i 

Db 600 SHASGSSWQTVAENVKTETFAIRGLKPNAIYLFLVRAANAYGISDPSQISDPVRTQDVPP 659 

Qy 278 277 

Db 660 T TQG VDHKQVQRELGNWLHLHNPT ILSSSSVEVHWTVDQQ SQY IQGYKILY RPSG ASHG 719 



Qy 278 PK 279 

I 

Db 720 ESEWLVFEVRTPTKNSWIPDLRKGVNYEIKARPFFNEFQGADSEIKFARTLEERPSAPP 779 

Qy 280 VSVTQXK— ; 286 

III I ' 

Db 780 RSVTVSRNDGNGTAILVTWQPPPEDTQNGMVQEYKVWCLGNETRYHINRTVDGSTFSWI 839 



Qy 287 ■ 



Db 840 PFLVPGIRYSVEVAASTGAGPGVRSEPQFIQLDSHGNPVSPEDQVSLAQQISDWRQPAF 899 



Qy 289 ■ 



••mu- 
tt 



Db 900 IAGIGAACWIILMVFSIWLYRHRKKRNGLSSTYAGIRKVPSFTFTPTVTYQRGGEAVSSG 959 



■-STWAN- 

II I 



Qy 293 

Db 960 GRPGLLNISEPATQPWLADTWPNTGNSHNDCSINCCTASNGNSDSNLTTYSRPADCIANY 1019 

Qy 298 : 297 

Db 1020 NNQLDNKQTNLMLPESTVYGDVDLSNRINEMRTFNSPNLKDGRFVNPSGQPTPYATTQLI 1079 



il 

Db 1080 QANLINNMNNGGGDSSEKHWRPPGQQRQEVAPIQYNIMEQNRLNRDYRANDTILPTIPYN 1139



Qy 300 LPPPPVQPLPGTEL 313 

Mill I I : 

Db 1140 HSYDQNTGGSYNSSDRGSSTSGSQGHRRGARTPKAPRQGGMNWADLLPPPPAHPPPHSNS 1199 

Qy 314 EHYAVEQQENGYDSDSWCPPLPVQTYLHQGLEDELEEDD-DRVPTPPVRGVASSP-AISF 371 

I I:: i: II : II I : II I Mill:: :l llllll II |:|: 
Db 1200 EEYSMSVDES-YDQEMPCPVPPARMYLQQ— DELEEEEAERGPTPPVRGAASSPAAVSY 1255 

Qy 372 GQQSTATLTPSPREEMQPMLQASP 395 

iiiiiiiiimmmi I 

Db 1256 SHQSTATLTPSPQEELQPMLQDCPEDLGHMPHPPDRRRQPVSPPPPPRPISPPHTYGYIS 1315 

Qy 396 395 

Db 1316 GPLVSDMDTDAPEEEEDEADMEVAKMQTRRLLLRGLEQTPASSVGDLESSVTGSMINGWG 1375 

Qy 396 XFTSS 400 

I :| 

Db 1376 SASEEDNISSGRSSVSSSDGSFFTDADFAQAVAAAAEYAGLKVARRQMQDAAGRRHFHAS 1435 
Qy 401 QRPRPTSPFSTDSNTSAALSQSQRPRPTKRHRGG 434 

i mm mil ill: i n ;ih i 

Db 1436 QCPRPTSPVSTDSNMSAAVIQKARPTKKQKHQPG 1469



RESULT 6 
Q9Z2I4 

ID Q9Z2I4 PRELIMINARY; PRT; 1344 AA. 

AC Q9Z2I4; 

DT 01-MAY-1999 (TrEMBLrel. 10, Created) 

DT 01-MAY-1999 (TrEMBLrel. 10, Last sequence update) 

DT 01-OCT-2000 (TrEMBLrel. 15, Last annotation update) 

DE RIG-1 PROTEIN.

GN RBIG1.

OS Mus musculus (Mouse).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.

OX NCBI_TaxID=10090;

RN [1]

RP SEQUENCE FROM N.A. 

RA Yuan S.-S.F., Cox L.A., Dasika G.R., Lee E.Y.-H.P.; 

RL Submitted (APR-1998) to the EMBL/GenBank/DDBJ databases. 

DR EMBL; AF060570; AAD11628.1; -.

DR HSSP; P56276; 1TLK.
DR MGD; MGI:1343102; Rbig1.
DR INTERPRO; IPR001777; -.
DR INTERPRO; IPR003006; -.
DR PFAM; PF00041; fn3; 3.
DR PFAM; PF00047; ig; 5.
SQ SEQUENCE 1344 AA; 143439 MW; 8B0060341C49CFEA CRC64;



Query Match 33.1%; Score 755.5; DB 11; Length 1344; 

Best Local Similarity 39.6%; Pred. No. 1.5e-48; 

Matches 176; Conservative 55; Mismatches 141; Indels 73; Gaps 9; 

Qy 1 QIVAQGRTVTFPCETKGNPQPAVFWQKEGSQNLLFPNQPQQPNSRCSVSPTGDLTITNIQ 60 

i ii i i:i mini ihiiiiiiii mi; 11 i in 1 1 n = : 

Db 334 QTVAPGANVSFQCETKGNPPPAIFWQKEGSQVLLFPSQSLQPMGRLLVSPRGQLNITEVK 393 

Qy 61 RSDAGYYICQALTVAGSILAKAQLEVTDVLTDRPPPIILQGPANQTLAVDGTALLKCKAT 120

•I lll:lll::lllllllll II: I 1111111111111 : : I |: 
Db 394 IGDGGYYVCQAVSVAGSILAKALLEIKGASIDGLPPIILQGPANQTLVLGSSVWLPCRVI 453

Qy 121 GDPLPVISWLKEGFTFPGRDPRATIQEQGTLQIKNLRISDTGTYTCVATSSSGEASWSAV 180 
1:1 I I II: I I : : : III I ::: I I Mil II l||:|:: 
Db 454 GNPQPNIQWKKDERWLQGDDSQFNLMDNGTLHIASIQEMDMGFYSCVAKSSIGEATWNSW 513



Qy 181 LDVTES-GATISKNYDLSDLPGPPSKPQVTDVTKNSVTLSWQPGTPGTLPASAYIIEAFS 239

I I II: I: l!ll|:| M ||:||:|:| |::|:||||| 

Db 514 LRKQEDWGASPGPATGPSNPPGPPSQPIVTEVTANSITLTWKPNPQSGATATSYVIEAFS 573

240 QSVSNSWQTVANHVKTTLYTVRGLRPNT I YLFMVRAIN — PKVSVTQXKPQKNNGSTW 295 

I: |:|:IH: h II: 1 1 : 1 1 1 1 1 II : II I : : I I : I 
574 QAAGNTWRIVADGVQLETYTISGLQPNTirLFLVRAVGAWGLSEPSPVSEPVQTQDSS-- 631 

296 ANVPLPPPPVQPLPGTE-LEHYAVEQQE N 323 

II II I II II 
632 — LSRPAEDPWKGQRGIAEVAVRMQEPTVLGPRTLQVSWTVDGPVQLVQGFRVSWRIA 687 

324 GYDSDSW CPP LPVQT YLHQGLEDE — LEEDDDRVPT 357 

I I II II : II :H I : I: 

'688 GLDQGSWTMLDLQSPHKQSTVLRGLPPGAQIQIKVQVQGQEGLGAESPFVTRSIPEEAPS 747 
#> 

358 PPVRGVASSPAISFGQQSTATLTPS 382 

I :H I:: I :::l I 
748 GPPQGV — AVALGGDRNSSVTVS 768 



I 

RESULT
O44928
ID O44928 PRELIMINARY; PRT; 1273 AA.
AC O44928;
DT 01-JUN-1998 (TrEMBLrel. 06, Created)
DT 01-JUN-1998 (TrEMBLrel. 06, Last sequence update)
DT 01-JUN-2000 (TrEMBLrel. 14, Last annotation update)
DE SAX-3.
GN SAX-3.
OS Caenorhabditis elegans.
OC Eukaryota; Metazoa; Nematoda; Chromadorea; Rhabditida; Rhabditoidea;
OC Rhabditidae; Peloderinae; Caenorhabditis.
OX NCBI_TaxID=6239;
RN [1]
RP SEQUENCE FROM N.A.
RX MEDLINE=98117250; PubMed=9458046;
RA Zallen J.A., Yi B.A., Bargmann C.I.;
RT "The conserved immunoglobulin superfamily member SAX-3/Robo directs
RT multiple aspects of axon guidance in C. elegans.";
RL Cell 92:217-227(1998).
DR EMBL; AF041053; AAC38848.1; -.
DR HSSP; P56276; 1TLK.
DR INTERPRO; IPR001777; -.
DR INTERPRO; IPR003006; -.
DR PFAM; PF00041; fn3; 3.
DR PFAM; PF00047; ig; 5.
DR PRINTS; PR00014; FNTYPEIII.
SQ SEQUENCE 1273 AA; 139427 MW; 013E766B51A7BAD7 CRC64;



Query Match 25.4%; Score 579; DB 5; Length 1273;

Best Local Similarity 40.8%; Pred. No. 2.7e-35;

Matches 124; Conservative 48; Mismatches 114; Indels 18; Gaps 

Qy 1 QIVAQGRTVTFPCETKGNPQPAVFWQKEGSQNLLFPNQPQQPNSRCSVSPTGDLTITNIQ 60 

I I I I II I I I II II III 1:1111: : I Mill III :: 
Db 326 QSVPAGGTATFECTLVGQPSPAYFWSKEGQQDLLFPSY-VSADGRTKVSPTGTLTIEEVR 384 

Qy 61 RSDAGYYICQALTVAGSILAKAQLEVTDVLTD RPPPIILQGPANQTLAVDGTALL 115 

: I I 1:1 : III |:|| |:|| :||| I I llll I :|:l 

Db 385 QVDEGAYVCAGMNSAGSSLSKAALKVTTKAVTGNTPAKPPPTIEHGHQNQTLMVGSSAIL 444 

Qy 116 KCKATGDPLPVISWLKEGFTFPGRDPRATIQEQGTLQIKNLRISDTGTYTCVATSSSGEA 175 

:|:| I I Ml:: | | : |:| | :!: I |||:| : ||: 
Db 445 PCQASGKPTPGISWLRDGLPIDITDSRISQHSTGSLHIADLKKPDTGVYTCIAKNEDGES 504 

Qy 176 SWSAVLDVTE • " SGATISKNYDLSDLPGPPSKPQVTDVTKNSVTLSWQ • PGTPGTLPASA 232 

:HI I I :- I I : I I: I |::| : :|| II I I I I I : 
Db 505 TWSASLTVEDHTSNAQFVRMPDPSNFPSSPTQPIIVNVTDTEVELHWNAPSTSGAGPITG 564 

Qy 233 YIIEAFSQSVSNSWQTVANHVKTTLYTVRGLRPNTIYLFMVRAIN PKVS— VT 283 

III: :l : :l : ::l :l I ::ll:|: l:|::ll I I II II 
Db 565 YIIQYYSPDLGQTWFNIPDYVASTEYRIKGLKPSHSYMFVIRAENEKGIGTPSVSSALVT 624

Qy 284 QXKP 287 

II 

Db 625 TSKP 628 



O44924

ID O44924 PRELIMINARY; PRT; 1395 AA.

AC O44924;

DT 01-JUN-1998 (TrEMBLrel. 06, Created)

DT 01-JUN-1998 (TrEMBLrel. 06, Last sequence update)

DT 01-JUN-2000 (TrEMBLrel. 14, Last annotation update)

DE ROUNDABOUT 1.

GN ROBO1.

OS Drosophila melanogaster (Fruit fly).

OC Eukaryota; Metazoa; Arthropoda; Tracheata; Hexapoda; Insecta; 

OC Pterygota; Neoptera; Endopterygota; Diptera; Brachycera; Muscomorpha; 

OC Ephydroidea; Drosophilidae; Drosophila. 

OX NCBI_TaxID=7227;

RN [1]

RP SEQUENCE FROM N.A.

RX MEDLINE=98117249; PubMed=9458045;

RA Kidd T., Brose K., Mitchell K.J., Fetter R.D., Tessier-Lavigne M.,

RA Goodman C.S., Tear G.; 

RT "Roundabout controls axon crossing of the CNS midline and defines a 

RT novel subfamily of evolutionarily conserved guidance receptors.";

RL Cell 92:205-215(1998).

DR EMBL; AF040989; AAC38849.1; -.

DR HSSP; P56276; 1TLK.

DR FLYBASE; FBgn0005631; robo. 

DR INTERPRO; IPR001777; -. 

DR INTERPRO; IPR003006; -.

DR PFAM; PF00041; fn3; 3. 

DR PFAM; PF00047; ig; 5. 

DR PRINTS; PR00014; FNTYPEIII.

SQ SEQUENCE 1395 AA; 151778 MW; B820E234A5218983 CRC64;



Query Match 23.9%; Score 545; DB 5; Length 1395;

Best Local Similarity 40.9%; Pred. No. 1.1e-32;
Matches 115; Conservative 48; Mismatches 104; Indels 14; Gaps 

Qy 9 VTFPCETKGNPQPAVFWQKEGSQNLLFPNQPQQPNSRCSVSPTGDLTITNIQRSDAGYYI 68 

I II III hill III 1:111 : I I: II II:::: I III: 
Db 362 VQLPCMASGNPPPSVFWTKEGVSTLMFPN— SSHGRQYVAADGTLQITDVRQEDEGYYV 418 

Qy 69 CQALTVAGSILAKAQLEVTDVLTDRPPPIILQGPANQTLAVDGTALLKCKATGDPLPVIS 128 
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I I :! I : hi: I Mllll lllllll I I hllhl I I 
Db 419 CSAFSVVDSSTVRVFLQVSSV-DERPPPIIQIGPANQTLPKGSVATLPCRATGNPSPRIK 477 

Qy 129 WLKEGFTFPGRDPRATIQEQGTLQIKNLRISDTGTYTCVATSSSGEASWSAVLDVTESGA 188

I :| : I -\ ■ =1:: :|::||:||||| |: II Ml MM: 
Db 478 WFHDGHAVQAGN-RYSIIQGSSLRVDDLQLSDSGTYTCTASGERGETSWAATLTVEKPGS 536 

Qy 189 T - ISKNYDLSDLPGPPSKPQVTDVTKNSVTLSW QPGTPGTLPASAYI IEAFSQSV 242 

I : : II III hi :|:: Ml I :|| I I I :| II : 
Db 537 TSLHRAADPSTYPAPPGTPKVLNVSRTSISLRWAKSQEKPGAVG--PIIGYTVEYFSPDL 594 

Qy 243 SNSWQTVANHVKTTLYTVRGLRPNTIYLFMVRAINPK-VSV 282

I I: I I I: III I 1:1:111 I : :|| 
Db 595 QTGWIVAAHRVGDTQVTISGLTPGTSYVFLVRAENTQGISV 635 



RESULT 9 
Q9W213 

ID Q9W213 PRELIMINARY; PRT; 1395 AA. 

AC Q9W213; 

DT 01-MAY-2000 (TrEMBLrel. 13, Created)

DT 01-MAY-2000 (TrEMBLrel. 13, Last sequence update)
DT 01-OCT-2000 (TrEMBLrel. 15, Last annotation update)
DE ROBO PROTEIN.

GN ROBO. 

OS Drosophila melanogaster (Fruit fly). 

OC Eukaryota; Metazoa; Arthropoda; Tracheata; Hexapoda; Insecta; 

OC Pterygota; Neoptera; Endopterygota; Diptera; Brachycera; Muscomorpha;

OC Ephydroidea; Drosophilidae; Drosophila.

OX NCBI_TaxID=7227;

RN [1]

RP SEQUENCE FROM N.A.

RC STRAIN=BERKELEY;

RX MEDLINE=20196006; PubMed=10731132;

RA Adams M.D., Celniker S.E., Holt R.A., Evans C.A., Gocayne J.D.,
RA Amanatides P.G., Scherer S.E., Li P.W., Hoskins R.A., Galle R.F.,
RA George R.A., Lewis S.E., Richards S., Ashburner M., Henderson S.N.,
RA Sutton G.G., Wortman J.R., Yandell M.D., Zhang Q., Chen L.X.,
RA Brandon R.C., Rogers Y.-H.C., Blazej R.G., Champe M., Pfeiffer B.D.,
RA Wan K.H., Doyle C., Baxter E.G., Helt G., Nelson C.R., Miklos G.L.G.,
RA Abril J.F., Agbayani A., An H.-J., Andrews-Pfannkoch C., Baldwin D.,
RA Ballew R.M., Basu A., Baxendale J., Bayraktaroglu L., Beasley E.M.,
RA Beeson K.Y., Benos P.V., Berman B.P., Bhandari D., Bolshakov S.,
RA Borkova D., Botchan M.R., Bouck J., Brokstein P., Brottier P.,
RA Burtis K.C., Busam D.A., Butler H., Cadieu E., Center A., Chandra I.,
RA Cherry J.M., Cawley S., Dahlke C., Davenport L.B., Davies P.,
RA de Pablos B., Delcher A., Deng Z., Mays A.D., Dew I., Dietz S.M.,
RA Dodson K., Doup L.E., Downes M., Dugan-Rocha S., Dunkov B.C., Dunn P.,
RA Durbin K.J., Evangelista C.C., Ferraz C., Ferriera S., Fleischmann W.,
RA Fosler C., Gabrielian A.E., Garg N.S., Gelbart W.M., Glasser K.,
RA Glodek A., Gong F., Gorrell J.H., Gu Z., Guan P., Harris M.,
RA Harris N.L., Harvey D., Heiman T.J., Hernandez J.R., Houck J.,
RA Hostin D., Houston K.A., Howland T.J., Wei M.-H., Ibegwam C.,
RA Jalali M., Kalush F., Karpen G.H., Ke Z., Kennison J.A., Ketchum K.A.,
RA Kimmel B.E., Kodira C.D., Kraft C., Kravitz S., Kulp D., Lai Z.,
RA Lasko P., Lei Y., Levitsky A.A., Li J., Li Z., Liang Y., Lin X.,
RA Liu X., Mattei B., McIntosh T.C., McLeod M.P., McPherson D.,
RA Merkulov G., Milshina N.V., Mobarry C., Morris J., Moshrefi A.,
RA Mount S.M., Moy M., Murphy B., Murphy L., Muzny D.M., Nelson D.L.,
RA Nelson D.R., Nelson K.A., Nixon K., Nusskern D.R., Pacleb J.M.,
RA Palazzolo M., Pittman G.S., Pan S., Pollard J., Puri V., Reese M.G.,
RA Reinert K., Remington K., Saunders R.D.C., Scheeler F., Shen H.,
RA Shue B.C., Siden-Kiamos I., Simpson M., Skupski M.P., Smith T.,
RA Spier E., Spradling A.C., Stapleton M., Strong R., Sun E.,
RA Svirskas R., Tector C., Turner R., Venter E., Wang A.H., Wang X.,
RA Wang Z.-Y., Wassarman D.A., Weinstock G.M., Weissenbach J.,
RA Williams S.M., Woodage T., Worley K.C., Wu D., Yang S., Yao Q.A.,
RA Ye J., Yeh R.-F., Zaveri J.S., Zhan M., Zhang G., Zhao Q., Zheng L.,
RA Zheng X.H., Zhong F.N., Zhong W., Zhou X., Zhu S., Zhu X., Smith H.O.,
RA Gibbs R.A., Myers E.W., Rubin G.M., Venter J.C.;

RT "The genome sequence of Drosophila melanogaster."; 

RL Science 287:2185-2195(2000). 

DR EMBL; AE003458; AAF46887.1; -. 



DR HSSP; P56276; 1TLK. 

DR FLYBASE; FBgn0005631; robo. 

DR INTERPRO; IPR001777; -. 

DR INTERPRO; IPR003006; -. 

DR PFAM; PF00041; fn3; 3. 

DR PFAM; PF00047; ig; 5. 

DR PRINTS; PR00014; FNTYPEIII. 

SQ SEQUENCE 1395 AA; 151759 MW; 25CED7DEB44F13F0 CRC64; 



Query Match 23.9%; Score 545; DB 5; Length 1395; 

Best Local Similarity 40.9%; Pred. No. 1.1e-32;



Matches

Qy 9
Db 362

Qy 69
Db 419

Qy 129
Db 478

Qy 189
Db 537

Qy 243
Db 595



I II I 1 MM Ml |:||| 



I I: II II:::: I III: 



! I :| I ' 



I :| 



|:|: I MINI lllllll I I |:|||:| I I 



I =1 : :| = : MMIMIII |: II Ml II M = 



II' I II |:l M: |::| I :|| I I I :| II : 



I MM t : 1 1 I I IMIII I 



RESULT 10 
Q9VQ10 

ID Q9VQ10 PRELIMINARY; PRT; 823 AA. 

AC Q9VQ10; 

DT 01-MAY-2000 (TrEMBLrel. 13, Created)

DT 01-MAY-2000 (TrEMBLrel. 13, Last sequence update)

DT 01-OCT-2000 (TrEMBLrel. 15, Last annotation update)

DE CG5481 PROTEIN.

GN CG5481.

OS Drosophila melanogaster (Fruit fly).

OC Eukaryota; Metazoa; Arthropoda; Tracheata; Hexapoda; Insecta;

OC Pterygota; Neoptera; Endopterygota; Diptera; Brachycera; Muscomorpha;

OC Ephydroidea; Drosophilidae; Drosophila.

OX NCBI_TaxID=7227;

RN [1]

RP SEQUENCE FROM N.A.

RC STRAIN=BERKELEY;

RX MEDLINE=20196006; PubMed=10731132;

RA Adams M.D., Celniker S.E., Holt R.A., Evans C.A., Gocayne J.D.,
RA Amanatides P.G., Scherer S.E., Li P.W., Hoskins R.A., Galle R.F.,
RA George R.A., Lewis S.E., Richards S., Ashburner M., Henderson S.N.,
RA Sutton G.G., Wortman J.R., Yandell M.D., Zhang Q., Chen L.X.,
RA Brandon R.C., Rogers Y.-H.C., Blazej R.G., Champe M., Pfeiffer B.D.,
RA Wan K.H., Doyle C., Baxter E.G., Helt G., Nelson C.R., Miklos G.L.G.,
RA Abril J.F., Agbayani A., An H.-J., Andrews-Pfannkoch C., Baldwin D.,
RA Ballew R.M., Basu A., Baxendale J., Bayraktaroglu L., Beasley E.M.,
RA Beeson K.Y., Benos P.V., Berman B.P., Bhandari D., Bolshakov S.,
RA Borkova D., Botchan M.R., Bouck J., Brokstein P., Brottier P.,
RA Burtis K.C., Busam D.A., Butler H., Cadieu E., Center A., Chandra I.,
RA Cherry J.M., Cawley S., Dahlke C., Davenport L.B., Davies P.,
RA de Pablos B., Delcher A., Deng Z., Mays A.D., Dew I., Dietz S.M.,
RA Dodson K., Doup L.E., Downes M., Dugan-Rocha S., Dunkov B.C., Dunn P.,
RA Durbin K.J., Evangelista C.C., Ferraz C., Ferriera S., Fleischmann W.,
RA Fosler C., Gabrielian A.E., Garg N.S., Gelbart W.M., Glasser K.,
RA Glodek A., Gong F., Gorrell J.H., Gu Z., Guan P., Harris M.,
RA Harris N.L., Harvey D., Heiman T.J., Hernandez J.R., Houck J.,
RA Hostin D., Houston K.A., Howland T.J., Wei M.-H., Ibegwam C.,
RA Jalali M., Kalush F., Karpen G.H., Ke Z., Kennison J.A., Ketchum K.A.,
RA Kimmel B.E., Kodira C.D., Kraft C., Kravitz S., Kulp D., Lai Z.,
RA Lasko P., Lei Y., Levitsky A.A., Li J., Li Z., Liang Y., Lin X.,
RA Liu X., Mattei B., McIntosh T.C., McLeod M.P., McPherson D.,
RA Merkulov G., Milshina N.V., Mobarry C., Morris J., Moshrefi A.,
RA Mount S.M., Moy M., Murphy B., Murphy L., Muzny D.M., Nelson D.L.,
RA Nelson D.R., Nelson K.A., Nixon K., Nusskern D.R., Pacleb J.M.,
RA Palazzolo M., Pittman G.S., Pan S., Pollard J., Puri V., Reese M.G.,
RA Reinert K., Remington K., Saunders R.D.C., Scheeler F., Shen H.,
RA Shue B.C., Siden-Kiamos I., Simpson M., Skupski M.P., Smith T.,
RA Spier E., Spradling A.C., Stapleton M., Strong R., Sun E.,
RA Svirskas R., Tector C., Turner R., Venter E., Wang A.H., Wang X.,
RA Wang Z.-Y., Wassarman D.A., Weinstock G.M., Weissenbach J.,
RA Williams S.M., Woodage T., Worley K.C., Wu D., Yang S., Yao Q.A.,
RA Ye J., Yeh R.-F., Zaveri J.S., Zhan M., Zhang G., Zhao Q., Zheng L.,
RA Zheng X.H., Zhong F.N., Zhong W., Zhou X., Zhu S., Zhu X., Smith H.O.,
RA Gibbs R.A., Myers E.W., Rubin G.M., Venter J.C.;
RT "The genome sequence of Drosophila melanogaster.";
RL Science 287:2185-2195(2000).
DR EMBL; AE003586; AAF51373.1; -.
DR HSSP; P56276; 1TLK.

DR FLYBASE; FBgn0031341; CG5481. 

DR INTERPRO; IPR001412; -. 

DR INTERPRO; IPR001777; -. 

DR INTERPRO; IPR003006; -.

DR PFAM; PF00041; fn3; 1. 

DR PFAM; PF00047; ig; 5. 

DR PRINTS; PR00014; FNTYPEIII. 

DR PROSITE; PS00178; AA_TRNA_LIGASE_I; UNKNOWN_1.

SQ SEQUENCE 823 AA; 89715 MW; 36FC0B91F36F2F19 CRC64; 



Query Match 21.0%; Score 478; DB 5; Length 823;

Best Local Similarity 29.4%; Pred. No. 6.2e-28;
Matches 141; Conservative 66; Mismatches 181; Indels 92; Gaps 18; 
Qy 1 QIVAQGRTVTFPCETKGNPQPAVFWQKEGSQNLLFPNQPQQPNSRCSVSPTGD--LTITN 58 

1:1 I I I I: hl:| ::| II: :|| I : :::| I |:| 
Db 285 QLVEIGDEVLFECQANGHPRPTLYWSVEGNSSLLLPGY-RDGRMEVTLTPEGRSVLSIAR 343 

Qy 59 IQRSDAGYYI-CQALTVAGSILAKAQLEVTDVLTDRPPPIILQGPANQTLAVDGTALLKC 117 
I 1:1 : I II II: :: : I I : lllll III Ml I :| I 
Db 344 FAREDSGKWTCNALNAVGSVSSRTWSV-DTQFELPPPIIEQGPVNQTLPVKSIWLPC 402

Qy 118 KATGDPLPVISWLKEGFTFPGRD-PRATIQEQGTLQIKNL-RISDTGTYTCVATSSSGEA 175

: I hi :H :| :: I : : I II :| I I I lllll:: :|:: 
Db 403 RTLGTPVPQVSWYLDGIPIDVQEHERRNLSDAGALTISDLQRHEDEGLYTCVASNRNGKS 462

Qy 176 SWSAVLDV---TESGATISKNYDLSDLPGPPSKPQVTDVTKNSVTLSW-QPGTPGTLPAS 231

1111:1 : HI llll III: : :|llllll : I 
Db 463 SWSGYLRLDTPTNPNIKFFRAPELSTYPGPPGKPQMVEKGENSVTLSWTRSNKVGGSSLV 522

Qy 232 AYIIEAFSQSVSNSWQTVANHVKTTLYTVRGLRPNTIYLFMVRAINPKVSVTQXKPQKNN 291

\ 1:11 I :: :: I I |: I :| II I I |::|| I 
Db 523 GYVIEMFGKNETDGWVAVGTRVQNTTFTQTGLLPGVNYFFLIRAENSH 570

292 GSTWANVPLPPPPVQPLP-GT ELEHYAVEQQENGYDSDSWCPPLPVQ 337 

: II I :|: II III llll 
571 GLSLPSPMSEPITVGTVSSENHSFTLMFPSLIHY FDSLFHP---Q 612 

338 TYLHQGLEDELEEDDDRVPTPPV RGVASSPAISFGQQSTATLT — PSPREEMQ 388 

I : II :| I : : I II:: I II :l 
613 RYFNSGL--DLSEARASLLSGDWELSNASWDSTSMKLTWQVCNRLTDGSIAAPHSIAH 670 

389 PMLQASPXFT SSQRPRP TSPFSTDSNTSAALSQS 422 

III : I I I hi -III: I I 

671 RHLIRSASFLMQIINGKYVEGFYVYARQLPNPIVNNPAPVTSNTNPLLGSTSTSASASAS 730 



RESULT 11 
Q9VPZ6 

ID Q9VPZ6 PRELIMINARY; PRT; 859 AA. 
AC Q9VPZ6; 

DT 01-MAY-2000 (TrEMBLrel. 13, Created) 



DT 01-MAY-2000 (TrEMBLrel. 13, Last sequence update)

DT 01-OCT-2000 (TrEMBLrel. 15, Last annotation update)

DE CG5423 PROTEIN (FRAGMENT).

GN CG5423.

OS Drosophila melanogaster (Fruit fly).

OC Eukaryota; Metazoa; Arthropoda; Tracheata; Hexapoda; Insecta;

OC Pterygota; Neoptera; Endopterygota; Diptera; Brachycera; Muscomorpha;

OC Ephydroidea; Drosophilidae; Drosophila.

OX NCBI_TaxID=7227;

RN [1]

RP SEQUENCE FROM N.A.

RC STRAIN=BERKELEY;

RX MEDLINE=20196006; PubMed=10731132;

RA Adams M.D., Celniker S.E., Holt R.A., Evans C.A., Gocayne J.D.,
RA Amanatides P.G., Scherer S.E., Li P.W., Hoskins R.A., Galle R.F.,
RA George R.A., Lewis S.E., Richards S., Ashburner M., Henderson S.N.,
RA Sutton G.G., Wortman J.R., Yandell M.D., Zhang Q., Chen L.X.,
RA Brandon R.C., Rogers Y.-H.C., Blazej R.G., Champe M., Pfeiffer B.D.,
RA Wan K.H., Doyle C., Baxter E.G., Helt G., Nelson C.R., Miklos G.L.G.,
RA Abril J.F., Agbayani A., An H.-J., Andrews-Pfannkoch C., Baldwin D.,
RA Ballew R.M., Basu A., Baxendale J., Bayraktaroglu L., Beasley E.M.,
RA Beeson K.Y., Benos P.V., Berman B.P., Bhandari D., Bolshakov S.,
RA Borkova D., Botchan M.R., Bouck J., Brokstein P., Brottier P.,
RA Burtis K.C., Busam D.A., Butler H., Cadieu E., Center A., Chandra I.,
RA Cherry J.M., Cawley S., Dahlke C., Davenport L.B., Davies P.,
RA de Pablos B., Delcher A., Deng Z., Mays A.D., Dew I., Dietz S.M.,
RA Dodson K., Doup L.E., Downes M., Dugan-Rocha S., Dunkov B.C., Dunn P.,
RA Durbin K.J., Evangelista C.C., Ferraz C., Ferriera S., Fleischmann W.,
RA Fosler C., Gabrielian A.E., Garg N.S., Gelbart W.M., Glasser K.,
RA Glodek A., Gong F., Gorrell J.H., Gu Z., Guan P., Harris M.,
RA Harris N.L., Harvey D., Heiman T.J., Hernandez J.R., Houck J.,
RA Hostin D., Houston K.A., Howland T.J., Wei M.-H., Ibegwam C.,
RA Jalali M., Kalush F., Karpen G.H., Ke Z., Kennison J.A., Ketchum K.A.,
RA Kimmel B.E., Kodira C.D., Kraft C., Kravitz S., Kulp D., Lai Z.,
RA Lasko P., Lei Y., Levitsky A.A., Li J., Li Z., Liang Y., Lin X.,
RA Liu X., Mattei B., McIntosh T.C., McLeod M.P., McPherson D.,
RA Merkulov G., Milshina N.V., Mobarry C., Morris J., Moshrefi A.,
RA Mount S.M., Moy M., Murphy B., Murphy L., Muzny D.M., Nelson D.L.,
RA Nelson D.R., Nelson K.A., Nixon K., Nusskern D.R., Pacleb J.M.,
RA Palazzolo M., Pittman G.S., Pan S., Pollard J., Puri V., Reese M.G.,
RA Reinert K., Remington K., Saunders R.D.C., Scheeler F., Shen H.,
RA Shue B.C., Siden-Kiamos I., Simpson M., Skupski M.P., Smith T.,
RA Spier E., Spradling A.C., Stapleton M., Strong R., Sun E.,
RA Svirskas R., Tector C., Turner R., Venter E., Wang A.H., Wang X.,
RA Wang Z.-Y., Wassarman D.A., Weinstock G.M., Weissenbach J.,
RA Williams S.M., Woodage T., Worley K.C., Wu D., Yang S., Yao Q.A.,
RA Ye J., Yeh R.-F., Zaveri J.S., Zhan M., Zhang G., Zhao Q., Zheng L.,
RA Zheng X.H., Zhong F.N., Zhong W., Zhou X., Zhu S., Zhu X., Smith H.O.,
RA Gibbs R.A., Myers E.W., Rubin G.M., Venter J.C.;

RT "The genome sequence of Drosophila melanogaster."; 

RL Science 287:2185-2195(2000). 

DR EMBL; AE003586; AAF51388.1; -.

DR HSSP; P56276; 1TLK.

DR FLYBASE; FBgn0031328; CG5423.

DR INTERPRO; IPR001777; -.
DR INTERPRO; IPR003006; -.
DR PFAM; PF00041; fn3; 3.
DR PFAM; PF00047; ig; 5.
FT NON_TER 1 1
SEQUENCE 859 AA; 9! 



;916 MW; 5CFD69D984101BF8 CRC64; 



Query Match 18.3%; Score 417; DB 5; Length 859;

Best Local Similarity 33.7%; Pred. No. 2.5e-23; 

Matches 96; Conservative 48; Mismatches 127; Indels 14; Gaps 

Qy 3 VAQGRTVTFPCETKGNPQPAVFWQKEGSQNLLFPNQPQQPNSRCSVSPTGD----LTITN 58

II :l |||:| :|| : : |;|| 'I I : ||;| 

Db 269 VELGADTSFECRAIGNPKPTIFWTIKNNSTLIFPGAP--PLDRFHSLNTEEGHSILTLTR 326 

Qy 59 IQRSDAGYYI-CQALTVAGSILAKAQLEVTDVLTDRPPPIILQGPANQTLAVDGTALLKC 117

11:1 l'l I: II :: II : I lllllll: II llll : I 1:1 
Db 327 FQRTDKDLVILCNAMNEVASITSRVQLSL-DSQEDRPPPIIISGPVNQTLPIKSLATLQC 385 
Qy 118
Db 386

Qy 177
Db 445

Qy 233
Db 505



II II I III ::( 



II: 



I I I I :| I I I lllh :!: 



I I :!:: : I :::|: 



DT 



I :| : : I M :|: : ::ll I 1 : 1 : 1 1 1 I 



RESULT 12
O01632

ID O01632 PRELIMINARY; PRT; 874 AA.
AC O01632;

DT 01-JUL-1997 (TrEMBLrel. 04, Created)
DT 01-JUL-1997 (TrEMBLrel. 04, Last sequence update)
DT 01-JUN-2000 (TrEMBLrel. 14, Last annotation update)
DE CODED FOR BY C. ELEGANS CDNA CEESC12R.
GN ZK377.2.
OS Caenorhabditis elegans.
OC Eukaryota; Metazoa; Nematoda; Chromadorea; Rhabditida; Rhabditoidea;
OC Rhabditidae; Peloderinae; Caenorhabditis.
OX NCBI_TaxID=6239;
RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=BRISTOL N2;

RX MEDLINE=94150718; PubMed=7906398;

RA Wilson R., Ainscough R., Anderson K., Baynes C., Berks M.,

RA Bonfield J., Burton J., Connell M., Copsey T., Cooper J., Coulson A.,

RA Craxton M., Dear S., Du Z., Durbin R., Favello A., Fulton L.,

RA Gardner A., Green P., Hawkins T., Hillier L., Jier M., Johnston L.,

RA Jones M., Kershaw J., Kirsten J., Laister N., Latreille P.,
RA Lightning J., Lloyd C., Mcmurray A., Mortimore B., O'Callaghan M.,
RA Parsons J., Percy C., Rifken L., Roopra A., Saunders D., Shownkeen R.,
RA Smaldon N., Smith A., Sonnhammer E., Staden R., Sulston J.,
RA Thierry-Mieg J., Thomas K., Vaudin M., Vaughan K., Waterston R.,
RA Watson A., Weinstock L., Wilkinson-Sproat J., Wohldman P.;
RT "2.2 Mb of contiguous nucleotide sequence from chromosome III of C. 



RA 



f 



RL Nature 368:32-38(1994).
RN [2]
RP SEQUENCE FROM N.A.
RC STRAIN=BRISTOL N2;
RA Nhan M., Hawkins J.;
RL Submitted (FEB-1997) to the EMBL/GenBank/DDBJ databases.
RN [3]
RP SEQUENCE FROM N.A.
RC STRAIN=BRISTOL N2;
RA Waterston R.;
RL Submitted (FEB-1997) to the EMBL/GenBank/DDBJ databases.
RN [4]
RP SEQUENCE FROM N.A.
RC STRAIN=BRISTOL N2;
RA Waterston R.;
RL Submitted (APR-1997) to the EMBL/GenBank/DDBJ databases.
DR EMBL; U88183; AAB52657.1; -.
DR HSSP; P56276; 1TLK.
DR INTERPRO; IPR001777; -.
DR INTERPRO; IPR003006; -.
DR PFAM; PF00041; fn3; 3.
DR PFAM; PF00047; ig; 1.
DR PRINTS; PR00014; FNTYPEIII.
SQ SEQUENCE 874 AA; 95861 MW; BC72270818D734C9 CRC64;



Query Match 17.0%; Score 388; DB 5; Length 874;

Best Local Similarity 38.8%; Pred. No. 3.9e-21;

Matches 83; Conservative 38; Mismatches 81; Indels 12; 



Qy 86
Db 20

Qy 146
Db 80

Qy 204
Db 140

Qy 263
Db 200



VTDVLTDRPPPIILQGPANQTLAVDGTALLKCKATGDPLPVISWLKEGFTFPGRDPRATI 145 
II :lll I I Nil I :|:| I : I : I I I Illl::| I I : 



1:1 I :h III llhl : IhHIl I I : I I : I |: I I 
HSTGSLHIADLKKPDTGVYTCIAKNEDGESTWSASLTVEDHTSNAQFVRMPDPSNFPSSP 139 



:| = :ll I I I II I I : III: =1 



|:|: I:h!ll I 



- -PKVS — VTQXKP 287 

I II II II 



RESULT 13 

Q9U1M1

ID Q9U1M1 PRELIMINARY; PRT; 2221 AA.

AC Q9U1M1; Q9UB11; Q9W5D9;

DT 01-MAY-2000 (TrEMBLrel. 13, Created)

DT 01-OCT-2000 (TrEMBLrel. 15, Last sequence update)

DT 01-OCT-2000 (TrEMBLrel. 15, Last annotation update)

DE SDK PROTEIN (EG:BACR19J1.1 PROTEIN).

GN SDK OR EG:BACR19J1.1.

OS Drosophila melanogaster (Fruit fly).

OC Eukaryota; Metazoa; Arthropoda; Tracheata; Hexapoda; Insecta;

OC Pterygota; Neoptera; Endopterygota; Diptera; Brachycera; Muscomorpha;

OC Ephydroidea; Drosophilidae; Drosophila.

OX NCBI_TaxID=7227;

RN [1]

RP SEQUENCE FROM N.A.

RC STRAIN=OREGON-R;

RA Borkova D., Minana B., Kafatos F.C.; 

RT "Sequencing the distal X chromosome of Drosophila melanogaster."; 

RL Submitted (NOV-1999) to the EMBL/GenBank/DDBJ databases. 

RN [2] 

RP SEQUENCE FROM N.A.

RC STRAIN=OREGON-R;

RA Benos P.;

RL Submitted (DEC-1999) to the EMBL/GenBank/DDBJ databases.

RN [3]

RP SEQUENCE FROM N.A.

RC STRAIN=BERKELEY;

RX MEDLINE=20196006; PubMed=10731132;

RA Adams M.D., Celniker S.E., Holt R.A., Evans C.A., Gocayne J.D.,
RA Amanatides P.G., Scherer S.E., Li P.W., Hoskins R.A., Galle R.F.,
RA George R.A., Lewis S.E., Richards S., Ashburner M., Henderson S.N.,
RA Sutton G.G., Wortman J.R., Yandell M.D., Zhang Q., Chen L.X.,
RA Brandon R.C., Rogers Y.-H.C., Blazej R.G., Champe M., Pfeiffer B.D.,
RA Wan K.H., Doyle C., Baxter E.G., Helt G., Nelson C.R., Miklos G.L.G.,
RA Abril J.F., Agbayani A., An H.-J., Andrews-Pfannkoch C., Baldwin D.,
RA Ballew R.M., Basu A., Baxendale J., Bayraktaroglu L., Beasley E.M.,
RA Beeson K.Y., Benos P.V., Berman B.P., Bhandari D., Bolshakov S.,
RA Borkova D., Botchan M.R., Bouck J., Brokstein P., Brottier P.,
RA Burtis K.C., Busam D.A., Butler H., Cadieu E., Center A., Chandra I.,
RA Cherry J.M., Cawley S., Dahlke C., Davenport L.B., Davies P.,
RA de Pablos B., Delcher A., Deng Z., Mays A.D., Dew I., Dietz S.M.,
RA Dodson K., Doup L.E., Downes M., Dugan-Rocha S., Dunkov B.C., Dunn P.,
RA Durbin K.J., Evangelista C.C., Ferraz C., Ferriera S., Fleischmann W.,
RA Fosler C., Gabrielian A.E., Garg N.S., Gelbart W.M., Glasser K.,
RA Glodek A., Gong F., Gorrell J.H., Gu Z., Guan P., Harris M.,
RA Harris N.L., Harvey D., Heiman T.J., Hernandez J.R., Houck J.,
RA Hostin D., Houston K.A., Howland T.J., Wei M.-H., Ibegwam C.,
RA Jalali M., Kalush F., Karpen G.H., Ke Z., Kennison J.A., Ketchum K.A.,
RA Kimmel B.E., Kodira C.D., Kraft C., Kravitz S., Kulp D., Lai Z.,
RA Lasko P., Lei Y., Levitsky A.A., Li J., Li Z., Liang Y., Lin X.,
RA Liu X., Mattei B., McIntosh T.C., McLeod M.P., McPherson D.,
RA Merkulov G., Milshina N.V., Mobarry C., Morris J., Moshrefi A.,
RA Mount S.M., Moy M., Murphy B., Murphy L., Muzny D.M., Nelson D.L.,
RA Nelson D.R., Nelson K.A., Nixon K., Nusskern D.R., Pacleb J.M.,
RA Palazzolo M., Pittman G.S., Pan S., Pollard J., Puri V., Reese M.G.,
RA Reinert K., Remington K., Saunders R.D.C., Scheeler F., Shen H.,
RA Shue B.C., Siden-Kiamos I., Simpson M., Skupski M.P., Smith T.,
RA Spier E., Spradling A.C., Stapleton M., Strong R., Sun E.,
RA Svirskas R., Tector C., Turner R., Venter E., Wang A.H., Wang X.,
RA Wang Z.-Y., Wassarman D.A., Weinstock G.M., Weissenbach J.,
RA Williams S.M., Woodage T., Worley K.C., Wu D., Yang S., Yao Q.A.,
RA Ye J., Yeh R.-F., Zaveri J.S., Zhan M., Zhang G., Zhao Q., Zheng L.,
RA Zheng X.H., Zhong F.N., Zhong W., Zhou X., Zhu S., Zhu X., Smith H.O.,
RA Gibbs R.A., Myers E.W., Rubin G.M., Venter J.C.;

RT "The genome sequence of Drosophila melanogaster."; 

RL Science 287:2185-2195(2000). 

RN [4] 

RP SEQUENCE OF 1704-2221 FROM N.A. 

RC STRAIN=BERKELEY; 

RA Rubin G.M., Wan K.H., Harvey D., Lewis S.E., Brokstein P., Tsang G.,

RA Agbayani A., Arcaina T.T., Baxter E., Blazej R.G., Butenhoff C.,

RA Champe M., Chavez C., Chew M., Doyle C.M., Farfan D.E., Frise E.,
RA Galle R., George R.A., Harris N.L., Hoskins R.A., Evans-Holm M.,
RA Houston K.A., Hummasti S.R., Kim 1, Li P., Moshrefi M., Pacleb J.M.,
RA Park S., Sequeira A., Sethi H., Snir E., Svirskas R.R., Weinburg T.,

RA Celniker S.E.;

RL Submitted (MAR-1999) to the EMBL/GenBank/DDBJ databases.

DR EMBL; AL132792; CAB65848.1; -.

DR EMBL; AE003418; AAF45541.1; -.

DR EMBL; AF132195; AAD34783.1; -.

DR HSSP; P56276; 1TLK.

DR FLYBASE; FBgn0021764; sdk.

DR INTERPRO; IPR001777; -.

DR INTERPRO; IPR003006; -.

DR PFAM; PF00041; fn3; 13.

DR PFAM; PF00047; ig; 5.

DR PRINTS; PR00014; FNTYPEIII.

KW Polymorphism.

FT VARIANT 51 51 Y -> CYAD (IN STRAIN OREGON-R).

SQ SEQUENCE 2221 AA; 245963 MW; 67451AD6A57F06F0 CRC64; 



Query Match 14.8%; Score 336.5; DB 5; Length 2221;

Best Local Similarity 28.0%; Pred. No. 9.2e-17;

Matches 109; Conservative 56; Mismatches 141; Indels 83; Gaps 16;

6 GRTVTFPCETKGNPQPAVFWQKEGSQNLLFPNQPQQPNSRCSVSPTGDLTITNIQRSDAG 65 
I: If I 1:1 I : I :] I :lh : :|l| |:||: III 
466 GKDATISCRAVGSPNPNITW IlNETQLVDISSRVQILESGDLLISNIRSVDAG 518 

' 66 YYICQALTVAGSIIAKAQLEVTDVITDR^PPIILQGPANQTIjAVDGTALLKCKATGDP-L 124 

III ' 1 1 If :| I I I I ? 1:1 I :1: : II |:|| : II : 
519 LYICVRANEAGSVKGEAYLSVL-VRTQ- - -IIQPPVDTTVLLGLTATLQCKVSSDPSV 572 



Qy 



Qy 125 PV-ISWLKEG--FTFPGRDPRATIQEQGTLQIKNLRISDTGTYTCVATSSSGEASWSAVL 181 

I I I :H I I :| 1*1:1= :l II 'hi II II I : :| I 
Db 573 PYNIDWYREGQSSTPISNSQRIGVQADGQLEIQAVRASDVGSYACVVTSPGGNETRAARL 632 

Qy 182 DVTESGATISKNYDLSDLPGPPSKPQV— TDVTKNSVTLSWQPGTPGTLPASAYII-" 235 

I I II III :l : : I: :|l II I I I :ll 

Db 633 SVIE LPFPPSNVKVERLPEPQQRSINVSWTPGFDGNSPISKFIIQRR 679 

Qy 236 EAFSQSVSN- - - SWQTVANHVKT • -TLYTVRGLRPNTIYLFMVRAINPKVSVTQX 285 

I I I : :| I ::| : h hi I I hi I : 

Db 680 EVSELERFVGPVPDPLLNWITELSNVSADQRWILLENLKAATVYQFRVSAVN— RVGEG 736 

Qy 286 KPQKNNGSTWANVPL PP PPVQPLPGTELEHYAVEQQE 322 

I : I :| II II:: :: | : : 

Db 737 SP - - SEPSNWELPQEAPSGPPVGFVGSARSMSEI ITQWQPPLEEHRNGQILGYILRYRL 794 

Qy 323 NGYDSDSWCPPLPVQTYLHQGLEDELEED 351 

II:: I :| : :| : : 

Db 795 FGYNNVPWS YQNITNEAQRN 814 



RESULT 14 
O97394

ID O97394 PRELIMINARY; PRT; 2222 AA.

AC O97394;

DT 01-MAY-1999 (TrEMBLrel. 10, Created) 

DT 01-MAY-1999 (TrEMBLrel. 10, Last sequence update) 

DT 01-OCT-2000 (TrEMBLrel. 15, Last annotation update) 

DE SIDEKICK PROTEIN.

GN SDK. 

OS Drosophila melanogaster (Fruit fly) . 

OC Eukaryota; Metazoa; Arthropoda; Tracheata; Hexapoda; Insecta; 

OC Pterygota; Neoptera; Endopterygota; Diptera; Brachycera; Muscomorpha; 

OC Ephydroidea; Drosophilidae; Drosophila. 

OX NCBI_TaxID=7227; 

RN [1] 

RP SEQUENCE FROM N.A. 

RA Nguyen D.N.T., Liu Y., Litsky M.L., Reinke R.; 

RT "Sidekick, a member of the immunoglobulin superfamily, is required for 

RT pattern formation in the Drosophila eye."; 

RL Submitted (FEB-1997) to the EMBL/GenBank/DDBJ databases. 

DR EMBL; U88578; AAD09632.1; -. 

DR HSSP; P56276; 1TLK. 

DR FLYBASE; FBgn0021764; sdk. 

DR INTERPRO; IPR001777; -. 

DR INTERPRO; IPR003006; -. 

DR PFAM; PF00041; fn3; 13. 

DR PFAM; PF00047; ig; 5.

DR PRINTS; PR00014; FNTYPEIII.

SQ SEQUENCE 2222 AA; 246174 MW; 18853CCAF98D3BC2 CRC64;



Query Match 14.5%; Score 331.5; DB 5; Length 2222; 

Best Local Similarity 27.3%; Pred. No. 2.2e-16; 

Matches 107; Conservative 58; Mismatches 138; Indels 89; Gaps 15; 

Qy 6 GRTVTFPCETKGNPQPAVFWQKEGSQNLLFPNQPQQPNSRCSVSPTGDLTITNIQRSDAG 65 

I: I I hi I : I : I :M : =111 hlh II 
Db 469 GKDATISCRAVGSPNPNITW IYNETQLVDISSRVQILESGDLLISNIRSVDAP 521 

Qy 66 YYICQALTVAGSILAKAQLEVTDVLTDRPPPIILQGPANQTLAVDGTALLKCKATGDP-L 124 

III III: hi I I II I : I I : I : : II hll : II : 
Db 522 LYICVRANEAGSVKAEAYLSVL-VRTQ IIQPPVDTTVLLGLTATLQCKVSSDPSV 575 

Qy 125 PV-ISWLKEG-FTFPGRDPRATIQEQGTLQIKNLRISDTGTYTCVATSSSGEASWSAVL 181 

I I I :T I 1:11 hh :l II hi II II I : :| I 
Db 576 PYNIDWYREGQSSTPISNSQRIGVQADGQLEIQAVRASDVGSYACWTSPGGNETRAARL 635 

Qy 182 DVTESGATISKNYDLSDLPGPPSKPQV— TDVTKNSVTLSWQPGTPGTLPASAYII— 235 

II II III :| : : h =11 II I I I :|l 

Db 636 SVIE - LPFPPSNVKVERLPEPQQASINVSWTPGFDGNSPISKFIIQRR 682 

Qy 236 EAFSQSVSN- - -SWQTVANHVKT- -TLYTVRGLRPNT I YLFMVRAIN 277 

I I 'I : :| I ::| : h hi I I hi 

Db 683 EVSELEKFVGPVPDPLLNWITELSNVSADQRWILLENLKAATVYQFRVSAVNRVGEGSPS 742 

Qy 278 PKVSVTQXKPQKNNGSTWANVPLPP PPVQPLPGTELEHYAVE 319 

:| : ; : ::| II lh: :: I : 

Db 743 EPSNWELPQEAHSG PPVGFVGSARSMSEIITQWQPPLEEHRNGQILGYILR 794 

Qy 320 QQENGYDSDSWCPPLPVQTYLHQGLEDELEED 351 
: lh: I :| : :| : : ' 

Db 795 YRLFGYNNVPWS YQNITNEAQRN 817 



RESULT 15
Q9IAJ0

ID Q9IAJ0 PRELIMINARY; PRT; 1788 AA.
AC Q9IAJ0;
DT 01-OCT-2000 (TrEMBLrel. 15, Created)
DT 01-OCT-2000 (TrEMBLrel. 15, Last sequence update)
DT 01-OCT-2000 (TrEMBLrel. 15, Last annotation update)
DE RECEPTOR PROTEIN TYROSINE PHOSPHATASE LAR.
GN XPTP-LAR.
OS Xenopus laevis (African clawed frog).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Amphibia; Batrachia; Anura; Mesobatrachia; Pipoidea; Pipidae;
OC Xenopodinae; Xenopus.
OX NCBI_TaxID=8355;
RN [1]
RP SEQUENCE FROM N.A.
RX MEDLINE=20193505; PubMed=10727868;
RA Johnson K.G., Holt C.E.;
RT "Expression of CRYP-alpha, lar, PTP-delta, and PTP-rho in the
RT developing xenopus visual system.";
RL Mech. Dev. 92:291-294(2000).
DR EMBL; AF197945; AAF43606.1; -.
SQ SEQUENCE 1788 AA; 200270 MW; AB192549866D9067 CRC64;



Query Match 14.5%; Score 330; DB 13; Length 1788;

Best Local Similarity 29.8%; Pred. No. 2.2e-16;

Matches 108; Conservative 47; Mismatches 154; Indels 54; Gaps 15; 

Qy 1 QIVAQGRTVTFPCETKGNPQPAVFWQKEGSQNLLFPNQPQQPNSRCSVSPTGDLTITNIQ 60 

::| : II I I III I : I I: | || :| | | | : 

Db 141 KWEKTRTATMLCAASGNPDPEITWFKD FLPVDTASSNGRIKQLRSGALQIENSE 195 

Qy 61 RSDAGYYICQALTVAGS-ILAKAQLEVTDVLTDRPPPIILQGPANQTLAVDGTALLKCKA 119
II II I I II: I I I I II h! : I: I I I 
Db 196 ESDQGKYECVATNSAGTRYSAPANLYVR- - -VRRVAPRFSIPPSNHEVMPGGSVNLTCVA 252

Qy 120 TGDPLPVISWLKEGFTFPGRDPRATIQEQGTLQIKNLRISDTGTYTCVATSSSGEASWSA 179 

I 1:1 : I: I :: : I I:: I I |: lllll II I I 
Db 253 VGAPMPYVKWM' AGLEELTKEDEMPVGRNG -LELTN- - IKDSANYTCVAISSLGMI • ■ EA 306 

Qy 180 VLDVTESGATISKNYDLSDLPGPPSKPQVTDVTKNSVTLSWQPGTPGTLPASAYIIEAFS 239 

I :| : II II II: I llll:l II II hi: 

Db 307 VAQIT VKALPKPPLDAMVTETTATSVTLTWDSGNPD—PVSYYVIQYKP 353 

Qy 240 QSVSNSWQTVANHVKTTLYTVRGLRPNT I YLFMVRAIN PKVSVTQXKPQKNNGST 294 

:: :|:| I : I 1 1 I : : 1 1 I : I I : I : I I I : : : |: 
Db 354 KASESSFQEV-DGVATTRYSIGGLSPFSEYEFRIIAVNNIGRGPPSEVIEAQTGEQAPSS 412 

Qy 295 WANVPLPPPPVQP- ■ -LPGTELEHYAVEQQENG YDSDSWCPPLPVQTYLHQG 343 

Mil I I : : :: II I :| I II : 

Db 413 PPLKVQARMLSASTMLVQWDLPEEPNGQIRGFRVYYTTD- • -PHLPFSMWQKHN 463 



Db 464 WD 466 



Search completed: January 22, 2001, 12:54:12 
Job time: 2053 sec 
GenCore version 4.5
Copyright (c) 1993 - 2000 Compugen Ltd.



OM protein - protein search, using sw model 
Run on: January 22, 2001, 12:19:47 ; 



Search time 233.01 Seconds

(without alignments) 

21.719 Million cell updates/sec 



Title: US-09-540-245A-20
Perfect score: 761 

Sequence: 1 AQAVAAAAEYAGLKVARRQM REALDGRQVTDLRTNPSDPR 148 

Scoring table: BLOSUM62

Gapop 10.0, Gapext 0.5



Searched: 268485 seqs, 34193795 residues

Total number of hits satisfying chosen parameters: 268485

Minimum DB seq length: 0
Maximum DB seq length: 2000000000



Post-processing: Minimum Match 0% 
Maximum Match 100% 
Listing first 45 summaries 

Database : A_Geneseq_36:*

1: /SIDS1/gcgdata/geneseq/geneseqp/AA1980.DAT:*
2: /SIDS1/gcgdata/geneseq/geneseqp/AA1981.DAT:*
3: /SIDS1/gcgdata/geneseq/geneseqp/AA1982.DAT:*
4: /SIDS1/gcgdata/geneseq/geneseqp/AA1983.DAT:*
5: /SIDS1/gcgdata/geneseq/geneseqp/AA1984.DAT:*
6: /SIDS1/gcgdata/geneseq/geneseqp/AA1985.DAT:*
7: /SIDS1/gcgdata/geneseq/geneseqp/AA1986.DAT:*
8: /SIDS1/gcgdata/geneseq/geneseqp/AA1987.DAT:*
9: /SIDS1/gcgdata/geneseq/geneseqp/AA1988.DAT:*
10: /SIDS1/gcgdata/geneseq/geneseqp/AA1989.DAT:*
11: /SIDS1/gcgdata/geneseq/geneseqp/AA1990.DAT:*
12: /SIDS1/gcgdata/geneseq/geneseqp/AA1991.DAT:*
13: /SIDS1/gcgdata/geneseq/geneseqp/AA1992.DAT:*
14: /SIDS1/gcgdata/geneseq/geneseqp/AA1993.DAT:*
15: /SIDS1/gcgdata/geneseq/geneseqp/AA1994.DAT:*
16: /SIDS1/gcgdata/geneseq/geneseqp/AA1995.DAT:*
17: /SIDS1/gcgdata/geneseq/geneseqp/AA1996.DAT:*
18: /SIDS1/gcgdata/geneseq/geneseqp/AA1997.DAT:*
19: /SIDS1/gcgdata/geneseq/geneseqp/AA1998.DAT:*
20: /SIDS1/gcgdata/geneseq/geneseqp/AA1999.DAT:*
21: /SIDS1/gcgdata/geneseq/geneseqp/AA2000.DAT:*

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
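
The throughput figure in the run header can be cross-checked against the other header values. The sketch below assumes that one "cell update" is one cell of the Smith-Waterman dynamic-programming matrix, so the total work is the query length times the number of database residues. That interpretation and the function name are assumptions, not taken from GenCore documentation.

```python
def cell_updates_per_sec(query_len: int, db_residues: int, seconds: float) -> float:
    """Matrix cells filled per second, assuming one cell per
    (query residue, database residue) pair. Hypothetical helper."""
    return query_len * db_residues / seconds


# Values as printed in the header of this run:
# query length 148, 34193795 residues searched, search time 233.01 seconds.
rate = cell_updates_per_sec(query_len=148, db_residues=34_193_795, seconds=233.01)
print(f"{rate / 1e6:.3f} Million cell updates/sec")
```

Under that assumption the result agrees with the 21.719 Million cell updates/sec printed in the header.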



Result         Query
   No.  Score  Match Length DB ID      Description

     1    761  100.0    148 20 Y13568  Mouse Robo 1 polyp
     2    761  100.0    148 20 Y08406  Mouse partial ROBO
     3    691   90.8   1649 20 Y08404  Human ROBO1 protei
     4    691   90.8   1651 20 Y13566  Human Robo 1 polyp
     5    108   14.2    434 20 Y13567  Human Robo 2 polyp
     6    108   14.2    434 20 Y08405  Human partial ROBO
     7    101   13.3    515 19 W72076  HSV-2 strain SB5 C
     8   99.5   13.1    308 13 R22248  Sequence of rye-gr
     9   97.5   12.8    308 20 Y25601  Lolium sp. allerge
    10   97.5   12.8    414 19 W72159  HSV-2 strain SB5 C
    11   97.5   12.8    414 19 W72139  HSV-2 strain SB5 C
    12   88.5   11.6    821 14 R35451  Mouse eps8. Mus m



    13     88   11.6    980  ? ?       SAPAP2 protein. H
    14     88   11.6    980  ? W69743  SAPAP1 protein. H
    15      ?      ?      ?  ? ?       Calcium channel al
    16     84   11.0    541  ? ?       Mammalian Ena (Men
    17     83   10.9   1319  ? ?       Mammalian son of s
    18     83   10.9   1336  ? R84638  mSOS1 protein. Mu
    19     82   10.8    903  ? ?       Zebrafish differen
    20   81.5   10.7    449  ? ?       Amino acid sequenc
    21      ?      ?      ?  ? ?       Amino acid sequenc
    22      ?      ?      ?  ? W44004  Human TPC2 telomer
    23     81   10.6   1105  ? ?       Human TPC2 protein
    24   80.5   10.6    197 21 Y75526  Neisseria meningit
    25   80.5   10.6    224  ? ?       Chlamydia pneumoni
    26   80.5   10.6      ?  ? ?       Human ETS2 repress
    27     80   10.5    127  ? ?       Neospora caninum a
    28     80   10.5      ?  ? ?       Soybean beta-carot
    29     80   10.5    386  ? ?       Human normal uteru
    30   79.5   10.4    384  ? ?       Bordetella pertuss
    31      ?      ?      ?  ? ?       Human TIE ligand N
    32   79.5   10.4    493  ? Y70745  PSEQ-3 protein enc
    33   79.5   10.4    493  ? ?       Human scarface 1 p
    34     79   10.4    118  ? W19831  Plasmid pSP2alpha
    35     79   10.4    256  ? ?       Alternatively spli
    36     79   10.4    276  ? Y00922  Human CLAR1 protei
    37     79   10.4      ?  ? ?       CMV Colburn region


    38   78.5   10.3    350 20 Y35922  Extended human sec
    39   78.5   10.3    543 18 W07702  Mouse ETS2 repress
    40   78.5   10.3    783 19 W37151  Mouse neural Mena+
    41   78.5   10.3    787 19 W37152  Mouse neural Mena+
    42   78.5   10.3    802 19 W37153  Mouse neural Mena+
    43   78.5   10.3   1784 16 R77223  Tuberous sclerosis
    44     78   10.2   1333 21 Y68820  Amino acid sequenc
    45   77.5   10.2    390 20 Y35923  Extended human sec



RESULT 1 
Y13568 

ID Y13568 standard; Protein; 148 AA. 
XX 

AC Y13568; 
XX 

DT 30-JUL-1999 (first entry) 
XX 

DE Mouse Robo 1 polypeptde. 

XX 

KW Comm polypeptide; Robo polypeptide; commissureless; roundabout; 
KW modulation; nerve cell function.
XX

OS Mus sp.
XX

PN WO9925833-A1.
XX

PD 27-MAY-1999.
XX

PF 13-NOV-1998; 98WO-US24327.
XX

PR 14-NOV-1997; 97US-0065543.
XX

PA (REGC ) UNIV CALIFORNIA.
XX 
PI 
XX 

DR WPI; 1999-338009/28. 
DR N-PSDB; X55772. 
XX 

PT Modulation of Robo-Comm polypeptide interactions 
XX 

PS Disclosure; Page 50; 56pp; English.
XX 

CC The invention relates to a method for modulating the amount of Comm 
CC (commissureless) polypeptide in contact with a cell expressing active

CC Robo (roundabout) on its surface. The method comprises modulating the

CC effective amount of Comm polypeptide in contact with the cell, where the

CC amount of expressed active Robo is specifically modulated inversely with

CC the modulation of the effective amount of Comm in contact with the cell.

CC The method is used to modulate the amount of active Robo expressed on a

CC cell. The method can be used to screen for agents that modulate Robo: Comm

CC interactions. This is particularly useful for modulating nerve cell

CC function.
XX 

SQ Sequence 148 AA; 



Query Match 100.0%; Score 761; DB 20; Length 148; 

Best Local Similarity 100.0%; Pred. No. 4.8e-71; 
Matches 148; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy          1 AQAVAAAAEYAGLKVARRQMQDAAGRRHFHASQCPRPTSPVSTDSNMSAVVIQKARPAKK 60
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db          1 aqavaaaaeyaglkvarrqmqdaagrrhfhasqcprptspvstdsnmsavviqkarpakk 60

Qy         61 QKHQPGHLRREAYADDLPPPPVPPPAIKSPTVQSKAQLEVRPVMVPKLASIEARTDRSSD 120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         61 qkhqpghlrreayaddlppppvpppaiksptvqskaqlevrpvmvpklasieartdrssd 120

Qy        121 RKGGSYKGREALDGRQVTDLRTNPSDPR 148
              ||||||||||||||||||||||||||||
Db        121 rkggsykgrealdgrqvtdlrtnpsdpr 148



RESULT 2 
Y08406 

ID Y08406 standard; Protein; 148 AA. 
XX

AC Y08406;
XX

DT 24-JUL-1999 (first entry)
XX

DE Mouse partial ROBO1 protein.
XX

KW ROBO1; ROBO2; roundabout; nerve guidance; human; murine; cell function;

KW cell morphology; screening assay. 

XX 

OS Mus sp. 
XX 

PN WO9920764-A1. 
XX 

PD 29-APR-1999. 
XX 

PF 20-OCT-1998; 98WO-US22164.
XX

PR 14-NOV-1997; 97US-0971172.
PR 20-OCT-1997; 97US-0062921.
XX 

PA (REGC ) UNIV CALIFORNIA. 
XX 

PI Goodman CS, Kidd T, Mitchell KJ, Tear G; 

XX 

DR WPI; 1999-312615/26. 
DR N-PSDB; X57255. 
XX 

PT Robo polypeptides, a new immunoglobulin superfamily member 

XX 

PS Claim 1; Page 74-75; 80pp; English. 
XX 

CC This invention describes novel Robo (roundabout) polypeptides, involved 

CC in nerve guidance which heve been isolated from Drosophila sp., 

CC C. elegans, human and murine samples. The products of the invention can 

CC be used to raise anti-Robo antibodies, which can be used to modulate cell 

CC function or morphology. The Robo polynucleotides and fragments are useful 

CC as probes and primers and for production of the Robo polypeptides. The

CC probes and primers are also useful in screening assays. 

XX 



SQ Sequence 148 AA; 



Query Match 100.0%; Score 761; DB 20; Length 148; 

Best Local Similarity 100.0%; Pred. No. 4.8e-71;

Matches 148; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy          1 AQAVAAAAEYAGLKVARRQMQDAAGRRHFHASQCPRPTSPVSTDSNMSAVVIQKARPAKK 60
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db          1 aqavaaaaeyaglkvarrqmqdaagrrhfhasqcprptspvstdsnmsavviqkarpakk 60

Qy         61 QKHQPGHLRREAYADDLPPPPVPPPAIKSPTVQSKAQLEVRPVMVPKLASIEARTDRSSD 120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         61 qkhqpghlrreayaddlppppvpppaiksptvqskaqlevrpvmvpklasieartdrssd 120

Qy        121 RKGGSYKGREALDGRQVTDLRTNPSDPR 148
              ||||||||||||||||||||||||||||
Db        121 rkggsykgrealdgrqvtdlrtnpsdpr 148



RESULT 3 
Y08404 

ID Y08404 standard; Protein; 1649 AA. 
XX 

AC Y08404;
XX

DT 24-JUL-1999 (first entry)
XX

DE Human ROBO1 protein.

XX

KW ROBO1; ROBO2; roundabout; nerve guidance; human; murine; cell function;

KW cell morphology; screening assay.

XX

OS Homo sapiens.

XX

PN WO9920764-A1.
XX

PD 29-APR-1999.
XX

PF 20-OCT-1998; 98WO-US22164.
XX

PR 14-NOV-1997; 97US-0971172.

PR 20-OCT-1997; 97US-0062921.
XX 

PA (REGC ) UNIV CALIFORNIA. 
XX 

PI Goodman CS, Kidd T, Mitchell KJ, Tear G; 
XX 

DR WPI; 1999-312615/26. 

DR N-PSDB; X08404. 
XX 

PT Robo polypeptides, a new immunoglobulin superfamily member 
XX 

PS Claim 1; Page 65-71; 80pp; English. 
XX 

CC This invention describes novel Robo (roundabout) polypeptides, involved

CC in nerve guidance which heve been isolated from Drosophila sp., 

CC C. elegans, human and murine samples. The products of the invention can 

CC be used to raise anti-Robo antibodies, which can be used to modulate cell 

CC function or morphology. The Robo polynucleotides and fragments are useful 

CC as probes and primers and for production of the Robo polypeptides. The

CC probes and primers are also useful in screening assays. 
XX 

SQ Sequence 1649 AA; 



Query Match 90.8%; Score 691; DB 20; Length 1649; 

Best Local Similarity 88.5%; Pred. No. 1.2e-62;

Matches 131; Conservative 5; Mismatches 12; Indels 0; Gaps 0;

Qy          1 AQAVAAAAEYAGLKVARRQMQDAAGRRHFHASQCPRPTSPVSTDSNMSAVVIQKARPAKK 60
              ||||||||||||||||||||||||||||||||||||||||||||||||| |:|| |||||
Db       1402 aqavaaaaeyaglkvarrqmqdaagrrhfhasqcprptspvstdsnmsaavmqktrpakk 1461

Qy         61 QKHQPGHLRREAYADDLPPPPVPPPAIKSPTVQSKAQLEVRPVMVPKLASIEARTDRSSD 120
               |||||||||| | ||||||||||||||||| ||| |||||||:|||| |::||||||||
Db       1462 lkhqpghlrretytddlppppvpppaiksptaqsktqlevrpvvvpklpsmdartdrssd 1521

Qy        121 RKGGSYKGREALDGRQVTDLRTNPSDPR 148
              ||| |||||| |||||| |:|||| |||
Db       1522 rkgssykgrevldgrqvvdmrtnpgdpr 1549
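
The percentages reported for this hit can be reproduced from the counts printed with it. The sketch below assumes that Query Match is the score as a percentage of the perfect score (761 for this query) and that Best Local Similarity is the number of identical positions as a percentage of all aligned columns (matches, conservative substitutions, mismatches and indels). Both formulas are inferred from the numbers in this listing rather than taken from GenCore documentation, and the function names are illustrative.

```python
def query_match(score: float, perfect_score: float) -> float:
    """Score as a percentage of the perfect (self-match) score, to one decimal."""
    return round(100 * score / perfect_score, 1)


def best_local_similarity(matches: int, conservative: int,
                          mismatches: int, indels: int) -> float:
    """Identical positions as a percentage of all aligned columns, to one decimal."""
    columns = matches + conservative + mismatches + indels
    return round(100 * matches / columns, 1)


# RESULT 3 (Y08404): Score 691 against a perfect score of 761,
# with 131 matches, 5 conservative, 12 mismatches, 0 indels.
print(query_match(691, 761))                  # Query Match for RESULT 3
print(best_local_similarity(131, 5, 12, 0))   # Best Local Similarity for RESULT 3
```

The same two ratios also agree with the figures printed for the gapped hits in this listing, for example RESULT 7 (Score 101; 31 matches, 15 conservative, 38 mismatches, 28 indels).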



RESULT 4 
Y13566 

ID Y13566 standard; Protein; 1651 AA. 

AC Y13566; 

DT 30-JUL-1999 (first entry) 

DE Human Robo 1 polypeptde.

KW Comm polypeptide; Robo polypeptide; commissureless; roundabout;

KW modulation; nerve cell function.

XX

OS Homo sapiens.
XX

PN WO9925833-A1.

PD 27-MAY-1999.
XX

PF 13-NOV-1998; 98WO-US24327.
XX

PR 14-NOV-1997; 97US-0065543.
XX

PA (REGC ) UNIV CALIFORNIA.
XX

PI Goodman C, Kid T, Mitchell KJ, Russell C, Tear G;

XX 

DR WPI; 1999-338008/28. 

DR N-PSDB; X55770. 
XX 

PT Modulation of Robo-Comm polypeptide interactions 
XX 

PS Disclosure; Page 44-48; 56pp; English.
XX

CC The invention relates to a method for modulating the amount of Comm

CC (commissureless) polypeptide in contact with a cell expressing active

CC Robo (roundabout) on its surface. The method comprises modulating the

CC effective amount of Comm polypeptide in contact with the cell, where the

CC amount of expressed active Robo is specifically modulated inversely with

CC the modulation of the effective amount of Comm in contact with the cell.

CC The method is used to modulate the amount of active Robo expressed on a

CC cell. The method can be used to screen for agents that modulate Robo: Comm 

CC interactions. This is particularly useful for modulating nerve cell 

CC function. 
XX 

SQ Sequence 1651 AA; 



Query Match 90.8%; Score 691; DB 20; Length 1651;

Best Local Similarity 88.5%; Pred. No. 1.2e-62;

Matches 131; Conservative 5; Mismatches 12; Indels 0; Gaps 0;

Qy          1 AQAVAAAAEYAGLKVARRQMQDAAGRRHFHASQCPRPTSPVSTDSNMSAVVIQKARPAKK 60
              ||||||||||||||||||||||||||||||||||||||||||||||||| |:|| |||||
Db       1404 aqavaaaaeyaglkvarrqmqdaagrrhfhasqcprptspvstdsnmsaavmqktrpakk 1463

Qy         61 QKHQPGHLRREAYADDLPPPPVPPPAIKSPTVQSKAQLEVRPVMVPKLASIEARTDRSSD 120
               |||||||||| | ||||||||||||||||| ||| |||||||:|||| |::||||||||
Db       1464 lkhqpghlrretytddlppppvpppaiksptaqsktqlevrpvvvpklpsmdartdrssd 1523

Qy        121 RKGGSYKGREALDGRQVTDLRTNPSDPR 148
              ||| |||||| |||||| |:|||| |||
Db       1524 rkgssykgrevldgrqvvdmrtnpgdpr 1551



RESULT 5 
Y13567 

ID Y13567 standard; Protein; 434 AA. 
XX 

AC Y13567; 
XX 

DT 30-JUL-1999 (first entry)
XX 

DE Human Robo 2 polypeptde. 
XX 

KW Comm polypeptide; Robo polypeptide; commissureless; roundabout; 

KW modulation; nerve cell function. 

XX 

OS Homo sapiens.
XX

FH Key Location/Qualifiers

FT Misc-difference 285

FT /label= unknown

FT /note= "encoded by GIN"

FT Misc-difference 396

FT /label= unknown

FT /note= "encoded by ntt"

XX

PN WO9925833-A1.
XX

PD 27-MAY-1999.
XX

PF 13-NOV-1998; 98WO-US24327.
XX

PR 14-NOV-1997; 97US-0065543.
XX 

PA (REGC ) UNIV CALIFORNIA. 
XX 

PI Goodman C, Kid T, Mitchell KJ, Russell C, Tear G; 
XX 

DR WPI; 1999-338008/28. 

DR N-PSDB; X55771. 
XX 

PT Modulation of Robo-Comm polypeptide interactions 
XX 

PS Disclosure; Page 49-50; 56pp; English. 
XX 

CC The invention relates to a method for modulating the amount of Comm 

CC (commissureless) polypeptide in contact with a cell expressing active 

CC Robo (roundabout) on its surface. The method comprises modulating the 

CC effective amount of Comm polypeptide in contact with the cell, where the 

CC amount of expressed active Robo is specifically modulated inversely with 

CC the modulation of the effective amount of Comm in contact with the cell. 

CC The method is used to modulate the amount of active Robo expressed on a 

CC cell. The method can be used to screen for agents that modulate Robo: Comm 

CC interactions. This is particularly useful for modulating nerve cell 

CC function. 

XX 

SQ Sequence 434 AA; 



Query Match 14.2%; Score 108; DB 20; Length 434;

Best Local Similarity 57.9%; Pred. No. 0.0023;

Matches 22; Conservative 4; Mismatches 12; Indels 0; Gaps 0;

Qy         29 FHASQCPRPTSPVSTDSNMSAVVIQKARPAKKQKHQPG 66
              | :|| |||||| ||||| || : |  ||   :||: |
Db        397 ftssqrprptspfstdsntsaalsqsqrprptkkhkgg 434



RESULT 6 
Y08405 

ID Y08405 standard; Protein; 434 AA.
XX
AC Y08405;
XX
DT 24-JUL-1999 (first entry)
XX
DE Human partial ROBO2 protein.
XX
KW ROBO1; ROBO2; roundabout; nerve guidance; human; murine; cell function;
KW cell morphology; screening assay.
XX
OS Homo sapiens.
XX
PN WO9920764-A1.
XX
PD 29-APR-1999.
XX
PF 20-OCT-1998; 98WO-US22164.
XX
PR 14-NOV-1997; 97US-0971172.
PR 20-OCT-1997; 97US-0062921.
XX
PA (REGC ) UNIV CALIFORNIA.
XX
PI Goodman CS, Kidd T, Mitchell KJ, Tear G;
XX
DR WPI; 1999-312615/26.
DR N-PSDB; X5'254.
XX
PT Robo polypeptides, a new immunoglobulin superfamily member
XX
PS Claim 1; Page 72-73; 80pp; English.
XX
CC This invention describes novel Robo (roundabout) polypeptides, involved
CC in nerve guidance which heve been isolated from Drosophila sp.,
CC C. elegans, human and murine samples. The products of the invention can
CC be used to raise anti-Robo antibodies, which can be used to modulate cell
CC function or morphology. The Robo polynucleotides and fragments are useful
CC as probes and primers and for production of the Robo polypeptides. The
CC probes and primers are also useful in screening assays.
XX
SQ Sequence 434 AA;



Query Match 14.2%; Score 108; DB 20; Length 434; 

Best Local Similarity 57.9%; Pred. No. 0.0023; 

Matches 22; Conservative 4; Mismatches 12; Indels 0; Gaps 0;

Qy         29 FHASQCPRPTSPVSTDSNMSAVVIQKARPAKKQKHQPG 66
              | :|| |||||| ||||| || : |  ||   :||: |
Db        397 ftssqrprptspfstdsntsaalsqsqrprptkkhkgg 434



RESULT 7
W72076

ID W72076 standard; Protein; 515 AA.
XX
AC W72076;
XX
DT 18-DEC-1998 (first entry)
XX
DE HSV-2 strain SB5 Contig ID 96 ORFI4 protein.
XX
KW HSV-2 strain SB5; immunological response induction;
KW antiviral identification; viral protein inhibitor.
XX
OS Herpes simplex virus type 2.
XX
PN WO9820016-A1.
XX
PD 14-MAY-1998.
XX
PF 31-OCT-1997; 97WO-US20016.
XX
PR 09-JUN-1997; 97US-0049018.
PR 04-NOV-1996; 96US-0030279.
XX
PA (SMIK ) SMITHKLINE BEECHAM CORP.
XX
PI Chan JY, Dabrowski-Amaral CE, Delvecchio AM, Dillon SB;
PI Esser KM, Leary JJ;
XX
DR WPI; 1998-286847/25.
DR N-PSDB; V62150.
XX
PT Herpes simplex virus type-2 sequences - useful in, e.g. prevention
PT and treatment of infection or inducing immunological response in
PT mammal
XX
PS Claim 10; Page 70; 748pp; English.
XX
CC This sequence represents a Herpes simplex virus type-2 (HSV-2) protein
CC sequence of the invention. This sequence was isolated from a HSV-2 strain
CC SB5 (deposited as ATCC VR-2546) DNA fragment designated Contig ID 96.
CC Based on homology, this sequence is a immediate-early protein IE68.
CC The proteins can be used for the treatment or prevention of disease, to
CC induce an immunological response in a mammal or to identify inhibitors,
CC activators or novel antivirals. Antagonists of the proteins can be used
CC to inhibit a viral polypeptide. The DNA sequence or a vector containing
CC it can also be used to induce an immunological response in a mammal.
XX
SQ Sequence 515 AA;



Query Match 13.3%; Score 101; DB 19; Length 515; 

Best Local Similarity 27.7%; Pred. No. 0.014; 

Matches 31; Conservative 15; Mismatches 38; Indels 28; Gaps 4; 

Qy 63 HQPGHLRR - EAYADDLPP PPVPPPAIKS -- PTVQSKAQLEVRPVMV 105

!: I: I: | Ml II III : I : I :: I :| 

Db 90 hrcgnwrqgvatmadippdppavnttpanhappspppgsrkrrrpvlpsssesegkpdte 149 

Qy 106 PKLASIEARTDRSSDRKGGSYKGREALDGRQVTDLR TNPSD 146

: :| I: I : I Ml : Ml II I III 

Db 150 sessstessedeagdlrggrrrsprelggryfldlsaesttgtesegtgpsd 201 



RESULT 8 
R22248 

ID R22248 standard; Protein; 308 AA.
XX 

AC R22248; 
XX 

DT 22-JUL-1992 (first entry) 
XX 

DE Sequence of rye-grass pollen allergens encoded by EcoRl insert 
DE lambda -12R. 
XX 



KW Rye grass pollinosis; diagnosis; therapy.
XX
OS Lolium perenne.
XX
FH Key Location/Qualifiers
FT 1..25
FT /label= signal
XX
PN WO9203550-A.
XX
PD 05-MAR-1992.
XX
PF 16-AUG-1991; 91WO-AU00369.
XX
PR 17-AUG-1990; 90AU-0001823.
XX
PA (UYME-) UNIV MELBOURNE.
XX
PI Singh MB,
PI T, Knox RB, Avjioglu A;
XX
DR WPI; 1992-096894/12.
DR N-PSDB; Q23000.
XX
PT New nucleic acid sequences coding rye-grass pollen allergens -
PT esp. Lol pla and Lol plb and their fragments, for diagnosing and
PT detecting rye-grass pollinosis
XX
PS Disclosure; ; 81pp; English.
XX
CC The inventors claim a sequence encoding the rye grass pollen
CC allergen Lol pla, or an antigenic fragment, the allergen can
CC alternatively be Lol plb. The antigenic fragment has T-cell
CC stimulating activity and IgE stimulating activity. It does not bind
CC IgE specific for rye grass pollen however, it may be encoded by
CC clone 12R (Q23000) or 26. j (Q22246). R22248 is as printed in the
CC specification; several codons in Q23000 do not translate into the
CC amino acids written in the specification.
XX

Sequence 308 AA; 



Query Match 13.1%; Score 99.5; DB 13; Length 308;

Best Local Similarity 29.6%; Pred. No. 0.011;
Matches 42; Conservative 10; Mismatches 49; Indels 41; Gaps

Qy 3 AVAAAAEYAGLKVARRQMQDAAGRRHFHASQCPRPTSPVSTDSNMSAVVIQKARPAKKQK 62

III I : : :||| I II: I I I 
Db 39 atpaatpaggwregddrraeaaggrqrlasrqpwpplptp 78 

Qy 63 HQPGHLRREAYADDLPPPPVPPPAIKSPTVQSKAQLEVRPVMVPKLASIEARTDRSSDRK 122

III : II I II I III :|| I ::||| I : I 
Db 79 lrrtsstsstppspsppra-ssptsaaka pglipkl dtayd-- 118 

Qy 123 GGSYKGREALDGRQVTDLRTNP 144 

:| || || || | 
Db 119 -vaykaaeahprgqvrrlrhcp 139 



RESULT 9 
Y25601 

ID Y25601 standard; Protein; 308 AA.
XX

AC Y25601;
XX

DT 30-SEP-1999 (first entry)
XX

DE Lolium sp. allergen 2498581 Lol p 5a protein fragment.

XX

KW Major histocompatibility complex; class II; desensitising; human;
KW allergen; grass; tree; weed; pollen; fungi; mould; food; insect; sting;
KW chiromidae; spider; mite; housefly; fruit fly; sheep blow fly; honeybee;
KW screw worm fly; grain weevil; silkworm; bee moth; larvae; mealworm; cat;
KW cockroach; beetle; dog; horse; cow; pig; sheep; rabbit; rat; guinea pig;
KW mice; gerbil; vaccine; treatment; prevention; hypersensitivity.
XX

OS Lolium sp.
XX

PN WO9934826-A1.
XX

PD 15-JUL-1999.
XX

PF 11-JAN-1999; 99WO-GB00080.
XX

PR 21-SEP-1998; 98GB-0020474.
PR 09-JAN-1998; 98GB-0000445.
XX 

PA (IMCO-) IMPERIAL COLLEGE INNOVATIONS LTD. 
XX 

PI Kay AB, Larche M; 
XX 

DR WPI; 1999-458255/38. 
XX 

PT Desensitizing patients to polypeptide allergens 
XX 

PS Example 6; Page 55; 117pp; English. 






XX 

CC This invention describes a novel method of desensitizing a patient to a 

CC polypeptide allergen and comprises administering to the patient a peptide 

CC derived from the allergen where restriction to a MHC Class II molecule

CC possessed by the patient can be demonstrated for the peptide and the 

CC peptide is able to induce a late phase response in an individual who

CC possesses the MHC Class II molecule. The methods can be used for 

CC desensitising patients to allergens present in e.g. grass, tree and weed 

CC (including ragweed) pollens, fungi and moulds, foods, stinging insects, 

CC the chiromidae (non-biting midges), spiders and mites, housefly, fruit 

CC fly, sheep blow fly, screw worm fly, grain weevil, silkworm, honeybee,

CC non-biting midge larvae, bee moth larvae, mealworm, cockroach, larvae of 

CC Tenibrio molitor beetle, mammals such as cat, dog, horse, cow, pig, 

CC sheep, rabbit, rat, guinea pig, mice or gerbil. They can also be used to 

CC produce immunological vaccines which may be used to prevent and/or treat 

CC conditions involving hypersensitivity to allergens. This sequence

CC represents the Lolium sp. allergen 2498581 Lol p 5a. 
XX 

SQ Sequence 308 AA;



Query Match 12.8%; Score 97.5; DB 20; Length 308;
Best Local Similarity 29.6%; Pred. No. 0.018;

Matches 42; Conservative 10; Mismatches 49; Indels 41; Gaps 5;

Qy 3 AVAAAAEYAGLKVARRQMQDAAGRRHFHASQCPRPTSPVSTDSNMSAVVIQKARPAKKQK 62

I II I.: : Mil I II: I I I 
Db 39 atpaatpaggwregddr raeaagg rqr lasrqpwpp lptp 78 

Qy 63 HQPGHLRREAYADDLPPPPVPPPAIKSPTVQSKAQLEVRPVMVPKLASIEARTDRSSDRK 122 
III ::' II I II I 111 :M I ::|ll I : I 

Db 79 lrrtssrssrppspsppra-ssptsaakav— pglipkl dtayd- 118 



Qy 123 GGSYKGREALDGRQVTDLRTNP 144 

:|| II II II I 
Db 119 -vaykaaeahprgqvrrlrhcp 139 



RESULT 10 
W72159 

ID W72159 standard; Protein; 414 AA. 
XX 

AC W72159; 
XX 

DT 08-JAN-1999 (first entry) 
XX 

DE HSV-2 strain SB5 Contig ID 12 ORF#1 protein.
XX 

KW HSV-2 strain SB5; immunological response induction; therapy; 

KW antiviral identification; viral protein inhibitor.

XX 

OS Herpes simplex virus type 2. 
XX 

PN WO9820016-A1.
XX

PD 14-MAY-1998.
XX

PF 31-OCT-1997; 97WO-US20016.
XX

PR 09-JUN-1997; 97US-0049018.
PR 04-NOV-1996; 96US-0030279.
XX

PA (SMIK ) SMITHKLINE BEECHAM CORP.

XX

PI Chan JY, Dabrowski-Amaral CE, Delvecchio AM, Dillon SB;

PI Esser KM, Leary JJ;

XX

DR WPI; 1998-286847/25.

DR N-PSDB; V62175.
XX

PT Herpes simplex virus type-2 sequences - useful in, e.g. prevention

PT and treatment of infection or inducing immunological response in

PT mammal
XX

PS Claim 10; Page 104; 748pp; English. 
XX 

CC This sequence represents a Herpes simplex virus type-2 (HSV-2) protein

CC sequence of the invention. This sequence was isolated from a HSV-2 strain 

CC SB5 (deposited as ATCC VR-2546) DNA fragment designated Contig ID 12.

CC Based on homology, this sequence is a immediate-early protein IE68. 

CC The proteins can be used for the treatment or prevention of disease, to 

CC induce an immunological response in a mammal or to identify inhibitors, 

CC activators or novel antivirals. Antagonists of the proteins can be used

CC to inhibit a viral polypeptide. The DNA sequence or a vector containing 

CC it can also be used to induce an immunological response in a mammal. 

XX 

SQ Sequence 414 AA; 



Query Match 12.8%; Score 97.5; DB 19; Length 414;

Best Local Similarity 27.6%; Pred. No. 0.026;

Matches 27; Conservative 12; Mismatches 32; Indels 27; Gaps 3; 



Qy 76 DLPP PPVPPPAIKS-- PTVQSKAQLEVRPVMVPKLASIEARTDRSS 119

1:1 II III : | : | :: | :| : :| |: | : 

Db 3 dippdppalnttpanhappspppgsrkrrrpvlpsssesegkpdtesessstessedeag 62

Qy 120 DRKGGSYKGREALDGRQVTDLR TNPSD 146

:N : ill II I HI 

Db 63 dlrggrrrsprelggryfldlsaesttgtesegtgpsd 100



RESULT 11 
W72139 

ID W72139 standard; Protein; 414 AA. 
XX 

AC W72139; 
XX 

DT 23-DEC-1998 (first entry) 
XX 

DE HSV-2 strain SB5 Contig ID 3 ORF#1 protein.
XX 

KW HSV-2 strain SB5; immunological response induction; therapy; 

KW antiviral identification; viral protein inhibitor. 

XX 

OS Herpes simplex virus type 2. 
XX 

PN WO9820016-A1. 
XX 

PD 14-MAY-1998. 
XX 

PF 31-OCT-1997; 97WO-US20016. 



XX

PR 09-JUN-1997; 97US-0049018.
PR 04-NOV-1996; 96US-0030279.
XX



PA (SMIK ) SMITHKLINE BEECHAM CORP. 
XX 

PI Chan JY, Dabrowski-Amaral CE, Delvecchio AM, Dillon SB; 

PI Esser KM, Leary JJ; 

XX 

DR WPI; 1998-286847/25. 

DR N-PSDB; V62164.
XX 

PT Herpes simplex virus type-2 sequences - useful in, e.g. prevention 

PT and treatment of infection or inducing immunological response in 

PT mammal 

XX 

PS Claim 10; Page 97-98; 748pp; English. 
XX 

CC This sequence represents a Herpes simplex virus type-2 (HSV-2) protein 

CC sequence of the invention. This sequence was isolated from a HSV-2 strain 

CC SB5 (deposited as ATCC VR-2546) DNA fragment designated Contig ID 3. 

CC Based on homology, this sequence is an immediate-early protein IE68.

CC The proteins can be used for the treatment or prevention of disease, to 

CC induce an immunological response in a mammal or to identify inhibitors, 



CC activators or novel antivirals. Antagonists of the proteins can be used 

CC to inhibit a viral polypeptide. The DNA sequence or a vector containing 

CC it can also be used to induce an immunological response in a mammal. 
XX 

SQ Sequence 414 AA; 



Query Match 12.8%; Score 97.5; DB 19; Length 414;

Best Local Similarity 27.6%; Pred. No. 0.026;

Matches 27; Conservative 12; Mismatches 32; Indels 27; Gaps 3; 

Qy 76 DLPP : PPVPPPAIKS— PTVQSKAQLEVRPVMVPKLASIEARTDRSS 119 

Ml \ 1 1 1 1 1 : : I :: I :l : :| |: I : 
Db 3 dippdppalnttpanhappspppgsrkrrrpvlpsssesegkpdtesessstessedeag 62 

Qy 120 DRKGGSYKGREALDGRQVTDLR TNPSD 146 

I :|| : I II II I III 

Db 63 dlrggrrrsprelggryfldlsaesttgtesegtgpsd 100 



RESULT 12 
R35451 

ID R35451 standard; Protein; 821 AA. 
XX 

AC R35451; 
XX 

DT 25-AUG-1993 (first entry)
XX 

DE Mouse eps8. 
XX 

KW Epidermal growth factor receptor; EGFR-pathway substrate; eps; 

KW tyrosine kinase receptor; TKR; SH2; SH3; mitogenesis. 

XX 

OS Mus musculus.
XX 

PN US7935311-A.
XX 

PD 01-APR-1993. 
XX 

PF 25-AUG-1992; 92US-0935311.
XX

PR 25-AUG-1992; 92US-0935311.
XX 

PA (USSH ) US DEPT HEALTH & HUMAN SERVICE.
XX 

PI Di Fiore PP, Fazioli F; 
XX 

DR WPI; 1993-159477/19. 
DR N-PSDB; Q40730.
XX 

PT Epidermal growth factor receptor substrate, eps 8 - used to 
PT enhance mitogenic response of cells to epidermal growth factor 
XX 

PS Disclosure; Page 30-37; 40pp; English. 
XX 

CC Eps8 is a novel EGFR substrate. The protein bears the 

CC characteristic signatures of TKR substrates including SH2 and

CC SH3 domains. Eps8 is involved in the transduction of mitogenic 

CC signals and it can be used to enhance the mitogenic response of 

CC cells to EGF.

XX 

SQ Sequence 821 AA; 



Query Match 11.6%; Score 88.5; DB 14; Length 821; 

Best Local Similarity 24.1%; Pred. No. 0.48; 

Matches 47; Conservative 21; Mismatches 72; Indels 55; Gaps 

Qy 2 QAVAAAAEYAGLKVARRQMQDAAGRRHFHASQCPRPTSPVSTDSNMSAVVIQKARPAKKQ 61

III I I ': :| : : I II : : II: I II 
Db 137 qavvhacsydsi-lalvckeptqskpdlhlfqcdevkanlisediesaisdsk— ggkq 192 



Qy 62 KHQPGHLRREAYAD DLPPP PVPPPAIKSPTVQSKA-- 96

Mill I! :||| INI : |:|:

Db 193 krrpealrmiakadpgipppprapapvppgtvtqvdvrsrvaawsawaadqgdfekprqy 252



Qy 97 -QLEVRPVM VPKL-ASIEARTDRSSDRKGGSYKGREAL 132 

: I I I : II : II :: I :| I I : 

Db 253 heqeetpemmaaridrdvqilnhilddieffitklqkaaeafselskrkk--skkskrkg 310 

Qy 133 DGRQVTDLRTNPSDP 147 

I I II I I 
Db 311 pgegvltlrakpppp 325 



RESULT 13 
W69741 

ID W69741 standard; protein; 980 AA.
XX
AC W69741;
XX
DT 26-OCT-1998 (first entry)
XX
DE SAPAP2 protein.
XX
KW Human; SAPAP1; SAPAP2; animal protein; PSD-95/SAP90; diagnosis;
KW nervous disease; functional interference; structural interference;
KW membrane associated guanylate kinase; neuronal disease.
XX
OS Homo sapiens.
XX
PN JP10201477-A.
XX
PD 04-AUG-1998.
XX
PF 24-JAN-1997; 97JP-0011714.
XX
PR 24-JAN-1997; 97JP-0011714.
XX
PA (KAGA-) KAGAKU GIJUTSU SHINKO JIGYODAN.
PA (TAKE/) TAKEUCHI M.
XX
DR WPI; 1998-474491/41.
XX
PT New protein SAPAP1 - used for, e.g. diagnosis and prevention of
PT various neuronal diseases
XX
PS Disclosure; Page 7-9; 12pp; Japanese.
XX
CC The present sequence represents the SAPAP2 protein which is mentioned in
CC the present invention. The present invention specifically claims the
CC SAPAP1 protein which has a 992 amino acid (aa) sequence. Also described
CC in the present invention are: (1) an animal protein having an aa sequence
CC substantially homologous to SAPAP1; (2) cDNA sequence encoding SAPAP1,
CC or an aa sequence substantially homologous to SAPAP1, and (3) a genomic
CC DNA sequence hybridised to the cDNA or its partial sequence. SAPAP1 is a
CC novel animal protein specific for PSD-95/SAP90 and its related protein,
CC and may be useful for the diagnosis, prevention and treatment of various
CC neuronal diseases caused by functional or structural interference of
CC nervous system.
XX
SQ Sequence 980 AA;



Query Match 11.6%; Score 88; DB 19; Length 980;

Best Local Similarity 26.6%; Pred. No. 0.66; 

Matches 29; Conservative 13; Mismatches 39; Indels 28; Gaps 4; 

Qy 40 PVSTDSNMSAWIQKARPA-— KKQKHQPGHLRREAYADDLPPPPVPPPAIKSPTVQSK 95 

II I llh: : I : II llllll I : 

Db 512 pvmtpsnmtstirstaavsytnykk tpppvpprttskplisvt 554 

Qy 96 AQLEVRPVMVPKLASIEARTDRSS DRKGGSYKGREALDGRQVTDL 140 

II : I ::l I I I :ll I ::ll : :l 

Db 555 aqsstest---qdayqdsraqrmspwpqdsrgglynsmdsldsnkamnl 600 



RESULT 14
W69743

ID W69743 standard; protein; 980 AA.
XX
AC W69743;
XX
DT 26-OCT-1998 (first entry)
XX
DE SAPAP1 protein.
XX
KW Human; SAPAP2; SAPAP1; animal protein; PSD-95/SAP90; diagnosis;
KW nervous disease; functional interference; structural interference;
KW membrane associated guanylate kinase; neuronal disease.
XX
PN JP10201478-A.
XX
PD 04-AUG-1998.
XX
PF 24-JAN-1997; 97JP-0011715.
XX
PR 24-JAN-1997; 97JP-0011715.
XX
PA (KAGA-) KAGAKU GIJUTSU SHINKO JIGYODAN.
PA (TAKE/) TAKEUCHI M.
XX
DR WPI; 1998-474492/41.
XX
PT DNA encoding new animal protein SAPAP 2 - useful for diagnosis and
PT treatment of nervous system diseases
XX
PS Disclosure; Page 4-6; 11pp; Japanese.
XX
CC The present sequence represents the SAPAP1 protein, which is mentioned
CC in the present invention which specifically claims SAPAP2. Also described
CC in the present invention is: (A) an animal protein having an amino acid
CC sequence substantially the same as SAPAP2; (B) a cDNA sequence encoding
CC the amino acid sequence of SAPAP2 or (C) an amino acid sequence
CC substantially same as SAPAP2; and (D) a genomic DNA sequence hybridised
CC by the above cDNA or its partial sequence. SAPAP2 is a new animal
CC protein which combines specifically with PSD-95/SAP90 and its related
CC protein and is useful for the diagnosis, prevention and treatment of
CC various nervous diseases caused by functional or structural interference
CC of nervous system.
XX
SQ Sequence 980 AA;



Query Match 11.6%; Score 88; DB 19; Length 980;

Best Local Similarity 26.6%; Pred. No. 0.66;

Matches 29; Conservative 13; Mismatches 39; Indels 28; Gaps

Qy 40 PVSTDSNMSAWIQKARPA — KKQKHQPGHLRREAYADDLPPPPVPPPAIKSPTVQSK 95 

II I III::': I : II llllll I : 

Db 512 pvmtpsnmtstirstaavsytnykk tpppvpprttskplisvt 554 

Qy 96 AQLEVRPVMVPKLASIEARTDRSS — DRKGGSYKGREALDGRQVTDL 140 

II ■: I ::| I I I =11 I "II : :| 

Db 555 aqsstest---qdayqdsraqrmspwpqdsrgglynsmdsldsnkamnl 600



RESULT 15 
Y78901 

ID Y78901 standard; protein; 2424 AA.
XX 

AC Y78901; 
XX 

DT 19-MAY-2000 (first entry) 
XX 

DE Calcium channel alpha 1A amino acid sequence.
XX 
KW Alpha 1A subunit; central nervous system calcium channel; dementia;

KW Alzheimer's disease; calcium channel function evaluation; 

KW cerebral function. 
XX 

OS Xenopus sp. 
XX 

PN JP2000026315-A. 
XX 

PD 25-JAN-2000. 

PF 09-JUL-1998; 98JP-0194236. 
XX 

PR 09-JUL-1998; 98JP-0194236.
XX 

PA (DADT ) DAIICHI PHARM CO LTD. 
XX 

DR WPI; 2000-176900/16.
XX

PT Evaluating calcium channel activator - involves measuring inhibition or

PT change in functions of coupling between alpha subunit of G-protein and

PT alpha-1 subunit of C-terminal area, used for treating dementia and

PT improving cerebral functions

PS Example 1; Page 9-10; 11pp; Japanese.
XX 

CC This sequence represents the alpha 1A subunit of the Xenopus calcium 

CC channel of the central nervous system. The invention relates to the 

CC evaluation of a calcium channel activator. The evaluation process

CC consists of measuring the inhibition or change in function of coupling 

CC between a G-protein alpha subunit and the alpha 1 subunit of the calcium 

CC channel of the central nervous system. Test compounds which are found to 

CC cause the inhibition of coupling or a change in function can be 

CC identified from tests carried out on genetically engineered cells. The

CC evaluation method may be used to discover compounds for treating dementia 

CC and improving cerebral function in diseases such as Alzheimer's disease.
XX

SQ Sequence 2424 AA; 



Query Match 11.2%; Score 85; DB 21; Length 2424; 

Best Local Similarity 25.6%; Pred. No. 3.9; 

Matches 34; Conservative 15; Mismatches 46; Indels 38; Gaps 6; 

Qy 18 RQMQDAAGRRHFHASQCPRPTSPVSTDS-- NMSAVVIQKARPAKKQKHQPG 66

I: = II : I I I:: N I : : llll |: II 
Db 2294 rrrrggggra---lrrapgpreplaqdspgrgpsvclaraarpagpqrllpgprtgqapr 2350 

Qy 67 HLRREAYADDLPPPPVPPPAIKSPTVQSKAQLEVRPVMVPKLASIEARTDR 117

"II I Ml III :| I I: ::|: 

Db 2351 arlpqkparsvqrerrglvlsppp-pppgelap rahpartprpgpgdsrsrr 2401
Qy 118 SSDR KGG 124

I III 
Db 2402 ggrrwtasagkgg 2414 



Search completed: January 22, 2001, 12:19:50 
Job time: 1747 sec 
GenCore version 4.5
Copyright (c) 1993 - 2000 Compugen Ltd.

OM protein - protein search, using sw model

Run on: January 22, 2001, 12:27:28 ; Search time 325.28 Seconds
(without alignments)
30.894 Million cell updates/sec

Title: US-09-540-245A-20
Perfect score: 761
Sequence: 1 AQAVAAAAEYAGLKVARRQM REALDGRQVTDLRTNPSDPR 148

Scoring table: BLOSUM62
Gapop 10.0 , Gapext 0.5

Searched: 195891 seqs, 67900655 residues

Total number of hits satisfying chosen parameters:

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
Maximum Match 100%
Listing first 45 summaries

Database : PIR_66:*
1: pir1:*
2: pir2:*
3: pir3:*
4: pir4:*

Pred. No. is the number of results predicted by chance to have a
score greater than or equal to the score of the result being printed,
and is derived by analysis of the total score distribution.

               %
             Query
 No.  Score  Match  Length  DB  ID      Description
   1    761  100.0    1612      T30805  dutt1 protein - mo
   2    719   94.5    1651      T14160  transmembrane rece
   3    101   13.3    1344      T14316  rig-1 protein - mo
   4     99   13.0     739      T21431  hypothetical prote
   5   97.5   12.8     308      A38582  pollen allergen pl
   6     91   12.0     825      T27852  hypothetical prote
   7     89   11.7    1882      GNWTR   genome polyprotein
   8   88.5   11.6     821      S39983  eps8 protein - mou
   9     87   11.4     826      A60385  monocyte surface a
  10   86.5   11.4     891      T22560  hypothetical prote
  11   85.5   11.2     662      T23757  hypothetical prote
  12   85.5   11.2    1870      S37671  MHC class III hist
  13   85.5   11.2    1872      S36152  MHC class III hist
  14     85   11.2     300      T49225  hypothetical prote
  15     85   11.2     698      T32594  hypothetical prote
  16     85   11.2     716      T26998  hypothetical prote
  17     85   11.2    1541      T02831  AAA protein L4171.
  18     85   11.2    2424      I46480  calcium channel BI
  19     85   11.2    2424      I46479  calcium channel BI
  20   83.5   11.0     282      S53502  histone H1 - commo
  21   83.5   11.0     365      T24955  hypothetical prote
  22     83   10.9    1336      S25716  Ras guanine nucleo
  23     83   10.9    1560      T42727  proliferation pote
  24   82.5   10.8    1013      T46422  hypothetical prote
  25   82.5   10.8    1102      JC6316  probable protein k
  26   82.5   10.8    1547      T28657  blackjack protein,
  27     82   10.8     363      T16755  hypothetical prote
  28     82   10.8     601      S56144  SH3 domain binding
  29     82   10.8     797      I46044  furin (EC 3.4.21.7
  30     82   10.8     867              L96 protein - Tipu
  31   81.5   10.7     537              myosin binding pro
  32   81.5   10.7     555      I53869  zinc finger protei
  33                                    structural protein
  34   81.5   10.7     806              probable sensor ki
  35          10.7                      epidermal growth f
  36   81.5   10.7     963              hypothetical prote
  37   81.5   10.7     975      T08606  protein phosphatas
  38   81.5   10.7    1603      S17983  gene posterior sex
  39   80.5   10.6     197      F82029  probable periplasm
  40   80.5   10.6     417      A72078  ct005 hypothetical
  41   80.5   10.6     417      B81590  conserved hypothet
  42   80.5   10.6     548      S59133  ETS2 repressor fac
  43   80.5   10.6    2142      B35098  MHC class III hist
  44     80   10.5     202      T11744  dehydrin - kidney
  45     80   10.5     383      A48222  dematin 48K chain

RESULT 1
T30805

dutt1 protein - mouse

N;Alternate names: transmembrane receptor protein Robo1 homolog
C;Species: Mus musculus (house mouse)

C;Date: 22-Oct-1999 #sequence_revision 22-Oct-1999 #text_change 22-Oct-1999
C;Accession: T30805

R;Wu, M.C.; Lowe, N.; Fordham, R.; Rabbitts, P.
submitted to the EMBL Data Library, July 1998

A;Description: The mouse homologue of human dutt1/h-robo1 gene; protein sequence and
A;Reference number: Z20879
A;Accession: T30805

A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: mRNA
A;Residues: 1-1612 <WUM>

A;Cross-references: EMBL:Y17793; NID:e1329712; PID:e1329713; PIDN:CAA76850.1
A;Experimental source: brain
C;Genetics:
A;Gene: dutt1
A;Map position: 16

Query Match 100.0%; Score 761; DB 2; Length 1612;

Best Local Similarity 100.0%; Pred. No. 4e-57;

Matches 148; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy    1 AQAVAAAAEYAGLKVARRQMQDAAGRRHFHASQCPRPTSPVSTDSNMSAVVIQKARPAKK 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1365 AQAVAAAAEYAGLKVARRQMQDAAGRRHFHASQCPRPTSPVSTDSNMSAVVIQKARPAKK 1424

Qy   61 QKHQPGHLRREAYADDLPPPPVPPPAIKSPTVQSKAQLEVRPVMVPKLASIEARTDRSSD 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1425 QKHQPGHLRREAYADDLPPPPVPPPAIKSPTVQSKAQLEVRPVMVPKLASIEARTDRSSD 1484

Qy  121 RKGGSYKGREALDGRQVTDLRTNPSDPR 148
        ||||||||||||||||||||||||||||
Db 1485 RKGGSYKGREALDGRQVTDLRTNPSDPR 1512



RESULT 2
T14160

transmembrane receptor protein Robo1 - rat

C;Species: Rattus norvegicus (Norway rat)

C;Date: 20-Sep-1999 #sequence_revision 20-Sep-1999 #text_change 20-Sep-1999
C;Accession: T14160

R;Kidd, T.; Brose, K.; Mitchell, K.J.; Fetter, R.D.; Tessier-Lavigne, M.; GxC.
Cell 92, 205-215, 1998

A;Title: Roundabout controls axon crossing of the CNS midline and defines a r.c
A;Reference number: Z17897; MUID:98117249
A;Accession: T14160

A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: mRNA
A;Residues: 1-1651 <KID>

A; Cross -references: EMBL:AF041082; NID:g2811215; PID:g2811216; PIDN: AAC39960 . 1 
C; Function: 

A; Description: appears to function as the gatekeeper controlling midline crossing 
C; Keywords: transmembrane protein 



Query Match 94.5%; Score 719; DB 2; Length 1651; 

Best Local Similarity 93.2%; Pred. No. 1.6e-53; 

Matches 138; Conservative 3; Mismatches 7; Indels 0; Gaps 

Qy 1 AQAVAAAAEYAGLKVARRQMQDAAGRRHFHASQCPRPTSPVSTDSNMSAVVIQKARPAKK 60

miMiiiMiiiiiiiMMMiiiimimiiiiiiimiiii mill! n 

Db 1404 AQAVAAAAEYAGLKVARRQMQDAAGRRHFHASQCPRPTSPVSTDSNMSAAVIQKARPTKK 1463 

Qy 61 QKHQPGHLRREAYADDLPPPPVPPPAIKSPTVQSKAQLEVRPVMVPKLASIEARTDRSSD 120 

IIHIIIIIIIII IIIIIIIIIIIIIIIIMIIIIIII 11:1 lllllllll Hill 
Db 1464 QKHQPGHLRREAYTDDLPPPPVPPPAIKSPSVQSKAQLEARPIMGPKLASIEARADRSSD 1523 



Qy 121 RKGGSYKGREALDGRQVTDLRTNPSDPR 148 
! M I M I II ! 1 1 1 1 M I II 1 1 1 : 1 HI 
Db 1524 RKGGSYKGREALDGRQVTDLRTSPGDPR 1551



RESULT 3
T14316

rig-1 protein - mouse

C;Species: Mus musculus (house mouse)

C;Date: 20-Sep-1999 #sequence_revision 20-Sep-1999 #text_change 20-Sep-1999
C;Accession: T14316 

R;Yuan, S.S.F.; Cox, L.A.; Dasika, G.K.; Lee, E.Y.H.P. 
submitted to the EMBL Data Library, April 1998 
A; Reference number: Z17975 
A; Accession: T14316 

A; Status: preliminary; translated from GB/EMBL/DDBJ 
A; Molecule type: mRNA 
A;Residues: 1-1344 <YUA> 

A;Cross-references: EMBL:AF060570; NID:g4206385; PID:g4206386; PIDN:AAD11628.1



Query Match 13.3%; Score 101; DB 2; Length 1344; 

Best Local Similarity 27.7%; Pred. No. 0.64; 

Matches 38; Conservative 18; Mismatches 59; Indels 22; Gaps

Qy 29 FHASQCPRPTSPVSTDSNMSAWIQ KARPAKKQKHQPGHLRREAYADDLPPP 80

:!ll I h: I I : :|l II I I III INI!

Db 1191 YHASPSPVPSTASSAPGRTRQVTGEMTPPLHGHRARIRKKPKALP--YRREHSPGDLPPP 1248

Qy 81 PVPPPAIKSPTVQSKAQLEVRPVMVPKL ASIEARTDRSSDRKGGSYKGREA 131

t hill I I : |: I :| II I I hh

Db 1249 PLPPPELRDKLALGSA-GSRQHVFPRARAQWGEESGAGSASRGPTSSQR-GPHPDGKES 1305



Qy 132 LDGRQVTDLRTNPSDPR 148

: : :|: |:
Db 1306 QGRGRGLEACRSPNSPQ 1322



RESULT 4 
T21431 

hypothetical protein F26H11.4 - Caenorhabditis elegans 
C; Species: Caenorhabditis elegans 

C;Date: 15-Oct-1999 #sequence_revision 15-Oct-1999 #text_change 15-Oct-1999
C;Accession: T21431 
R;Barlow, K. 

submitted to the EMBL Data Library, November 1996 
A; Reference number: Z19421 
A;Accession: T21431

A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: DNA
A;Residues: 1-739 <WIL>

A;Cross-references: EMBL:Z81515; PIDN:CAB04196.1; GSPDB:GN00020; CESP:F26H11.4
A;Experimental source: clone F26H11
C;Genetics:



A;Gene: CESP:F26H11.4
A;Map position: 2

A;Introns: 12/2; 103/3; 179/3; 246/2; 397/3; 410/1; 457/2; 480/3; 520/2; 647/3; 711/3 



Query Match 13.0%; Score 99; DB 2; Length 739; 

Best Local Similarity 27.9%; Pred. No. 0.5; 

Matches 48; Conservative 16; Mismatches 68; Indels 40; Gaps 7; 

Qy 1 AQAVAAAAEYAGLKVARRQMQDAAGRRHFHASQ-- CPRPTSPVSTDSNMSAVVIQKARP 57

I I I 'I I ::| I : I II I II: I I : I ::| 
Db 84 ATTVDTVQEGIGATVDVTLVEDVAEKAHEPASTPNVADRLQKPVAKDDNSAPV--EEANR 141 



Qy 58 AKKQKHQP-- GHLRREA YADDLPPPPVPPPAIKSPTVQS 94 

II: I I II I I II II II I : 
Db 142 KKKEGKDPKNSEEAKKNSKKGKNGTARRSALKGVTIPPVAASLPCPPPPP PLSER 196 

Qy 95 KAQLEVRPVMVPKLASIEARTDRSSDRKGGSYKGREALDGRQVTDLRTNPSD 146 

I : :| :M I : : 1 1 1 1 I : I : I I I I II 
Db 197 KKNVGPKPCVGP AQKNLSSQRKHSSKRNRKGL-LRNVAKLWGRKSD 241 
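
The percentages printed with each hit appear to follow from the other figures on the same lines. The sketch below is an inference from the numbers in this listing, not a documented GenCore formula: Query Match looks like the hit score divided by the perfect score (761 for this query), and Best Local Similarity looks like the match count divided by the total number of aligned columns (matches, conservative substitutions, mismatches and indels). The function names are hypothetical.

```python
def query_match_percent(score: float, perfect_score: float) -> float:
    """Hit score as a percentage of the query's perfect (self) score."""
    return round(100.0 * score / perfect_score, 1)


def best_local_similarity(matches: int, conservative: int,
                          mismatches: int, indels: int) -> float:
    """Identical positions as a percentage of all aligned columns.

    Assumption: every indel position counts as one alignment column.
    """
    columns = matches + conservative + mismatches + indels
    return round(100.0 * matches / columns, 1)


# RESULT 4 above: Score 99; Matches 48; Conservative 16; Mismatches 68; Indels 40
print(query_match_percent(99, 761))           # 13.0
print(best_local_similarity(48, 16, 68, 40))  # 27.9
```

Both values agree with the figures printed for RESULT 4, so the same arithmetic can serve as a consistency check on lines where the scan has damaged a digit.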



RESULT 5 
A38582 

pollen allergen plb precursor - _ 
N; Alternate names: 30R allergen 
C; Species: Lolium perenne (perennial ryegrass) 

C;Date: 14-Feb-1992 #sequence_revision 14-Feb-1992 #text_change 07-May-1999
C;Accession: A38582; S38290

R;Singh, M.B.; Hough, T.; Theerakulpisut, P.; Avjioglu, A.; Davies, S.; Smith, P.M.;
Proc. Natl. Acad. Sci. U.S.A. 88, 1384-1388, 1991

A;Title: Isolation of cDNA encoding a newly identified major allergenic protein of ry
A;Reference number: A38582; MUID:91142177
A;Accession: A38582

A; Status: preliminary; not compared with conceptual translation 

A; Molecule type: mRNA 

A;Residues: 1-308 <SIN>

A;Cross-references: GB:M59163 

R;Petersen, A.; Schramm, G.; Becker, W.M.; Schlaak, M. 

Biol. Chem. Hoppe-Seyler 374, 855-861, 1993 

A; Title: Comparison of four grass pollen species concerning their allergens of grass 

A; Reference number: S38288; MUID:94092339 

A;Accession: S38290

A; Molecule type: protein 

A; Residues: 26-45 <PET> 

C;Superfamily: grass pollen allergen IX

C;Keywords: pollen

F;1-25/Domain: signal sequence #status predicted <SIG>
F;26-308/Product: pollen allergen plb #status experimental <MAT>



Query Match 12.8%; Score 97.5; DB 2; Length 308;

Best Local Similarity 29.6%; Pred. No. 0.26;

Matches 42; Conservative 10; Mismatches 49; Indels 41; Gaps 

Qy 3 AVAAAAEYAGLKVARRQMQDAAGRRHFHASQCPRPTSPVSTDSNMSAWIQKARPAKKQK 62 

I II : :||l I II: I I I 
Db 39 ATPAATPAGGWREGDDRRAEAAGGRQRLASRQPWPPLPTP 78 

Qy 63 HQPGHLRREAYADDLPPPPVPPPAIKSPTVQSKAQLEVRPVMVPKLASIEARTDRSSDRK 122 

III : II I II I III :ll I ::lll I : I 
Db 79 LRRTSSRSSRPPSPSPPRA-SSPTSAAKA PGLIPKL DTAYD- 118 

Qy 123 GGSYKGREALDGRQVTDLRTNP 144 

:!l II .Mill 
Db 119 -VAYKAAEAHPRGQVRRLRHCP 139 



RESULT 6
T27852

hypothetical protein ZK418.6 - Caenorhabditis elegans
C;Species: Caenorhabditis elegans

C;Date: 15-Oct-1999 #sequence_revision 15-Oct-1999 #text_change 15-Oct-1999
C;Accession: T27852
R; Fulton, L. 

submitted to the EMBL Data Library, April 1994 

A; Description: The sequence of C. elegans cosmid ZK418. 

A; Reference number: Z20430 

A; Accession: T27852 

A; Status: preliminary; translated from GB/EMBL/DDBJ 
A; Molecule type: DNA 
A;Residues: 1-825 <FUL>

A;Cross-references: EMBL:U00047; PIDN:AAA50690.1; CESP:ZK418.6

A;Experimental source: strain Bristol N2

C;Genetics:

A;Gene: CESP:ZK418.6

A;Introns: 24/3; 53/3; 98/2; 153/1; 196/3; 269/1; 338/2; 394/2; 444/1; 562/2; 644/3



Query Match 12.0%; Score 91; DB 2; Length 825; 

Best Local Similarity 24.8%; Pred. No. 2.7;
Matches 40; Conservative 24; Mismatches 57; Indels 40; Gaps

Qy 2 QAVAAAAEYAGLKVARRQMQDAAGRRHFHAS QCPRPTSP — VSTDSNMSA 49 

I II II h :|::hl :| I I I :| :|: || 

Db 534 QGGAAKDEY- -LEEKKRRLQEAKKKRRHHKSTFTLLFYGDIDRKSSEKLKYISSQSN- ■ ■ 588 

Qy 50 WIQKARPAKKQRHQPGHLRREAYADDLPPPPVPPPAIKSP—TVOSKAOLEVRPVMVP 106 

ll::|: II:: 1 1 1 1 1 : 1 1 I : I : I I 
Db 589 KKRRHR-GATRTRSISSPPLPPPAPPNSFMSPDNOILWCKLYEQTCP— P 635 

Qy 107 KLASIEARTDRSSDRKGGSYKGREALDGRQVTDLRTNPSDP 147 

:|: : I : |: : : || | | 
Db 636 PAKAIDLKMLR NLPGQIGMSELEPEDLEVTPDPP 669 



RESULT 7
GNWTR 

genome polyprotein 2 - tomato ringspot virus (strain raspberry) 

N; Contains: coat protein 

C; Species: tomato ringspot virus 

C;Date: 30-Jun-1992 #sequence_revision 30-Jun-1992 #text_change 16-Jun-2000

C;Accession: JQ1093 

R;Rott, M.E.; Tremaine, J.H.; Rochon, D.M. 

J. Gen. Virol. 72, 1505-1514, 1991 

A;Title: Nucleotide sequence of tomato ringspot virus RNA-2.

A;Reference number: JQ1093; MUID:91311402 

A; Accession: JQ1093 
A;Molecule type: genomic RNA
A;Residues: 1-1882 <ROT>

A;Cross-references: GB:D12477; GB:D01129; NID:g222674; PIDN:BAA02043.1; PID:g222675
A; Note: it is uncertain whether Met-1 or Met-122 is the initiator 
C; Genetics: 

A; Map position: segment 2 

C;Superfamily: tomato ringspot virus genome polyprotein
C; Keywords: coat protein; glycoprotein; polyprotein 
F;132M882/Product: coat protein Istatus predicted <MAT> 

F;269,295, 1183, 1316, 1543, 1561, 1735/Binding site: carbohydrate (Asn) (covalent) Istatus 



Query Match 11.7%; Score 89; DB 1; Length 1882;

Best Local Similarity 29.1%; Pred. No. 9.7; 



Matches 


Qy 


2 


Db 


*193 


Qy 


59 


Db 


253 


Qy 


118 


Db 


297 



i 1 1 mi mi i 



II: II :| 
-LAAFEAAMNR 296 



RESULT 8 
S39983 

eps8 protein - mouse

C;Species: Mus musculus (house mouse)

C;Date: 13-Jan-1995 #sequence_revision 13-Jan-1995 #text_change 05-Nov-1999
C;Accession: S39983

R;Fazioli, F.; Minichiello, L.; Matoska, V.; Castagnino, P.; Miki, T.; Wong, W.T.; di
EMBO J. 12, 3799-3808, 1993

A;Title: Eps8, a substrate for the epidermal growth factor receptor kinase, enhances

A;Reference number: S39983; MUID:94008987

A;Accession: S39983

A; Status: preliminary 

A;Molecule type: mRNA 

A;Residues: 1-821 <FAZ> 

A; Cross -references: EMBL:L21671; NID:g309216; PIDN:AAA16358.1; PID:g309217 
C; Super family: SH3 homology 
F;537-584/Domain: SH3 homology <SH3> 



Query Match 11.6%; Score 88.5; DB 2; Length 821; 

Best Local Similarity 24.1%; Pred. No. 4.4;



Matches 


Qy 


2 


Db 


137 


Qy 


62 


Db 


193 


Qy 


97 


Db 


253 


Qy 


133 


Db 


311 



III I I ■: :| 



I II 



II: I 



Mil! II :IM 



-PVPPPAIRSPTVQSKA" 
III : hi: 



-QLEVRPVM-: 

: I I I 



II II 'I I 



- -VPKL-ASIEARTDRSSDRKGGSYKGREAL 132 
: II : II :: I :l I I : 



RESULT 9 
A60385 

monocyte surface antigen MS2 precursor - mouse 
C; Species: Mus musculus (house mouse) 

C;Date: 03-Feb-1993 #sequence_revision 03-Feb-1993 #text_change 26-Aug-1999
C;Accession: A60385

R;Yoshida, S.; Setoguchi, M.; Higuchi, Y.; Akizuki, S.; Yamamoto, S.
Int. Immunol. 2, 585-591, 1990

A;Title: Molecular cloning of cDNA encoding MS2 antigen, a novel cell surface antiger

A;Reference number: A60385; MUID:91197896

A;Accession: A60385

A;Molecule type: mRNA 

A;Residues: 1-826 <YOS> 

A; Cross -references: EMBL:X13335 

C; Super family: mouse-meltrin alpha; disintegrin homology 

C;Keywords: glycoprotein; surface antigen; transmembrane protein

F;1-14/Domain: signal sequence #status predicted <SIG>

F;402-484/Domain: disintegrin homology <DIS>

F;659-683/Domain: transmembrane #status predicted <TMM>

F;330/Active site: Glu #status predicted



Query Match 11.4%; Score 87; DB 2; Length 826;

Best Local Similarity 27.7%; Pred. No. 5.9;

Matches 43; Conservative 19; Mismatches 53; Indels 40; Gaps 

Qy 4 VAAAAEYAGLXVARRQMQDAAGRRHFHASQCPRPTSPVST DSNMSAWIQKARP 57 

III II:': I: I |: I hi I :| II:: I 
Db 671 VAAMVIVAGIVIIRK APRQIQRRSVAPKPISGLSNPLFYTRDSSL P 716 



58 AKKQKHQPGHLRREAYADDLPPPPV" 

II : I ; I : : II I: 



-PPPAIKSPTVQSKAQLEVRPVMVPKLASIE 112 

WW I I : ! I II II: : : 
Db 717 AKNRPPDPS---ETVSTNQPPRPIARPKRPPPA--PPGAVSSSPLPV-PVYAPKIPN-Q 768

Qy 113 ARTDRSSDRKGGSYKGREALDGRQVTDLRTNPSDP 147 

II: I I :|| I: I 
Db 769 FRPDPPT KPLPELKPKQVKPTFAPPTPP 796 



RESULT 10 
T22560 

hypothetical protein F53C11.5 - Caenorhabditis elegans
C;Species: Caenorhabditis elegans

C;Date: 15-Oct-1999 #sequence_revision 15-Oct-1999 #text_change 15-Oct-1999
C;Accession: T22560
R;Baynes, C.

submitted to the EMBL Data Library, September 1996
A;Reference number: Z19581
A;Accession: T22560
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: DNA
A;Residues: 1-891 <WIL>

A;Cross-references: EMBL:Z79756; PIDN:CAB02122.1; GSPDB:GN00023; CESP:F53C11.5
A;Experimental source: clone F53C11
C;Genetics:
A;Gene: CESP:F53C11.5

A; Map position: 5 

A;Introns: 25/3; 59/1; 137/1; 287/1; 313/1; 343/3; 420/1; 455/2; 637/2; 708/2; 831/2; 8f 



Query Match 11.4%; Score 86.5; DB 2; Length 891;

Best Local Similarity 29.0%; Pred. No. 7.1;

Matches 29; Conservative 8; Mismatches 40; Indels 23; Gaps 3;

Qy 32 SQCPRPTSPVST DSNMSAWIQKARPAKKQKHQPGHLRREAYADDLPPPPVPP 84

II III: I: I II: : II: II |: : I I II

Db 463 SQSPRPSQPLQTIAPPPSQSSNIPPPSLPSPRPSALP- -APGFSRQISSATTLTAQAPPP 520

Qy 85 PAIRSPTVQSKAQLEVRPVMVPKLASIEARTDRSSDRKGG 124

I: II I I II II II

Db 521 QALTSPRTPS TSASPRTSSRFDRSGG 546



RESULT 11
T23757

hypothetical protein M117.4 - Caenorhabditis elegans
C;Species: Caenorhabditis elegans

C;Date: 15-Oct-1999 #sequence_revision 15-Oct-1999 #text_change 15-Oct-1999
C;Accession: T23757
R;Kershaw, J.
submitted to the EMBL Data Library, June 1996

A;Reference number: Z19794
A;Accession: T23757
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: DNA
A;Residues: 1-662 <WIL>

A;Cross-references: EMBL:Z73910; PIDN:CAA98136.1; GSPDB:GN00022; CESP:M117.4
A;Experimental source: clone M117
C;Genetics:
A;Gene: CESP:M117.4
A;Map position: 4

A;Introns: 19/2; 136/3; 281/3; 399/2; 556/3; 602/2



Query Match 11.2%; Score 85.5; DB 2; Length 662;

Best Local Similarity 25.4%; Pred. No. 6.3;

Matches 32; Conservative 17; Mismatches 58; Indels 19; Gaps

Qy 30 HASQCPRPTSPVSTDSNMSAVVIQKARPAKKQKHQPGHLRREAYADDLPPPPVPPPAIKS 89

II I III * I :: :| I :| : I Mil II : : 

Db 126 HAHVSPDPTSLNDKRSKFERPNLEPGRPERPERLD RPLAKPLPPPSAAPPPV-A 178 

Qy 90 PTVQSKAQLEVRPVMVP KLASIEARTDRSSDRKGGSYKGREALDGRQVTDLRT 142 

I III I :: : :: Ihll 'III : : 
Db 179 PLSAPVPSTGAPPSAVPPADRKMSKQSTRDKKSGRSTDRNIHRKRDVAALD ENKK 233 



Qy 143 NPSDPR 148
:| I :
Db 234 SPEDKR 239



RESULT 12

S37671

MHC class III histocompatibility antigen HLA-B-associated protein 2 [similarity] - hu
C;Species: Homo sapiens (man)

C;Date: 20-Feb-1995 #sequence_revision 20-Feb-1995 #text_change 15-Sep-2000
C;Accession: S37671
R;Bougueleret, L.

submitted to the EMBL Data Library, August 1992

A;Reference number: S37671

A;Accession: S37671

A;Status: preliminary

A;Molecule type: DNA

A;Residues: 1-1870 <BOU>

A;Cross-references: EMBL:Z15025; NID:g29374; PID:g29375
C;Genetics:

A;Map position: 6p21.3

A;Introns: 38/2; 97/2; 129/3; 154/1; 202/1; 252/3; 279/2; 327/1; 357/2; 429/3; 588/1;
C;Superfamily: collagen alpha 1(IV) chain



Query Match 11.2%; Score 85.5; DB 2; Length 1870; 

Best Local Similarity 25.7%; Pred. No. 19;



RQMQDAAGRRHFHASQCPRPTSPVSTDS-NMSAWIQKARPAKKQRHQPGHLRREAYADD 76 
:M ::| I I II : II II I : II II 
RQQQQHQWQQHQQGSAPPTPVPPSPPQPVTLGAVPAPKAPPPPPKALYPGALGR 709 



Matches 


Qy 


18 


Db 


656 


Qy 


77 


Db 


710 


Qy 


120 


Db 


759 



■-PPAIRSPTVQSKAQLEVRPV" 
II : :l : I: I 



-MVPKLASIEARTDRSS 113 

:||: I 
3GLVPR ERS 758 



RESULT 13 
S36152 

MHC class III histocompatibility antigen HLA-B-associated protein 2 [similarity] - hu
C;Species: Homo sapiens (man)

C;Date: 06-Jun-1995 #sequence_revision 17-Nov-1995 #text_change 15-Sep-2000
C;Accession: S36152

R;Iris, F.J.M.; Bougueleret, L.; Prieur, S.; Caterina, D.; Primas, G.; Perrot, V.; Ju
Nature Genet. 3, 137-145, 1993

A;Title: Dense Alu clustering and a potential new member of the NFkappaB family withi

A;Reference number: S36152; MUID:93272029

A;Accession: S36152

A;Status: preliminary

A;Molecule type: DNA

A;Residues: 1-1872 <IRI>

A;Cross-references: EMBL:Z15025

A;Note: in the authors' translation residues 32-34 are shown after residue 4 and, con
A;Note: the authors translated the codon AAT for residue 1000 as His
C;Genetics:

A;Introns: 38/2; 97/2; 129/3; 154/1; 202/1; 252/3; 279/2; 327/1; 357/2; 429/3; 588/1;
C; Super family: collagen alpha 1(IV) chain 



Query Match 11.2%; Score 85.5; DB 2; Length 1872; 

Best Local Similarity 25.7%; Pred. No. 19; 

Matches 39; Conservative 14; Mismatches 58; Indels 41; Gaps 7; 

Qy 18 RQMQDAAGRRHFHASQCPRPTSPVSTDS-NMSAWIQRARPAKKQKHQPGHLRREAYADD 76 

:! I ::l. I I I I : II II I : II I I 
Db 657 RQQQQHQWQQHQQGSAPPTPVPPSPPQPVTLGAVPAPKAPPPPPKALYPGALGR 710 
Qy 77 LPPPPVP PPAIKSPTVQSKAQLEVRPV MVPKLASIEARTDRSS 119

11:1 II : :| : I: I :||: I 
Db 711 - - PPPMPPMNFDPRWMMIPPYVDPRLLQGRPPLDFYPPGVHPSGLVPR ERS 759 

Qy 120 DRKGGSYKGREALDGRQVTDLR- - -TNPSDPR 148 

I -\ I II MINI: 
Db 760 DSRGLS- - -SEPFDRHAPAMLRERGTPPVDPK 788 



RESULT 14 
T49225 

hypothetical protein F27H5.90 - Arabidopsis thaliana 

C;Species: Arabidopsis thaliana (mouse-ear cress)

C;Date: 02-Jun-2000 #sequence_revision 02-Jun-2000 #text_change 02-Jun-2000

C;Accession: T49225

R;Rieger, M.; Mueller-Auer, S.; Zipp, M.; Schaefer, M.; Mewes, H.W.; Rudd, S.; Lemcke,
submitted to the Protein Sequence Database, April 2000
A;Reference number: Z25018
A;Accession: T49225
A;Status: preliminary
A;Molecule type: DNA
A;Residues: 1-300 <RIE>

A;Cross-references: EMBL:AL163852; GSPDB:GN00061; ATSP:F27H5.90

A;Experimental source: cultivar Columbia; BAC clone F27H5

C;Genetics:

A;Gene: ATSP:F27H5.90

A;Map position: 3

A;Introns: 46/3; 59/3; 105/3; 127/1; 159/3



1 AQAVAAAMAGLKVARR--QMQDAAGRRHFHASQCPRPTSPVSTDSNMSAVVIQKARPA 58 
II llll l!h III: I I II I I I : I II 
514 AQKAAAAAAMGMSPAARASPMGASPSPHHPHPSQFP-PNHPANP--- MYHHHIMMMRAM 569 

59 KKQKHQPGHLRREAYADDLPPPPVPPPAIKSPTVQSKAQLEVRPVMVPKLASI 111 

I llll'' : II : II : I : : | | | :| : 

570 HAQGGQPGH- PGMMPPGMMPPGMMPPGM MPPGMHPGMAGM 608 



Search completed: January 22, 2001, 12:27:33 
Job time: 2130 sec 



Query Match 11.2%; Score 85; DB 2; Length 300;

Best Local Similarity 25.3%; Pred. No. 2.9;

Matches 40; Conservative 23; Mismatches 51; Indels 44; Gaps 8; 

Qy 26 RRHFHASQCPRPTSPVSTDSNM- -SAWIQKARPAKKQ — KHQPGHLRREAYADDL- • 77 

I: 1 1 : 1 I I:: |::: III : ::||| : I 
Db 137 RKIFHSSDIEHVLDLVGAQSSLQDSSLI PGKDQDPLLQSDSENIRRERFESILKT 191 

Qy 78 PPPPVPPPAIKSPT VQSKAQLEVRPVMVPKLASIEA 113 

I :llll I : :|: I I II I I 

Db 192 QEEKGGLVQPKKNISWPGMYLPPPAHASTSRNEEEEAGESQEQGEEE- — PKEAESET 247 

Qy 114 RTDRSSDRRG-GSYRGREALDGR-QVTDLRTNPSDPR 148 

: I::hl I ::|l II I : : I III 
Db 248 NSSSSTNRRGRGRWRGRGRSRGRGPTVNERKPNSQDPR 285 

RESULT 15

T32594

hypothetical protein C02B10.5 - Caenorhabditis elegans
C; Species: Caenorhabditis elegans

C;Date: 29-Oct-1999 #sequence_revision 29-Oct-1999 #text_change 29-Oct-1999

C; Accession: T32594
R;Nelson, J.; White, S.; Hawkins, J.; Wohldmann, P.
submitted to the EMBL Data Library, December 1997
A; Description: The sequence of C. elegans cosmid C02B10.
A; Reference number: Z2U96
A; Accession: T32594

A; Status: preliminary; translated from GB/EMBL/DDBJ
A; Molecule type: DNA

A; Residues: 1-698 <NEL>

A;Cross-references: EMBL:AF038605; PIDN:AAB92020.1; GSPDB:GN00022; CESP:C02B10.5
A; Experimental source: strain Bristol N2; clone C02B10

C; Genetics:
A;Gene: CESP:C02B10.5
A; Map position: 4

A;Introns: 61/3; 102/2; 188/3; 349/2; 641/1 



Query Match 11.2%; Score 85; DB 2; Length 698; 

Best Local Similarity 31.0%; Pred. No. 7.3; 

Matches 35; Conservative 10; Mismatches 48; Indels 20; Gaps 5;



GenCore version 4.5
Copyright (c) 1993 - 2000 Compugen Ltd.

OM protein - protein search, using sw model

Run on: January 22, 2001, 12:29:51 ; Search time 162.41 Seconds
(without alignments)
29.429 Million cell updates/sec

Title: US-09-540-245A-20
Perfect score: 761

1 AQAVAAAAEYAGLKVARRQM REALDGRQVTDLRTNPSDPR 148

Scoring table: BLOSUM62
Gapop 10, Gapext 0.5

Searched: 88757 seqs, 32294092 residues

Total number of hits satisfying chosen parameters:

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
Maximum Match 100%
Listing first 45 summaries

Database : SwissProt_39:*

Pred. No. is the number of results predicted by chance to have a
score greater than or equal to the score of the result being printed,
and is derived by analysis of the total score distribution.

            Query
No.  Score  Match  Length  DB  ID          Description
  1   97.5   12.8     308   1  MP5A_LOLPR  Q40240 lolium pere
  2     89   11.7    1882   1  POL2_TRSVR  P25247 tomato ring
  3   88.5   11.6     821   1  EPS8_MOUSE  Q08509 mus musculu
  4     86   11.3     826   1  AD08_MOUSE  Q05910 mus musculu
  5     85   11.2     364   1  F8I2_HUMAN  P23610 homo sapien
  6     85   11.2    2424   1  CCAA_RABIT  P27884 oryctolagus
  7   84.5   11.1     197   1  IE68_HSV2   P14379 herpes simp
  8     83   10.9    1319   1  SOS1_MOUSE  Q62245 mus musculu
  9     82   10.8     601   1  3BP1_MOUSE  P55194 mus musculu
 10     82   10.8     797   1  FURI_BOVIN  Q28193 bos taurus
 11     82   10.8     867   1  VL96_IRV1   P22856 tipula irid
 12   81.5   10.7     537   1  MYPH_CHICK  Q05623 gallus gall
 13   81.5   10.7     822   1  EPS8_HUMAN  Q12929 homo sapien
 14   81.5   10.7    1603   1  PSC_DROME   P35820 drosophila
 15   80.5   10.6     548   1  ERF_HUMAN   P50548 homo sapien
 16   80.5   10.6    2142   1  BAT2_HUMAN  P48634 homo sapien
 17     80   10.5     127   1  GR14_NEOCA  Q25540 neospora ca
 18     80   10.5     383   1  DEMA_HUMAN  Q08495 homo sapien
 19     80   10.5    2004   1  MOZ_HUMAN   Q92794 homo sapien
 20   79.5   10.4     415   1  ACRO_PIG    P08001 sus scrofa
 21     79   10.4     136   1  SR19_ORYSA  P49964 oryza sativ
 22     79   10.4     299   1  RL22_DROME  P50887 drosophila
 23     79   10.4     484   1  OAR2_LOCMI  Q25322 locusta mig
 24     79   10.4     589   1  VP40_SCMVC  P16046 simian cyto
 25   78.5   10.3     180   1  US10_VZVD   P09311 varicella-z
 26   78.5   10.3     190   1  PP28_HCMVA  P13200 human cytom
 27   78.5   10.3     304   1  INO2_YEAST  P26798 saccharomyc
 28   78.5   10.3     481   1  CAP_CHLVR   P40122 chlorohydra
 29   78.5   10.3     515   1  HMSH_DROME  Q03372 drosophila
 30   78.5   10.3     551   1  ERF_MOUSE   P70459 mus musculu
 31     78   10.2    1333   1  SOS1_HUMAN  Q07889 homo sapien
 32   77.5   10.2     309   1  J1L_HCMVA   P17143 human cytom
 33   77.5   10.2    1125   1  MAP4_MOUSE  P27546 mus musculu
 34     77   10.1     314      FTSQ_MYCTU  O06226 mycobacteri
 35     77   10.1     448      TAU_BOVIN   P29172 bos taurus
 36     77   10.1     559      3BP2_MOUSE  Q06649 mus musculu
 37     77   10.1     935      RNE_HAEIN   P44443 haemophilus
 38          10.1    1080      MI15_CAEEL  Q23356 caenorhabdi
 39          10.1    1394      CNG4_BOVIN  Q28181 bos taurus
 40   76.5   10.1     251      HXB4_FUGRU  O13074 fugu rubrip
 41   76.5   10.1     578      PSP2_YEAST  P50109 saccharomyc
 42   76.5   10.1     678      ABPP_RIPCL  Q27905 riptortus c
 43   76.5   10.1    1132      BAT3_HUMAN  P46379 homo sapien
 44          10.0     484      OAR1_LOCMI  Q25321 locusta mig
 45          10.0     520      RXRB_MOUSE  P28704 mus musculu



RESULT 1 
MP5A_LOLPR

ID MP5A_LOLPR STANDARD; PRT; 308 AA.
AC Q40240;
DT 01-NOV-1997 (Rel. 35, Created)
DT 01-NOV-1997 (Rel. 35, Last sequence update)
DT 15-JUL-1998 (Rel. 36, Last annotation update)
DE MAJOR POLLEN ALLERGEN LOL P 5A PRECURSOR (LOL P VA) (LOL P IB).
GN LOLPIB.
OS Lolium perenne (Perennial ryegrass).
OC Eukaryota; Viridiplantae; Embryophyta; Tracheophyta; Spermatophyta;
OC Magnoliophyta; Liliopsida; Poales; Poaceae; Lolium.
RN [1]
RP SEQUENCE FROM N.A.
RX MEDLINE=91142177; PubMed=1671715;
RA Singh M.B., Hough T., Theerakulpisut P., Avjioglu A., Davies S.,
RA Smith P.M., Taylor P., Simpson R.J., Ward L.D., McCluskey J.,
RA Puy R., Knox R.B.;
RT "Isolation of cDNA encoding a newly identified major allergenic
RT protein of rye-grass pollen: intracellular targeting to the
RT amyloplast.";
RL Proc. Natl. Acad. Sci. U.S.A. 88:1384-1388(1991).
CC -!- SUBCELLULAR LOCATION: STARCH GRANULE.
CC -!- TISSUE SPECIFICITY: POLLEN, STARCH GRANULES.
CC -!- DISEASE: CAUSES GRASS POLLEN ALLERGY.
CC -!- SIMILARITY: BELONGS TO THE POA P IX/PHL P VI FAMILY OF ALLERGENS.
CC
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC
DR EMBL; M59163; AAA33405.1; -.
DR INTERPRO; IPR002914; -.
DR PFAM; PF01620; Pollen_allerg_2; 1.
KW Signal; Allergen; Multigene family.



FT SIGNAL 1 25 POTENTIAL.
FT CHAIN 26 308 MAJOR POLLEN ALLERGEN LOL P 5A.
FT DOMAIN 31 46 ALA/PRO/THR-RICH.
FT DOMAIN 270 289 ALA/THR-RICH.
FT DOMAIN 33 36 POLY-ALA.
FT DOMAIN 270 278 POLY-ALA.
SQ SEQUENCE 308 AA; 31881 MW; 7756025D09E12FFF CRC64;



Query Match 12.8%; Score 97.5; DB 1; Length 308; 

Best Local Similarity 29.6%; Pred. No. 0.25;

Matches 42; Conservative 10; Mismatches 49; Indels 41; Gaps 

Qy 3 AVAAAAEYAGLKVARRQMQDAAGRRHFHASQCPRPTSPVSTDSNMSAWIQKARPAKKQK 62

III I : : :lll I II: I I I 
Db 39 ATPAATPAGGWREGDDRRAEAAGGRQRLASRQPWPPLPTP 78 



Mon Jan 22 13:05:00 2001 



us-09-540-245a-20.rsp 



Page 2 



Qy 63 HQPGHLRREAYADDLPPPPVPPPAIKSPTVQSKAQLEVRPVMVPKLASIEARTDRSSDRK 122

III : II I II I III :|| | ::IM I : I 

Db 79 LRRTSSRSSRPPSPSPPRA-SSPTSAAKA PGLIPRL DTAYD-- 118

Qy 123 GGSYKGREALDGRQVTDLRTNP 144

:| II II II I 
Db 119 -VAYKAAEAHPRGQVRRLRHCP 139 



RESULT 2 
POL2_TRSVR
ID POL2_TRSVR STANDARD; PRT; 1882 AA.
AC P25247;
DT 01-MAY-1992 (Rel. 22, Created)
DT 01-MAY-1992 (Rel. 22, Last sequence update)
DT 01-NOV-1995 (Rel. 32, Last annotation update)
DE RNA2 POLYPROTEIN (207 KDA PROTEIN) [CONTAINS: COAT PROTEIN].
OS Tomato ringspot virus (isolate raspberry) (TomRSV).
OC Viruses; ssRNA positive-strand viruses, no DNA stage; Comoviridae;
OC Nepovirus.
RN [1]
RP SEQUENCE FROM N.A.
RX MEDLINE=91311402; PubMed=1856689;
RA Rott M.E., Tremaine J.H., Rochon D.M.;
RT "Nucleotide sequence of tomato ringspot virus RNA-2.";
RL J. Gen. Virol. 72:1505-1514(1991).

CC -!- SIMILARITY: IDENTICAL FOR THE FIRST 132 AA, AND 75.3% IDENTICAL 
CC FOR THE NEXT 145 AA TO THE RNA1 POLYPROTEIN. 
CC -!- SIMILARITY: TO THE RNA2 POLYPROTEIN OF OTHER NEPOVIRUSES. 
CC -!- CAUTION: IT IS UNCERTAIN WHETHER MET-1 OR MET-122 IS THE
CC INITIATOR.
CC
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC
DR EMBL; D12477; BAA02043.1; -.
DR PIR; JO1093; GNWTR.
KW Polyprotein; Coat protein; Repeat.

FT CHAIN 1321 1882 COAT PROTEIN (POTENTIAL).
FT DOMAIN 554 698 2.5 X TANDEM REPEATS, PRO-RICH.
FT REPEAT 554 606 1.
FT REPEAT 607 659 2.
FT REPEAT 660 698 3 (INCOMPLETE AND APPROXIMATE).
SQ SEQUENCE 1882 AA; 206802 MW; 0F8958B63AE8DD9D CRC64;

Query Match 11.7%; Score 89; DB 1; Length 1882;

Best Local Similarity 29.1%; Pred. No. 7.8; 

Matches 37; Conservative 10; Mismatches 60; Indels 20; Gaps 3; 

Qy 2 QAVAAAAEYAGLKVARRQMQDAAGR— RHFHASQCPRPTSPVSTDSNMSAWIQKARPA 58 

:| I II I : I I II I : I I I : I : II 
Db 193 KAAAIiAAVKMQEAPRIJ\AQKAAISKILRDRDVAALPPPPPPSAARIAAEAELASRAESL 252 

Qy 59 KKQKHQPGHLR-REAYADDLPPPPVPPPAIKSPTVQSKAQLEVRPVMVPKLASIEARTDR 117 

I I I I llll Nil I II: II :l 

Db 253 RRLKAFKTFSRVRPALNTSFPPPPPPPPARSSEL LAAFEAAMNR 296 

Qy 118 SSDRKGG 124 

I :|| 
Db 297 SQPVQGG 303 



RESULT 3 
EPS8_MOUSE 

ID EPS8_MOUSE STANDARD; PRT; 821 AA.
AC Q08509;
DT 01-NOV-1997 (Rel. 35, Created)
DT 01-NOV-1997 (Rel. 35, Last sequence update)
DT 15-DEC-1998 (Rel. 37, Last annotation update)
DE EPIDERMAL GROWTH FACTOR RECEPTOR KINASE SUBSTRATE EPS8.
GN EPS8.
OS Mus musculus (Mouse).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
RN [1]
RP SEQUENCE FROM N.A.
RX MEDLINE=94008987; PubMed=8404850;
RA Fazioli F., Minichiello L., Matoska V., Castagnino P., Miki T.,
RA Wong W.T., di Fiore P.P.;
RT "Eps8, a substrate for the epidermal growth factor receptor kinase,
RT enhances EGF-dependent mitogenic signals.";
RL EMBO J. 12:3799-3808(1993).
RN [2]
RP X-RAY CRYSTALLOGRAPHY (1.5 ANGSTROMS) OF 532-591.
RX MEDLINE=97448677; PubMed=9303002;
RA Kishan K.V.R., Scita G., Wong W.T., di Fiore P.P., Newcomer M.E.;
RT "The SH3 domain of Eps8 exists as a novel intertwined dimer.";
RL Nat. Struct. Biol. 4:739-743(1997).
CC -!- FUNCTION: UPON BINDING TO EGF RECEPTOR ENHANCES EGF-DEPENDENT
CC MITOGENIC SIGNALS. CAN BIND MULTIPLE CELLULAR TARGETS.
CC -!- PTM: PHOSPHORYLATED BY SEVERAL RECEPTOR TYROSINE KINASES.
CC -!- SIMILARITY: CONTAINS 1 SH3 DOMAIN.
CC -!- SIMILARITY: CONTAINS 1 SPLIT PH DOMAIN.
CC
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC
DR EMBL; L21671; AAA16358.1; -.
DR PDB; 1AOJ; 08-JUL-98.
DR MGD; MGI:104684; EPS8.
DR INTERPRO; IPR001452; -.
DR PFAM; PF00018; SH3; 1.
DR PROSITE; PS50002; SH3; 1.
KW SH3 domain; Phosphorylation; 3D-structure.



FT DOMAIN 69 129 PH (FIRST PART).
FT DOMAIN 210 213 POLY-PRO.
FT DOMAIN 322 325 POLY-PRO.
FT DOMAIN 381 414 PH (SECOND PART).
FT DOMAIN 421 440 PRO-RICH.
FT DOMAIN 532 591 SH3.
FT DOMAIN 620 650 PRO-RICH.
FT DOMAIN 658 663 POLY-SER.
SQ SEQUENCE 821 AA; 91738 MW; 6B9EB95DD22D910C CRC64;



Query Match 11.6%; Score 88.5; DB 1; Length 821;
Best Local Similarity 24.1%; Pred. No. 3.7;

Qy 2
Db 137

Qy 62 -PVPPPAIKSPTVQSKA--
Db 193

Qy 97 --VPKL-ASIEARTDRSSDRKGGSYKGREAL 132
Db 253

Qy 133
Db 311

RESULT 4
AD08_MOUSE
ID AD08_MOUSE STANDARD; PRT; 826 AA.
AC Q05910;
DT 01-NOV-1995 (Rel. 32, Created)
DT 01-OCT-1996 (Rel. 34, Last sequence update)
DT 30-MAY-2000 (Rel. 39, Last annotation update)
DE ADAM 8 PRECURSOR (EC 3.4.24.-) (A DISINTEGRIN AND METALLOPROTEINASE
DE DOMAIN 8) (CELL SURFACE ANTIGEN MS2) (MACROPHAGE CYSTEINE-RICH
DE GLYCOPROTEIN) (CD156 ANTIGEN).
GN ADAM8 OR MS2.
OS Mus musculus (Mouse).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=ICR;
RA Yamamoto S., Yoshiyama K., Setoguchi M., Matsuura K., Higuchi Y.,
RA Akizuki S.;
RL Submitted (JAN-1996) to the EMBL/GenBank/DDBJ databases.
RN [2]
RP SEQUENCE FROM N.A.
RC STRAIN=BALB/C; TISSUE=LIVER;
RX MEDLINE=97364747; PubMed=9218457;
RA Kataoka M., Yoshiyama K., Matsuura K., Hijiya N., Higuchi Y.,
RA Yamamoto S.;
RT "Structure of the murine CD156 gene, characterization of its
RT promoter, and chromosomal location.";
RL J. Biol. Chem. 272:18209-18215(1997).
RN [3]
RP PRELIMINARY SEQUENCE FROM N.A.
RC STRAIN=ICR;
RX MEDLINE=91197896; PubMed=1982220;
RA Yoshida S., Setoguchi M., Higuchi Y., Akizuki S., Yamamoto S.;
RT "Molecular cloning of cDNA encoding MS2 antigen, a novel cell surface
RT antigen strongly expressed in murine monocytic lineage.";
RL Int. Immunol. 2:585-591(1990).
CC -!- FUNCTION: POSSIBLE INVOLVEMENT IN EXTRAVASATION OF LEUKOCYTES.
CC -!- SUBCELLULAR LOCATION: TYPE I MEMBRANE PROTEIN.
CC -!- TISSUE SPECIFICITY: MACROPHAGES.
CC -!- SIMILARITY: BELONGS TO PEPTIDASE FAMILY M12B (ZINC
CC METALLOPROTEASE); ALSO KNOWN AS THE REPROLYSIN SUBFAMILY.
CC -!- SIMILARITY: CONTAINS A DISINTEGRIN DOMAIN.
CC
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC
DR EMBL; D10911; BAA21771.1; -.
DR EMBL; X13335; CAA31712.1; -.
DR HSSP; P18619; 1FVL.
DR MEROPS; M12.208; -.
DR MGD; MGI:107825; ADAM8.
DR INTERPRO; IPR000130; -.
DR INTERPRO; IPR000561; -.
DR INTERPRO; IPR001590; -.
DR INTERPRO; IPR001762; -.
DR INTERPRO; IPR002870; -.
DR PFAM; PF01562; Pep_M12B_propep; 1.
DR PFAM; PF01421; Reprolysin; 1.
DR PFAM; PF00200; disintegrin; 1.
DR PROSITE; PS50215; ADAM_MEPRO; 1.
DR PROSITE; PS00427; DISINTEGRIN_1; FALSE_NEG.
DR PROSITE; PS50214; DISINTEGRIN_2; 1.
DR PROSITE; PS00022; EGF_1; UNKNOWN_1.
DR PROSITE; PS01186; EGF_2; UNKNOWN_1.
DR PROSITE; PS00142; ZINC_PROTEASE; 1.
KW Transmembrane; Glycoprotein; Antigen; Zinc; Hydrolase;
KW Metalloprotease; Signal.




FT SIGNAL 1 16 POTENTIAL.
FT CHAIN 17 826 ADAM 8.
FT DOMAIN 17 658 EXTRACELLULAR (POTENTIAL).
FT TRANSMEM 659 683 POTENTIAL.
FT DOMAIN 684 826 CYTOPLASMIC (POTENTIAL).
FT METAL 329 329 ZINC (CATALYTIC) (PROBABLE).
FT ACT_SITE 330 330 BY SIMILARITY.
FT METAL 333 333 ZINC (CATALYTIC) (PROBABLE).
FT METAL 339 339 ZINC (CATALYTIC) (PROBABLE).
FT CARBOHYD 89 89 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 260 260 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 431 431 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 614 614 N-LINKED (GLCNAC...) (POTENTIAL).
SQ SEQUENCE 826 AA; 90046 MW; 3142CC81DBBADFB9 CRC64;



Query Match 11.3%; Score 86; DB 1; Length 826; 

Best Local Similarity 27.7%; Pred. No. 5.8; 

Matches 43; Conservative 19; Mismatches 53; Indels 40; Gaps 9; 

Qy 4 VAAAAEYAGLKVARRQMQDAAGRRHFHASQCPRPTSPVST DSNMSAWIQKARP 57 

II! II: : I: I I: I hi I :l II:: I 
Db 671 VAAMVIVAGIVIYRK APRQIQRRSVAPKPISGLSNPLFYTRDSSL P 716 

Qy 58 AKKQKHQPGHLRREAYADDLPPPPV PPPAIKSPTVQSKAQLEVRPVMVPKLASIE 112 

II : I 'I : : II I: INI I 

Db 717 AKNRPPDPS— ETVSTNQPPRPIVKPKRPPPA--PPGAVSSSPLPV-PVYAPKIPN-Q 768 

Qy 113 ARTDRSSDRKGGSYKGREALDGRQVTDLRTNPSDP 147 

II: I I :H I: I 
Db 759 FRPDPPT KPLPELKPKQVKPTFAPPTPP 796 
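
The summary figures printed with each hit appear to follow from simple ratios. The sketch below is an inference from the numbers in this listing, not something taken from GenCore documentation: the Query Match percentage is consistent with the hit score divided by the reported perfect score (761 for this query), and Best Local Similarity is consistent with the number of matches divided by the total number of alignment columns (matches + conservative + mismatches + indels). The function names are hypothetical.

```python
# Sketch only: both formulas are inferred from the figures in this report,
# not from GenCore documentation. Function names are hypothetical.


def query_match_pct(score: float, perfect_score: float) -> float:
    """Hit score as a percentage of the report's 'Perfect score'."""
    return 100.0 * score / perfect_score


def best_local_similarity_pct(matches: int, conservative: int,
                              mismatches: int, indels: int) -> float:
    """Identical positions as a percentage of all alignment columns."""
    columns = matches + conservative + mismatches + indels
    return 100.0 * matches / columns


if __name__ == "__main__":
    # Values reported for RESULT 4 (Score 86, perfect score 761;
    # Matches 43, Conservative 19, Mismatches 53, Indels 40).
    print(f"Query Match {query_match_pct(86, 761):.1f}%")
    print(f"Best Local Similarity "
          f"{best_local_similarity_pct(43, 19, 53, 40):.1f}%")
```

With the RESULT 4 values this gives 11.3% and 27.7%, the same figures the report prints for that hit. Pred. No. is not reproduced here because it depends on the full score distribution of the search.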



RESULT 5 
F8I2_HUMAN

ID F8I2_HUMAN STANDARD; PRT; 364 AA.
AC P23610;
DT 01-NOV-1991 (Rel. 20, Created)
DT 01-NOV-1991 (Rel. 20, Last sequence update)
DT 01-AUG-1992 (Rel. 23, Last annotation update)
DE FACTOR VIII INTRON 22 PROTEIN (CPG ISLAND PROTEIN).
GN F8A.
OS Homo sapiens (Human).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
RN [1]
RP SEQUENCE FROM N.A.
RC TISSUE=LIVER;
RX MEDLINE=90243242; PubMed=2110545;
RA Levinson B., Kenwrick S., Lakich D., Hammonds G. Jr., Gitschier J.;
RT "A transcribed gene in an intron of the human factor VIII gene.";
RL Genomics 7:1-11(1990).
CC -!- FUNCTION: NOT KNOWN. POSSIBLE HOUSEKEEPING ROLE.
CC -!- TISSUE SPECIFICITY: PRODUCED ABUNDANTLY IN A WIDE VARIETY OF
CC CELL TYPES.
CC
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC
DR EMBL; M34677; AAA35713.1; -.
DR PIR; A34579; A34579.
DR MIM; 305423; -.
SQ SEQUENCE 364 AA; 38647 MW; 2C2B78DA0F2A820B CRC64;

Query Match 11.2%; Score 85; DB 1; Length 364;
Best Local Similarity 27.3%; Pred. No. 3;
Matches 39; Conservative 15; Mismatches 53; Indels 36; Gaps 6;

Qy
Db 136

Qy 49 --AWIQKARPAKKQKHQPGHLRREAYADDLPPPPVPPPAIKSPTVQSKAQLEVRPV 103
Db 196 --VQSLPPPPPPAPA-- --RARGDARP- 238

Qy 104
Db 239

RESULT 6 

CCAA_RABIT
ID CCAA_RABIT STANDARD; PRT; 2424 AA.
AC P27884; P27883;
DT 01-JUL-1993 (Rel. 26, Created)
DT 01-JUL-1993 (Rel. 26, Last sequence update)
DT 30-MAY-2000 (Rel. 39, Last annotation update)
DE VOLTAGE-DEPENDENT P/Q-TYPE CALCIUM CHANNEL ALPHA-1A SUBUNIT (CALCIUM
DE CHANNEL, L TYPE, ALPHA-1 POLYPEPTIDE ISOFORM 4) (BRAIN CALCIUM CHANNEL
DE I) (BI).
GN CACNA1A OR CACNL1A4 OR CACH4 OR CACN3.
OS Oryctolagus cuniculus (Rabbit).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Lagomorpha; Leporidae; Oryctolagus.
RN [1]
RP SEQUENCE FROM N.A.
RC TISSUE=BRAIN;
RX MEDLINE=91187110; PubMed=1849233;
RA Mori Y., Friedrich T., Kim M.-S., Mikami A., Nakai J., Ruth P.,
RA Bosse E., Hofmann F., Flockerzi V., Furuichi T., Mikoshiba K.,
RA Imoto K., Tanabe T., Numa S.;
RT "Primary structure and functional expression from complementary DNA
RT of a brain calcium channel.";
RL Nature 350:398-402(1991).
RN [2]
RP BETA-SUBUNIT BINDING DOMAIN, AND MUTAGENESIS.
RX MEDLINE=94150724; PubMed=7509046;
RA Pragnell M., de Waard M., Mori Y., Tanabe T., Snutch T.P.,
RA Campbell K.P.;
RT "Calcium channel beta-subunit binds to a conserved motif in the I-II
RT cytoplasmic linker of the alpha 1-subunit.";
RL Nature 368:67-70(1994).
CC -!- FUNCTION: VOLTAGE-SENSITIVE CALCIUM CHANNELS (VSCC) MEDIATE THE
CC ENTRY OF CALCIUM IONS INTO EXCITABLE CELLS AND ARE ALSO INVOLVED
CC IN A VARIETY OF CALCIUM-DEPENDENT PROCESSES, INCLUDING MUSCLE
CC CONTRACTION, HORMONE OR NEUROTRANSMITTER RELEASE, GENE EXPRESSION,
CC CELL MOTILITY, CELL DIVISION AND CELL DEATH. THE ISOFORM ALPHA-1A
CC GIVES RISE TO P AND/OR Q-TYPE CALCIUM CURRENTS. P/Q-TYPE CALCIUM
CC CHANNELS BELONG TO THE "HIGH-VOLTAGE ACTIVATED" (HVA) GROUP AND
CC ARE BLOCKED BY THE FUNNEL TOXIN (FTX) AND BY THE OMEGA-AGATOXIN-
CC IVA (OMEGA-AGA-IVA). THEY ARE HOWEVER INSENSITIVE TO
CC DIHYDROPYRIDINES (DHP), AND OMEGA-CONOTOXIN-GVIA (OMEGA-CTX-
CC GVIA).
CC -!- SUBUNIT: VOLTAGE-DEPENDENT CALCIUM CHANNELS ARE MULTISUBUNIT
CC COMPLEXES, CONSISTING OF ALPHA-1, ALPHA-2, BETA AND DELTA SUBUNITS
CC IN A 1:1:1:1 RATIO. THE CHANNEL ACTIVITY IS DIRECTED BY THE PORE-
CC FORMING AND VOLTAGE-SENSITIVE ALPHA-1 SUBUNIT. IN MANY CASES, THIS
CC SUBUNIT IS SUFFICIENT TO GENERATE VOLTAGE-SENSITIVE CALCIUM
CC CHANNEL ACTIVITY. THE AUXILIARY SUBUNITS BETA AND ALPHA-2/DELTA
CC LINKED BY A DISULFIDE BRIDGE REGULATE THE CHANNEL ACTIVITY.
CC -!- SUBCELLULAR LOCATION: INTEGRAL MEMBRANE PROTEIN.
CC -!- ALTERNATIVE PRODUCTS: IN THE BRAIN, A SHORT ISOFORM BI-1/1A-1 AND
CC A LONG ISOFORM BI-2/1A-2 (SHOWN HERE), ARE PRODUCED BY ALTERNATIVE
CC SPLICING.
CC -!- TISSUE SPECIFICITY: BRAIN-SPECIFIC. PURKINJE CELLS CONTAIN
CC PREDOMINANTLY P-TYPE VSCC, THE Q-TYPE BEING A PROMINENT CALCIUM
CC CURRENT IN CEREBELLAR GRANULE CELLS.

CC -!- DOMAIN: EACH OF THE FOUR INTERNAL REPEATS CONTAINS FIVE
CC HYDROPHOBIC TRANSMEMBRANE SEGMENTS (S1, S2, S3, S5, S6) AND ONE
CC POSITIVELY CHARGED TRANSMEMBRANE SEGMENT (S4). S4 SEGMENTS
CC PROBABLY REPRESENT THE VOLTAGE-SENSOR AND ARE CHARACTERIZED BY A
CC SERIES OF POSITIVELY CHARGED AMINO ACIDS AT EVERY THIRD POSITION.
CC -!- SIMILARITY: BELONGS TO THE CALCIUM CHANNEL ALPHA-1 SUBUNITS
CC FAMILY.
CC
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC
DR EMBL; X57477; CAA40715.1; -.
DR EMBL; X57689; CAA40872.1; -.
DR EMBL; X57476; CAA40714.1; -.
DR EMBL; X57688; CAA40871.1; -.
DR INTERPRO; IPR000636; -.
DR INTERPRO; IPR002077; -.
DR PFAM; PF00520; ion_trans; 4.
DR PRINTS; PR00167; CACHANNEL.
KW Ionic channel; Transmembrane; Ion transport; Voltage-gated channel;
KW Calcium channel; Glycoprotein; Repeat; Multigene family;
KW Calcium-binding; Phosphorylation; Alternative splicing.




FT REPEAT 85 363 I.
FT REPEAT 473 717 II.
FT REPEAT 1240 1523 III.
FT REPEAT 1560 1823 IV.
FT DOMAIN 1 98 CYTOPLASMIC (POTENTIAL).
FT TRANSMEM 99 117 S1 OF REPEAT I (POTENTIAL).
FT DOMAIN 118 135 EXTRACELLULAR (POTENTIAL).
FT TRANSMEM 136 155 S2 OF REPEAT I (POTENTIAL).
FT DOMAIN 156 167 CYTOPLASMIC (POTENTIAL).
FT TRANSMEM 168 185 S3 OF REPEAT I (POTENTIAL).
FT DOMAIN 186 190 EXTRACELLULAR (POTENTIAL).
FT TRANSMEM 191 209 S4 OF REPEAT I (POTENTIAL).
FT DOMAIN 210 228 CYTOPLASMIC (POTENTIAL).
FT TRANSMEM 229 248 S5 OF REPEAT I (POTENTIAL).
FT DOMAIN 249 335 EXTRACELLULAR (POTENTIAL).
FT TRANSMEM 336 360 S6 OF REPEAT I (POTENTIAL).
FT DOMAIN 361 487 CYTOPLASMIC (POTENTIAL).
FT TRANSMEM 488 506 S1 OF REPEAT II (POTENTIAL).
FT DOMAIN 507 521 EXTRACELLULAR (POTENTIAL).
FT TRANSMEM 522 541 S2 OF REPEAT II (POTENTIAL).
FT DOMAIN 542 549 CYTOPLASMIC (POTENTIAL).
FT TRANSMEM 550 568 S3 OF REPEAT II (POTENTIAL).
FT DOMAIN 569 578 EXTRACELLULAR (POTENTIAL).
FT TRANSMEM 579 597 S4 OF REPEAT II (POTENTIAL).
FT DOMAIN 598 616 CYTOPLASMIC (POTENTIAL).
FT TRANSMEM 617 636 S5 OF REPEAT II (POTENTIAL).
FT DOMAIN 637 689 EXTRACELLULAR (POTENTIAL).
FT TRANSMEM 690 714 S6 OF REPEAT II (POTENTIAL).
FT DOMAIN 715 1253 CYTOPLASMIC (POTENTIAL).
FT TRANSMEM 1254 1272 S1 OF REPEAT III (POTENTIAL).
FT DOMAIN 1273 1288 EXTRACELLULAR (POTENTIAL).
FT TRANSMEM 1289 1308 S2 OF REPEAT III (POTENTIAL).
FT DOMAIN 1309 1320 CYTOPLASMIC (POTENTIAL).
FT TRANSMEM 1321 1339 S3 OF REPEAT III (POTENTIAL).
FT DOMAIN 1340 1350 EXTRACELLULAR (POTENTIAL).
FT TRANSMEM 1351 1369 S4 OF REPEAT III (POTENTIAL).
FT DOMAIN 1370 1388 CYTOPLASMIC (POTENTIAL).
FT TRANSMEM 1389 1408 S5 OF REPEAT III (POTENTIAL).
FT DOMAIN 1409 1495 EXTRACELLULAR (POTENTIAL).
FT TRANSMEM 1496 1520 S6 OF REPEAT III (POTENTIAL).
FT DOMAIN 1521 1575 CYTOPLASMIC (POTENTIAL).
FT TRANSMEM 1576 1604 S1 OF REPEAT IV (POTENTIAL).
FT DOMAIN 1605 1609 EXTRACELLULAR (POTENTIAL).
FT TRANSMEM 1610 1629 S2 OF REPEAT IV (POTENTIAL).
FT DOMAIN 1630 1637 CYTOPLASMIC (POTENTIAL).
FT TRANSMEM 1638 1656 S3 OF REPEAT IV (POTENTIAL).
FT DOMAIN 1657 1665 EXTRACELLULAR (POTENTIAL).
FT TRANSMEM 1666 1684 S4 OF REPEAT IV (POTENTIAL).
FT DOMAIN 1685 1703 CYTOPLASMIC (POTENTIAL).
FT TRANSMEM 1704 1723 S5 OF REPEAT IV (POTENTIAL).
FT DOMAIN 1724 1795 EXTRACELLULAR (POTENTIAL).
FT TRANSMEM 1796 1820 S6 OF REPEAT IV (POTENTIAL).
FT DOMAIN 1821 2424 CYTOPLASMIC (POTENTIAL).
FT DOMAIN 13 18 POLY-GLY.
FT DOMAIN 727 732 POLY-GLU.
FT DOMAIN 1004 1010 POLY-GLY.
FT DOMAIN 1012 1017 POLY-ARG.
FT DOMAIN 2219 2227 POLY-HIS.
FT DOMAIN 2242 2246 POLY-ARG.
FT DOMAIN 2288 2297 POLY-ARG.
FT DOMAIN 2298 2301 POLY-GLY.
FT DOMAIN 2372 2377 POLY-PRO.
FT DOMAIN 2411 2416 POLY-GLY.
FT DOMAIN 383 400 BINDING TO THE BETA SUBUNIT.
FT SITE 318 318 CALCIUM ION SELECTIVITY AND PERMEABILITY
FT (BY SIMILARITY).
FT SITE 668 668 CALCIUM ION SELECTIVITY AND PERMEABILITY
FT (BY SIMILARITY).
FT SITE 1469 1469 CALCIUM ION SELECTIVITY AND PERMEABILITY
FT (BY SIMILARITY).
FT SITE 1765 1765 CALCIUM ION SELECTIVITY AND PERMEABILITY
FT (BY SIMILARITY).
FT MOD_RES 1831 1831 PHOSPHORYLATION (BY CAPK) (POTENTIAL).
FT CA_BIND 1849 1860 BY SIMILARITY.
FT CARBOHYD 283 283 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 1665 1665 N-LINKED (GLCNAC...) (POTENTIAL).
FT VARSPLIC 772 1051 MISSING (IN ISOFORM CBP107).
FT VARSPLIC 772 1120 MISSING (IN ISOFORM CBP103).
FT VARSPLIC 1857 LYRDMYAMLRHMPPPLGLGKNCPARVAY -> HYKDMYSLL
FT RVISPPLGLGKKCPHRVAC (IN ISOFORM
FT CBP101/CBP109).
FT VARSPLIC 2230 2273 RGPGRVSPGVSARRRRRGPVARVRPARAPALAHARARARAP
FT ARL -> PAAADKERYGPQDRPDHGHGRARARDQRWSRSPS
FT EGREHTTHRQ (IN ISOFORM BI-1/1A-1).
FT VARSPLIC 2274 2424 MISSING (IN ISOFORM BI-1/1A-1).
FT VARIANT 419 419 MISSING (IN ISOFORM CBP315).
FT VARIANT 877 877 A -> T (IN ISOFORM CBS).
FT VARIANT 1104 1104 S -> N (IN ISOFORM CBS).
FT MUTAGEN 386 386 E->S: REDUCED BETA-SUBUNIT INTERACTION.
FT MUTAGEN 389 389 L->H: REDUCED BETA-SUBUNIT INTERACTION.
FT MUTAGEN 392 392 Y->S: REDUCED BETA-SUBUNIT INTERACTION.
FT MUTAGEN 400 400 E->A: NO EFFECT ON BETA-SUBUNIT
FT INTERACTION.
SQ SEQUENCE 2424 AA; 273228 MW; F7CC4D0AB4B45604 CRC64;



419 
877 



419 
877 



1104 1104 
386 386 
389 389 
392 
400 



392 
400 



Query Match 11.2%; Score 85; DB 1; Length 2424;
Best Local Similarity 25.6%; Pred. No. 21;
Matches 34; Conservative 15; Mismatches 46; Indels 38; Gaps

Qy 18 RQMQDAAGRRHFHASQCPRPTSPVSTDS— NMSAWIQKARPAKKQKHQPG 66 

I: : II : I I |:: II I : : Mil |: II 
Db 2294 RRRRGGGGRA— LRRAPGPREPLAQDSPGRGPSVCLARAARPAGPQRLLPGPRTGQAPR 2350 

Qy 67 HLRREAYADDLPPPPVPPPAIKSPTVQSKAQLEVRPVMVPKLASIEARTDR 117 

::M I III III :| I |: ::|: ! 

Db 2351 ARLPQKPARSVQRERRGLVLSPPP - PPPGELAP RAHPARTPRPGPGDSRSRR 2401 

Qy 118 SSDR KGG 124 

.1 Ml 
Db 2402 GGRRWTASAGKGG 2414 



RESULT 7 
IE68_HSV2

ID IE68_HSV2 STANDARD; PRT; 197 AA.
AC P14379;
DT 01-JAN-1990 (Rel. 13, Created)
DT 01-JAN-1990 (Rel. 13, Last sequence update)
DT 01-DEC-1992 (Rel. 24, Last annotation update)
DE IMMEDIATE-EARLY PROTEIN IE4 (IE68) (FRAGMENT).
GN US1.
OS Herpes simplex virus (type 2).
OC Viruses; dsDNA viruses, no RNA stage; Herpesviridae;
OC Alphaherpesvirinae; Simplexvirus.
RN [1]
RP SEQUENCE FROM N.A.
RX MEDLINE=84137573; PubMed=6321634;
RA Whitton J.L., Clements J.B.;
RT "The junctions between the repetitive and the short unique sequences
RT of the herpes simplex virus genome are determined by the polypeptide-
RT coding regions of two spliced immediate-early mRNAs.";
RL J. Gen. Virol. 65:451-466(1984).
CC -!- SIMILARITY: BELONGS TO A FAMILY THAT GROUP TOGETHER HSV-1 AND
CC HSV-2 IE-68 (US1), EHV-1 65, EHV-4 (ORF4), PRV RSP40, AND VZV 63.
CC
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC
DR EMBL; M29384; AAA45848.1; -.
KW Early protein.
FT NON_TER 197 197
SQ SEQUENCE 197 AA; 21510 MW; 314C23F55C795CBB CRC64;

Query Match 11.1%; Score 84.5; DB 1; Length 197;

Best Local Similarity 26.5%; Pred. No. 1.8;

Matches 26; Conservative 12; Mismatches 33; Indels 27; Gaps 3;

Qy 76 DLPP - PPVPPPAIKS--PTVQSKAQLEVRPVMVPKLASIEARTDRSS 119

1:11 llll: I : I :: I :| : : :| I: I

Db 3 DIPPDPPALDTTPANHAPPSPLPGSRKRRRPVLPSSSESEGKPDIESESSSTESSEDEVG 62

Qy 120 DRKGGSYKGREALDGRQVTDLR TNPSD 146

I :|| : I II II I III

Db 63 DLRGGRRRPPRELGGRYFLDLSAESTTGTESEGTGPSD 100



RESULT 8
SOS1_MOUSE

ID SOS1_MOUSE STANDARD; PRT; 1319 AA.
AC Q62245; Q62244;
DT 15-JUL-1999 (Rel. 38, Created)
DT 15-JUL-1999 (Rel. 38, Last sequence update)
DT 30-MAY-2000 (Rel. 39, Last annotation update)
DE SON OF SEVENLESS PROTEIN HOMOLOG 1 (SOS-1) (MSOS-1).
GN SOS1.
OS Mus musculus (Mouse).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=SWISS; TISSUE=EYE;
RX MEDLINE=92335323; PubMed=1631150;
RA Bowtell D., Fu P., Simon M., Senior P.;
RT "Identification of murine homologues of the Drosophila son of
RT sevenless gene: potential activators of ras.";
RL Proc. Natl. Acad. Sci. U.S.A. 89:6511-6515(1992).
RN [2]
RP STRUCTURE BY NMR OF 415-548.
RX MEDLINE=97360234; PubMed=9217262;
RA Koshiba S., Kigawa T., Kim J.-H., Shirouzu M., Bowtell D.,
RA Yokoyama S.;
RT "The solution structure of the pleckstrin homology domain of mouse
RT Son-of-sevenless 1 (mSos1).";
RL J. Mol. Biol. 269:579-591(1997).
CC -!- FUNCTION: PROMOTES THE EXCHANGE OF RAS-BOUND GDP BY GTP (BY
CC SIMILARITY).



CC -!- TISSUE SPECIFICITY: EXPRESSED IN MOST EMBRYONIC AND ADULT TISSUES.
CC -!- SIMILARITY: CONTAINS 1 DBL-HOMOLOGY DOMAIN (DH). 
CC -!- SIMILARITY: CONTAINS 1 PH DOMAIN. 
CC -!- SIMILARITY: CONTAINS 1 RASGEF DOMAIN.
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
DR EMBL; Z11574; CAA77662.1; -.
DR EMBL; Z11578; CAA77665.1; -.
DR PDB; 1PMS; 15-MAY-97.
DR MGD; MGI:98354; SOS1.
DR INTERPRO; IPR000219; -.
DR INTERPRO; IPR000651; -.
DR INTERPRO; IPR001849; -.
DR INTERPRO; IPR001895; -.
DR PFAM; PF00169; PH; 1.
DR PFAM; PF00617; RasGEF; 1.
DR PFAM; PF00618; RasGEFN; 1.
DR PFAM; PF00621; RhoGEF; 1.
DR PROSITE; PS00720; GDS_CDC25; 1.
DR PROSITE; PS50003; PH_DOMAIN; 1.
KW Guanine-nucleotide releasing factor; 3D-structure.
FT DOMAIN    202   443
FT DOMAIN    444   548  PH.
FT DOMAIN    777   963  RASGEF.
FT DOMAIN   1244  1247  POLY-PRO.
SQ SEQUENCE 1319 AA; 150882 MW; 3286088A5BA0A4A6 CRC64;



Query Match 10.9%; Score 83; DB 1; Length 1319; 

Best Local Similarity 25.2%; Pred. No. 17; 

Matches 31; Conservative 20; Mismatches 44; Indels 28; Gaps 5; 

Qy 35 PRPTSPVSTDSNMSAW- - IQKARPARKQRHQPGHLRREAYADDLP-PPPVPP 84

I I ! I::::: :l II: : 1-1 llllll
Db 1089 PPPASGTSSNTDVCSVFDSDHSASPFHSRSASVSSISLSKGTDEVPVPPPVPPRRRPESA 1148

Qy 85 PAIRSPTVQSKAQLEVRPVMVPRLASIEARTDRSSDRKGGSYKGREALDGRQVTDLRTNP 144

Til ||: |: |:|: : :| : I I ::| II:

Db 1149 PAESSPSKIMSRHLDSPPAIPPRQPTSKAYSPRYS ISD-RTSI 1190

Qy 145 SDP 147

III

Db 1191 SDP 1193



RESULT 9
3BP1_MOUSE
ID 3BP1_MOUSE STANDARD; PRT; 601 AA.
AC P55194;

01-OCT-1996 (Rel. 34, Created) 
01-OCT-1996 (Rel. 34, Last sequence update) 
01-NOV-1997 (Rel. 35, Last annotation update) 
SH3-BINDING PROTEIN 3BP-1.
SH3BP1 OR 3BP1.
Mus musculus (Mouse).

Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
[1]

SEQUENCE FROM N.A.
MEDLINE=95347339; PubMed=7621827;

Cicchetti P., Ridley A.J., Zheng Y., Cerione R.A., Baltimore D.;
"3BP-1, an SH3 domain binding protein, has GAP activity for Rac and
inhibits growth factor-induced membrane ruffling in fibroblasts.";
EMBO J. 14:3127-3135(1995).
[2]

SEQUENCE OF 263-601 FROM N.A.

MEDLINE=92358242; PubMed=1379745;

Cicchetti P., Mayer B.J., Thiel G., Baltimore D.;

"Identification of a protein that binds to the SH3 region of Abl and
is similar to Bcr and GAP-rho.";

Science 257:803-806(1992).

-!- FUNCTION: BINDS DIFFERENTIALLY TO THE SH3 DOMAINS OF CERTAIN
PROTEINS OF SIGNAL TRANSDUCTION PATHWAYS. THIS PROTEIN BINDS
PREFERENTIALLY TO C-ABL PROTO-ONCOGENE, SRC AND GRB2. SHOWS
GAP ACTIVITY FOR RAC-RELATED PROTEINS BUT NOT FOR RHO- OR
RAS-RELATED PROTEINS. IT INHIBITS PDGF-INDUCED MEMBRANE RUFFLING
MEDIATED BY RAC.

-!- TISSUE SPECIFICITY: EXPRESSED IN ALL TISSUES EXAMINED. HIGHEST
LEVELS FOUND IN SPLEEN AND BRAIN, LOWEST IN HEART AND LIVER.

-!- SIMILARITY: CONTAINS 1 SH3-BINDING DOMAIN.

-!- SIMILARITY: CONTAINS 1 GAP DOMAIN.

-!- SIMILARITY: SOME, TO HUMAN BCR AND N-CHIMAERIN.



This SWISS-PROT entry is copyright. It is produced through a collaboration
between the Swiss Institute of Bioinformatics and the EMBL outstation -
the European Bioinformatics Institute. There are no restrictions on its
use by non-profit institutions as long as its content is in no way
modified and this statement is not removed. Usage by and for commercial
entities requires a license agreement (See http://www.isb-sib.ch/announce/
or send an email to license@isb-sib.ch).

EMBL; X87671; CAA61011.1; -.
MGD; MGI:104603; SH3BP1.
INTERPRO; IPR000198; -.
PFAM; PF00620; RhoGAP; 1.



KW GTPase activation; SH3-binding.
FT DOMAIN    113   117  POLY-GLU.
FT DOMAIN    218   221  POLY-LEU.
FT DOMAIN    193   358  GAP DOMAIN.
FT DOMAIN    529   537  SH3-BINDING.
FT DOMAIN    582   585  POLY-PRO.
SQ SEQUENCE 601 AA; 65260 MW; 0FBBF357EEB02ECE CRC64;



Query Match 10.8%; Score 82; DB 1; Length 601; 

Best Local Similarity 23.1%; Pred. No. 8.8; 

Matches 39; Conservative 21; Mismatches 59; Indels 50; Gaps 

Qy 1 AQAVAAAAEYAGLKVARRQMQDAAGRRHFHASQCPRPTS PVSTDSNMSAWIQK 54 

I I I i :| I: :| hill 11- : : 

Db 427 APATTPAPTLAPASMAVRERTEA DLPKPTSPKVSRNPTETAASAEDMTRKT 477 

Qy 55 ARPAKKQ- - - : KHQPGHLRREAYADDLPP PPVPPPAIK 88 

III : II I I III I :IM 

Db 478 KRPAPARPTMPPPQPSSTRSSPPAPSLPPGSVSPGTPQALPRRLVGTSLRAPTMPPPLPP 537 

Qy 89 SPTVQSKAQLEVRPVMVPRLASIEARTDR- - -SSDRKGGSYKGREALDG 134 

I :: I II :::: |: 1= : II :| lh I 
Db 538 VPPQPARRQSRRLPAS - PVISNMPAQVDQGVATEDR EGPEAVGG 580 



RESULT 10
FURI_BOVIN

ID FURI_BOVIN STANDARD; PRT; 797 AA.

AC Q28193; 

DT 01-NOV-1997 (Rel. 35, Created) 

DT 01-NOV-1997 (Rel. 35, Last sequence update) 

DT 30-MAY-2000 (Rel. 39, Last annotation update) 

DE FURIN PRECURSOR (EC 3.4.21.75) (PAIRED BASIC AMINO ACID RESIDUE
DE CLEAVING ENZYME) (PACE) (DIBASIC PROCESSING ENZYME) (TRANS GOLGI
DE NETWORK PROTEASE FURIN).

GN PACE OR FUR.

OS Bos taurus (Bovine).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Mammalia; Eutheria; Cetartiodactyla; Ruminantia; Pecora; Bovoidea; 

OC Bovidae; Bovinae; Bos. 



RP SEQUENCE FROM N.A.
RC TISSUE=KIDNEY;
MEDLINE=95105228; PubMed=7806563;

Vey M., Schaefer W., Berghoefer S., Klenk H., Garten W.;

"Maturation of the trans-Golgi network protease furin:
compartmentalization of propeptide removal, substrate cleavage, and
COOH-terminal truncation.";

J. Cell Biol. 127:1829-1842(1994).

-!- FUNCTION: FURIN IS LIKELY TO REPRESENT THE UBIQUITOUS ENDOPROTEASE
ACTIVITY WITHIN CONSTITUTIVE SECRETORY PATHWAYS AND CAPABLE OF
CLEAVAGE AT THE RX(K/R)R CONSENSUS MOTIF.

-!- CATALYTIC ACTIVITY: RELEASE OF MATURE PROTEINS FROM THEIR
PROPROTEINS BY CLEAVAGE OF ARG-XAA-YAA-ARG-|-ZAA BONDS, WHERE XAA
CAN BE ANY AMINO ACID AND YAA IS ARG OR LYS. RELEASES ALBUMIN,
COMPLEMENT COMPONENT C3 AND VON WILLEBRAND FACTOR FROM THEIR
RESPECTIVE PRECURSORS.

-!- COFACTOR: CALCIUM-DEPENDENT (BY SIMILARITY).

-!- ENZYME REGULATION: COULD BE INHIBITED BY THE NOT SECONDLY CLEAVED



-!- SUBCELLULAR LOCATION: SEEMS TO BE LOCALIZED INTRACELLULARLY TO THE
TRANS GOLGI NETWORK. PROPEPTIDE CLEAVAGE IS A PREREQUISITE FOR
EXIT OF FURIN MOLECULES OUT OF THE ENDOPLASMIC RETICULUM (ER).
SECOND CLEAVAGE IN THE PROPEPTIDE OCCUR IN THE TRANS GOLGI NETWORK
(TGN), WHICH IS FOLLOWED BY THE RELEASE OF THE PROPEPTIDE BOUND TO
FURIN AND THE ACTIVATION OF FURIN.

-!- TISSUE SPECIFICITY: SEEMS TO BE EXPRESSED UBIQUITOUSLY.

-!- DOMAIN: CONTAINS A HOMO B DOMAIN, ALSO KNOWN AS P OR MIDDLE DOMAIN
AND A SUBTILISIN-LIKE CATALYTIC DOMAIN, ESSENTIAL DOMAINS FOR
CATALYTIC ACTIVITY.

-!- DOMAIN: CONTAINS A CYTOPLASMIC DOMAIN RESPONSIBLE FOR ITS TGN
LOCALIZATION AND RECYCLING FROM THE CELL SURFACE.

-!- PTM: THE PROPEPTIDE IS AUTOCATALYTICALLY REMOVED THROUGH AN
INTRAMOLECULAR CLEAVAGE (PROBABLY IN THE ENDOPLASMIC RETICULUM
(ER)). IN THE TGN THE SECOND CLEAVAGE IN THE PROPEPTIDE COULD LEAD
TO THE ACTIVATION OF FURIN.

-!- SIMILARITY: BELONGS TO PEPTIDASE FAMILY S8; ALSO KNOWN AS THE
SUBTILASE FAMILY. HIGH SIMILARITY WITH OTHER FURIN-LIKE ENZYMES.

This SWISS-PROT entry is copyright. It is produced through a collaboration
between the Swiss Institute of Bioinformatics and the EMBL outstation -
the European Bioinformatics Institute. There are no restrictions on its
use by non-profit institutions as long as its content is in no way
modified and this statement is not removed. Usage by and for commercial
entities requires a license agreement (See http://www.isb-sib.ch/announce/
or send an email to license@isb-sib.ch).

EMBL; X75956; CAA53569.1; -.
HSSP; Q99405; 1MPT.
MEROPS; S08.071; -.
INTERPRO; IPR000209; -.
INTERPRO; IPR002884; -.
PFAM; PF01483; P; 1.
PFAM; PF00082; Peptidase_S8; 1.
PRINTS; PR00723; SUBTILISIN.
PROSITE; PS00136; SUBTILASE_ASP; 1.
PROSITE; PS00137; SUBTILASE_HIS; 1.
PROSITE; PS00138; SUBTILASE_SER; FALSE_NEG.



KW Hydrolase; Serine protease; Transmembrane; Glycoprotein; Signal;
KW Zymogen.
FT SIGNAL      1    24  POTENTIAL.
FT PROPEP     25   107  BY SIMILARITY.
FT CHAIN     108   797  FURIN.
FT DOMAIN    556   708  CYS-RICH.
FT TRANSMEM  719   741  POTENTIAL.
FT ACT_SITE  153   153  CHARGE RELAY SYSTEM (BY SIMILARITY).
FT ACT_SITE  194   194  CHARGE RELAY SYSTEM (BY SIMILARITY).
FT ACT_SITE  368   368  CHARGE RELAY SYSTEM (BY SIMILARITY).
FT DISULFID  211   360  POTENTIAL.
FT DISULFID  303   333  POTENTIAL.
FT CARBOHYD  387   387  N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD  440   440  N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD  553   553  N-LINKED (GLCNAC...) (POTENTIAL).
FT SITE       70    75  CLEAVAGE (SECOND AUTO-).
FT SITE      104   107  CLEAVAGE (FIRST AUTO-).
FT SITE      498   500  CELL ATTACHMENT SITE (POTENTIAL).
FT SITE      762   765  CELL SURFACE SIGNAL.
FT SITE      776   782  TRANS GOLGI NETWORK SIGNAL.

SQ SEQUENCE 797 AA; 87250 MW; 466F28EC0246C3D2 CRC64; 



Query Match 10.8%; Score 82; DB 1; Length 797; 

Best Local Similarity 27.1%; Pred. No. 12;

Matches 23; Conservative 17; Mismatches 33; Indels 12; Gaps 3; 

Qy 30 HAS QCPRPTSPVSTDSNMSAWIQKARPAKKQKHQPGHLRREAYADDLPPPPVPPP 85 

III I I II :| 1:1 ::: : I l|:: MM III 

Db 645 HASCATCQGPAPTDCLSCPSHASLDPVEQTCSRQSQS- 



86 AIKSPTVQSKAQLEVRPVMVPKLAS 110 

I : :': :: | :|:: : 
699 A-EVATEPRLRADLLPSHLPEWA 721 



RESULT 11
VL96_IRV1

ID VL96_IRV1 STANDARD; PRT; 867 AA.
AC P22856; 

DT 01-AUG-1991 (Rel. 19, Created) 

DT 01-AUG-1991 (Rel. 19, Last sequence update) 

DT 01-MAY-1992 (Rel. 22, Last annotation update) 

DE L96 PROTEIN. 

GN L96. 

Tipula iridescent virus (TIV) (Insect iridescent virus type 1). 
Viruses; dsDNA viruses, no RNA stage; Iridoviridae; Iridovirus. 
[1] 

SEQUENCE FROM N.A. 
MEDLINE=91078646; PubMed=1701750;
Home W.A., Tajbakhsh S., Seligy V.L.;

"Molecular cloning and characterization of a late Tipula iridescent
virus gene.";
Gene 94:243-248(1990).

-!- FUNCTION: MAY BE INVOLVED IN TIV GENOMIC DNA PACKAGING IN A
MANNER RELATED TO THE GAG POLYPROTEINS OF THE MAMMALIAN VIRUSES.


This SWISS-PROT entry is copyright. It is produced through a collaboration
between the Swiss Institute of Bioinformatics and the EMBL outstation -
the European Bioinformatics Institute. There are no restrictions on its
use by non-profit institutions as long as its content is in no way
modified and this statement is not removed. Usage by and for commercial
entities requires a license agreement (See http://www.isb-sib.ch/announce/
or send an email to license@isb-sib.ch).

EMBL; M62953; AAA47919.1; -.

PIR; JH0225; JH0225.

Repeat; DNA packaging; DNA-binding.

DOMAIN 697 867 RICH IN HYDROPHOBIC RESIDUES.

SEQUENCE 867 AA; 96011 MW; F19DBDE8FE5CA103 CRC64; 



Query Match 10.8%; Score 82; DB 1; Length 867;

Best Local Similarity 25.4%; Pred. No. 13; 

Matches 34; Conservative 21; Mismatches 57; Indels 22; Gaps 

Qy 35 PRPTSP VSTDSNMSAWIQK-ARPAKKQKHQPGHLRREAYADDLPPPPVPPP 85 

II ill ' 1:11 : :|::| :| :: I :|| : :| I I 
Db 395 PRSSSPKYKTKDLVTTDESDGEIWKKITRKPKSPRRSPPASVRR-SRTPSVPKSPSARP 453 

Qy 86 AIKSPTVQSK AQLEVR- - - PVMVPKLAS I E ARTDRS SDR - KG GS YKGREALD 133 

llhl-" II 1:111:1: I I II : I 

Db 454 RSKSPSVRAEITDDEGETPPSSVRPKSPSVRPKSPSVRPRSKSPSVRPKSPSARPRSKSP 513 

Qy 134 GRQVTDLRTNPSDP 147 

1 I 

Db 514 SVRSKSPSVRPKSP 527 



RESULT 12
MYPH_CHICK

MYPH_CHICK STANDARD; PRT; 537 AA.
Q05623;

15-JUL-1999 (Rel. 38, Created)
15-JUL-1999 (Rel. 38, Last sequence update)
30-MAY-2000 (Rel. 39, Last annotation update)
MYOSIN-BINDING PROTEIN H (MYBP-H) (H-PROTEIN) (86 KDA PROTEIN).
MYBPH.

Gallus gallus (Chicken).

Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Archosauria; Aves; Neognathae; Galliformes; Phasianidae; Phasianinae;
Gallus.
[1]

SEQUENCE FROM N.A., AND SEQUENCE OF 2-37.
TISSUE=PECTORALIS MUSCLE;
MEDLINE=93155224; PubMed=7679114;
Vaughan K.T., Weber F.E., Einheber S., Fischman D.A.;
"Molecular cloning of chicken myosin-binding protein (MyBP) H (86-kDa
protein) reveals extensive homology with MyBP-C (C-protein) with
conserved immunoglobulin C2 and fibronectin type III motifs.";
J. Biol. Chem. 268:3670-3676(1993).

-!- FUNCTION: BINDS TO MYOSIN; PROBABLY INVOLVED IN INTERACTION WITH
THICK MYOFILAMENTS IN THE A-BAND.
-!- TISSUE SPECIFICITY: SKELETAL MUSCLE. SEEMS TO BE ALSO EXPRESSED IN
THE SLOW TONIC ALD MUSCLE. NOT DETECTED IN GIZZARD OR HEART.
-!- SIMILARITY: CONTAINS 2 IMMUNOGLOBULIN-LIKE C2-TYPE DOMAINS.
-!- SIMILARITY: CONTAINS 2 FIBRONECTIN TYPE III-LIKE DOMAINS.
-!- SIMILARITY: BELONGS TO THE MYBP FAMILY.

This SWISS-PROT entry is copyright. It is produced through a collaboration
between the Swiss Institute of Bioinformatics and the EMBL outstation -
the European Bioinformatics Institute. There are no restrictions on its
use by non-profit institutions as long as its content is in no way
modified and this statement is not removed. Usage by and for commercial
entities requires a license agreement (See http://www.isb-sib.ch/announce/
or send an email to license@isb-sib.ch).

EMBL; L05605; AAA21418.1; -.
INTERPRO; IPR001777; -.
INTERPRO; IPR002965; -.
INTERPRO; IPR003006; -.
PFAM; PF00041; fn3; 2.
PFAM; PF00047; ig; 2.
PRINTS; PR00014; FNTYPEIII.
PRINTS; PR01217; PRICHEXTENSN.

Immunoglobulin domain; Cell adhesion; Muscle protein; Thick filament;
Repeat.
DOMAIN    135   221  FIBRONECTIN TYPE-III.
DOMAIN    253   312  IG-LIKE C2-TYPE DOMAIN.
DOMAIN    331   416  FIBRONECTIN TYPE-III.
DOMAIN    458   518  IG-LIKE C2-TYPE DOMAIN.
CONFLICT    2     2  T -> G (IN AA SEQUENCE).
CONFLICT    9     9  A -> P (IN AA SEQUENCE).
CONFLICT   15    15  A -> K (IN AA SEQUENCE).
SEQUENCE 537 AA; 58678 MW; 06C4CF0EFE1DD233 CRC64;



Query Match 10.7%; Score 81.5; DB 1; Length 537; 

Best Local Similarity 26.3%; Pred. No. 8.6;

Matches 30; Conservative 14; Mismatches 57; Indels 13; Gaps 4; 

Qy 35 PRPTSPVSTDSNMSAWIQKARPAKKQKHQPGHLRREAYADDLPPPPVPPPAIKSPTVQS 94 

Ml: I I II :| I I I : III I :| : 

Db 78 PAPEHPPDAE QPAAPA--AEHAPTPTHEAAPAHEEGPPPAAPAEAPAPEPEP 127 

Qy 95 KAQLEVRPVMVPKLASIEARTDRSSDR-KGGSYKGREALDGRQVTDLRTNPSD 146 

: I I II ::| h I I : I: :||| I : :| 
Db 128 EKPKE • EPPSVPLSLAVEEVTENSVTLTWKAPEHTGKSSLDGYWE ICKDGSTD 180 



RESULT 13
EPS8_HUMAN
ID EPS8_HUMAN STANDARD; PRT; 822 AA.
AC Q12929;

01-NOV-1997 (Rel. 35, Created)
01-NOV-1997 (Rel. 35, Last sequence update)
15-JUL-1998 (Rel. 36, Last annotation update)
EPIDERMAL GROWTH FACTOR RECEPTOR KINASE SUBSTRATE EPS8.
EPS8.

Homo sapiens (Human).

Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.

[1]

SEQUENCE FROM N.A.

MEDLINE=94366758; PubMed=8084614;

Wong W.T., Carlomagno F., Druck T., Barletta C., Croce C.M.,
Huebner K., Kraus M.H., di Fiore P.P.;

"Evolutionary conservation of the EPS8 gene and its mapping to human
chromosome 12q23-q24.";
Oncogene 9:3057-3061(1994).

-!- FUNCTION: UPON BINDING TO EGF RECEPTOR ENHANCES EGF-DEPENDENT
MITOGENIC SIGNALS. CAN BIND MULTIPLE CELLULAR TARGETS.

-!- TISSUE SPECIFICITY: EXPRESSED IN ALL TISSUES ANALYZED, INCLUDING
HEART, BRAIN, PLACENTA, LUNG, LIVER, SKELETAL MUSCLE, KIDNEY AND
PANCREAS. EXPRESSED IN ALL EPITHELIAL AND FIBROBLASTIC LINES
EXAMINED AND IN SOME, BUT NOT ALL, HEMATOPOIETIC CELLS.

-!- PTM: PHOSPHORYLATED BY SEVERAL RECEPTOR TYROSINE KINASES.

-!- SIMILARITY: CONTAINS 1 SH3 DOMAIN.

-!- SIMILARITY: CONTAINS 1 SPLIT PH DOMAIN.

This SWISS-PROT entry is copyright. It is produced through a collaboration
between the Swiss Institute of Bioinformatics and the EMBL outstation -
the European Bioinformatics Institute. There are no restrictions on its
use by non-profit institutions as long as its content is in no way
modified and this statement is not removed. Usage by and for commercial
entities requires a license agreement (See http://www.isb-sib.ch/announce/
or send an email to license@isb-sib.ch).

EMBL; U12535; AAA62280.1; -.
HSSP; Q08509; 1A0J.
MIM; 600206; -.
INTERPRO; IPR001452; -.
PFAM; PF00018; SH3; 1.
PROSITE; PS50002; SH3; 1.
SH3 domain; Phosphorylation.



DOMAIN     69   129  PH (FIRST PART).
DOMAIN    210   213  POLY-PRO.
DOMAIN    322   325  POLY-PRO.
DOMAIN    381   414  PH (SECOND PART).
DOMAIN    421   440  PRO-RICH.
DOMAIN    532   591  SH3.
DOMAIN    615   651  PRO-RICH.
DOMAIN    659   664  POLY-SER.



SQ SEQUENCE 822 AA; 91881 MW; AC5EB1D28B784B3B CRC64; 



Query Match 10.7%; Score 81.5; DB 1; Length 822;

Best Local Similarity 22.6%; Pred. No. 13;

Matches 44; Conservative 24; Mismatches 72; Indels 55; Gaps 

Qy 2 QAVAAAAEYAGLKVARRQMQDAAGRRHFHASQCPRPTSPVSTDSNMSAWIQKARPAKKQ 61 

III : I ': :| : : I II : : :: II: I II 
Db 137 QAVMHSCSYDSV-LALVCKEPTQNKPDLHLFQCDEVKANLISEDIESAISDSK---GGKQ 192 

Qy 62 KHQPGHLRREAYAD-DLPPP PVPPPAIKSPTVQSKA 96 

Mil: II :|| I II : |:|: 
Db 193 KRRPDALRMISNADPSIPPPPRAPAPAPPGTVTQVDVRSRVAAWSAWAADQGDFERPRQY 252 

Qy 97 -QLEVRPVM-- VPKL-ASIEARTDRSSDRKGGSYKGREAL 132 

: I II' : || : || :: | :| : ||: 

Db 253 HEQEETPEMMAARIDRDVQILNHILDDIEFFITKLQKAAEAFSELSKRKK--NKKGKRKG 310 

Qy 133 DGRQVTDLRTNPSDP 147 

I I 111 I 
Db 311 PGEGVLTLRAKPPPP 325

RESULT 14
PSC_DROME

ID PSC_DROME STANDARD; PRT; 1603 AA.

AC P35820; 

DT 01-JUN-1994 (Rel. 29, Created) 

DT 01-JUN-1994 (Rel. 29, Last sequence update)
DT 15-JUL-1999 (Rel. 38, Last annotation update)
DE POSTERIOR SEX COMBS PROTEIN.
GN PSC.
OS Drosophila melanogaster (Fruit fly).
OC Eukaryota; Metazoa; Arthropoda; Tracheata; Hexapoda; Insecta;
OC Pterygota; Neoptera; Endopterygota; Diptera; Brachycera; Muscomorpha;
OC Ephydroidea; Drosophilidae; Drosophila.
RN [1]
RP SEQUENCE FROM N.A.
RX MEDLINE=92018190; PubMed=1833647;
RA Brunk B.P., Martha E.C., Sharp E., Adler P.N.;
RT "Drosophila genes Posterior Sex Combs and Suppressor two of zeste
RT encode proteins with homology to the murine bmi-1 oncogene.";
RL Nature 353:351-353(1991).
CC -!- FUNCTION: THE POLYCOMB GROUP (PC-G) GENES ARE NEEDED TO MAINTAIN
CC EXPRESSION PATTERNS OF THE HOMEOTIC SELECTOR GENES OF THE
CC ANTENNAPEDIA (ANTP-C) AND BITHORAX (BX-C) COMPLEXES, AND HENCE FOR
CC THE MAINTENANCE OF SEGMENTAL DETERMINATION.
CC -!- SUBCELLULAR LOCATION: NUCLEAR (PROBABLE).
CC -!- SIMILARITY: CONTAINS A C3HC4-CLASS ZINC FINGER.
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
DR EMBL; X59275; CAA41965.1; -.
DR PIR; S17983; S17983.
DR FLYBASE; FBgn0005624; PSC.
DR INTERPRO; IPR001841; -.
DR PFAM; PF00097; zf-C3HC4; 1.
DR PROSITE; PS00518; ZINC_FINGER_C3HC4; 1.
KW Zinc-finger; Developmental protein; DNA-binding; Nuclear protein.



FT DOMAIN     47    53  POLY-THR.
FT DOMAIN     83    88  POLY-THR.
FT DOMAIN     91    98  POLY-THR.
FT DOMAIN    145   152  POLY-THR.
FT DOMAIN    184   202  POLY-SER.
FT ZN_FING   265   303  C3HC4-TYPE.
FT DOMAIN    642   651  POLY-SER.
FT DOMAIN   1066  1069  POLY-GLY.
FT DOMAIN   1185  1189  POLY-PRO.
FT DOMAIN   1214  1217  POLY-PRO.
FT DOMAIN   1391  1396  POLY-PRO.
FT DOMAIN   1458  1461  POLY-ALA.
FT DOMAIN   1517  1520  POLY-GLY.
SQ SEQUENCE 1603 AA; 169999 MW; 77024F4



Query Match 10.7%; Score 81.5; DB 1; Length 1603; 

Best Local Similarity 33.0%; Pred. No. 27; 

Matches 30; Conservative 12; Mismatches 24; Indels 25; Gaps 6; 

Qy 33 QCPRPTSPVSTDSNMSAW IQKARPAKKQKHQ — PGHLRREAYADDLPP 79 

I h III :| M :| h |:|:: MM I III 

Db 1135 QHPKHKSPV— NNYIEIVKLPDQPQDQVQAAKEAQKRQSPPAAVPGHL AAKLPP 1186 

Qy 80 PPVPPPAIKSPTVQSKAQLEVRPVMVPKLAS 110 

II I II II : I :||:|: 

Db 1187 PP-PSKAIPSP— QHLVSRMTPPQLPKVAT 1213 



RESULT 15
ERF_HUMAN

ERF_HUMAN STANDARD; PRT; 548 AA.
P50548;

01-OCT-1996 (Rel. 34, Created) 
01-OCT-1996 (Rel. 34, Last sequence update) 
15-JUL-1998 (Rel. 36, Last annotation update) 
ETS-DOMAIN TRANSCRIPTION FACTOR ERF. 
ERF. 

Homo sapiens (Human). 

Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 
Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 
[1] 

SEQUENCE FROM N.A. 
MEDLINE=96030784; PubMed=7588608;

Sgouras D.N., Athanasiou M.A., Beal G.J. Jr., Fisher R.J., Blair D.G.,
Mavrothalassitis G.J.;

"ERF: an ETS domain protein with strong transcriptional repressor
activity, can suppress ets-associated tumorigenesis and is regulated
by phosphorylation during cell cycle and mitogenic stimulation.";
EMBO J. 14:4781-4793(1995).

-!- FUNCTION: POTENT TRANSCRIPTIONAL REPRESSOR THAT BINDS TO THE H1
ELEMENT OF THE ETS2 PROMOTER. MAY REGULATE OTHER GENES INVOLVED
IN CELLULAR PROLIFERATION.

-!- SUBCELLULAR LOCATION: NUCLEAR.

-!- PTM: PHOSPHORYLATED BY MULTIPLE KINASES INCLUDING PROBABLY ERK2.
PHOSPHORYLATION REGULATES THE ACTIVITY OF ERF.
-!- SIMILARITY: BELONGS TO THE ETS FAMILY.



This SWISS-PROT entry is copyright. It is produced through a collaboration
between the Swiss Institute of Bioinformatics and the EMBL outstation -
the European Bioinformatics Institute. There are no restrictions on its
use by non-profit institutions as long as its content is in no way
modified and this statement is not removed. Usage by and for commercial
entities requires a license agreement (See http://www.isb-sib.ch/announce/
or send an email to license@isb-sib.ch).

EMBL; U15655; AAA86686.1; -.
HSSP; Q01543; 1FLI.
INTERPRO; IPR000418; -.
PFAM; PF00178; Ets; 1.
PRINTS; PR00454; ETSDOMAIN.
PROSITE; PS00345; ETS_DOMAIN_1; 1.
PROSITE; PS00346; ETS_DOMAIN_2; 1.
PROSITE; PS50061; ETS_DOMAIN_3; 1.

Transcription regulation; Repressor; DNA-binding; Nuclear protein;
Phosphorylation.
DNA_BIND   27   107  ETS-DOMAIN.
DOMAIN    166   171  POLY-SER.
DOMAIN    290   293  POLY-GLY.
DOMAIN    362   373  POLY-SER.
DOMAIN    418   423  POLY-PRO.
DOMAIN    496   499  POLY-GLY.
MOD_RES   526   526  PHOSPHORYLATION (BY ERK2).
MUTAGEN   526   526  T->A: LOSS OF A PHOSPHORYLATION SITE.



SEQUENCE 548 AA; 58776 MW; C93A155394B1EEDD CRC64; 



Query Match 10.6%; Score 80.5; DB 1; Length 548; 

Best Local Similarity 23.2%; Pred. No. 11;

Matches 39; Conservative 18; Mismatches 54; Indels 57; Gaps 

Qy 6 AAAEYAGLKVARRQMQDAAGRRHFHASQCPRP TSPV STDSNMSAWIQK 54 

11111:11 HI I Ml I: I: I : 

Db 330 AFLHYPGLWPQPQRPD KCPLPPMAPETPPVPSSASSSSSSSSSPFKFKL 379 

Qy 55 ARPAKKQKHQPGHLRREAYADD LPPPPVPPPAIKSPTVQSKAQLEV 100 

II :: : : I II I Ml II |::| 
Db 380 QRPPLGRRQRAAGEKAVAAADKSGGSAGGLAEGAGALAPPPPPP QIKV 427 

Qy 101 RPVMVPKLASIEARTDRSSDRRGGSYKGREALDGRQVTDLRTNPSDPR 148 

I: : :| II I : Ml II: |: 

Db 428 EPISEGESEEVEV-TDISDE DEEDGEVFKT PRAPPAPPK 465 



Mon Jan 22 13:05:00 2001 



Search completed: January 22, 2001, 12:29:56 
Job time: 1297 sec

GenCore version 4.5
Copyright (c) 1993 - 2000 Compugen Ltd. 



OM protein - protein search, using sw model 



January 22, 2001, 12:54:12 ; Search time 559.8 Seconds
(without alignments)
30.983 Million cell updates/sec

Title: US-09-540-245A-20
Perfect score: 761
Sequence: 1 AQAVAAAAEYAGLKVARRQM REALDGRQVTDLRTNPSDPR 148

Scoring table: BLOSUM62
Gapop 10.0 , Gapext 0.5

Searched: 374700 seqs, 117207915 residues

Total number of hits satisfying chosen parameters:

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
Maximum Match 100%
Listing first 45 summaries

Database : SPTREMBL_15:*

1: sp_archea:*
2: sp_bacteria:*
3: sp_fungi:*
4: sp_human:*
5: sp_invertebrate:*
6: sp_mammal:*
7: sp_mhc:*
8: sp_organelle:*
9: sp_phage:*
10: sp_plant:*
11: sp_rodent:*
12: sp_virus:*
13: sp_vertebrate:*
14: sp_unclassified:*



Pred. No. is the number of results predicted by chance to have a
score greater than or equal to the score of the result being printed,
and is derived by analysis of the total score distribution.



SUMMARIES 



Result           Query
   No.   Score  Match %  Length  DB  ID       Description

    1    761    100.0     1612   11  O89026   O89026 mus musculu
    2    719     94.5     1651   11  O55005   O55005 rattus norv
    3    691     90.8     1651    4  Q9Y6N7   Q9y6n7 homo sapien
    4    101     13.3     1344   11  Q9Z2I4   Q9z2i4 mus musculu
    5     99     13.0      739    5  O45408   O45408 caenorhabdi
    6     97.5   12.8      413   12  P89474   P89474 herpes simp
    7     97.5   12.8     1262    5  Q9VWC2   Q9vwc2 drosophila
    8     91     12.0      825    5  Q23484   Q23484 caenorhabdi
    9     91     12.0     1017   10  Q9SS68   Q9ss68 arabidopsis
   10     91     12.0     3080    5  Q9VRY3   Q9vry3 drosophila
   11     90.5   11.9      410   13  Q9IA21   Q9ia21 heterodontu
   12     90     11.8     1212   10  Q9LGT8   Q9lgt8 oryza sativ
   13     89     11.7      569    4  Q9UGJ0   Q9ugj0 homo sapien
   14     89     11.7      975    5  Q9VLA7   Q9vla7 drosophila
   15     89     11.7     2936    5  Q9NKP7   Q9nkp7 leishmania
   16     88     11.6      811    5  Q9V8T0   Q9v8t0 drosophila
   17     88     11.6      831    5  Q9NFL3   Q9nfl3 drosophila
   18     88     11.6      980   11  P97837   P97837 rattus norv
   19     87.5   11.5      621   10  Q9LW19   Q9lw19 arabidopsis
   20     87     11.4      764    5  Q24708   Q24708 drosophila
   21     86.5   11.4      296   10  Q40225   Q40225 lilium long
   22     86.5   11.4      891    5  Q93763   Q93763 caenorhabdi
   23     86     11.3     1650    5  Q9NES4   Q9nes4 caenorhabdi
   24     85.5   11.2      604   10  Q9LW37   Q9lw37 arabidopsis
   25     85.5   11.2      662    5  Q21536   Q21536 caenorhabdi
   26     85     11.2      300   10  Q9LY35   Q9ly35 arabidopsis
   27     85     11.2      698    5  O44447   O44447 caenorhabdi
   28     85     11.2      716    5  Q9U2A6   Q9u2a6 caenorhabdi
   29     85     11.2     1541    5  O15837   O15837 leishmania
   30     84.5   11.1     2089    4  Q14676   Q14676 homo sapien
   31     84.5   11.1     2099    4  Q9Y2W9   Q9y2w9 homo sapien
   32     84.5   11.1     2099    4  Q9UNU8   Q9unu8 homo sapien
   33     84.5   11.1     3201    5  Q9W0U2   Q9w0u2 drosophila
   34     84.5   11.1     4880   11  Q9JLT1   Q9jlt1 rattus norv
   35     84.5   11.1     5085   11  Q9JKS6   Q9jks6 rattus norv
   36     84     11.0      193    5  O76215   O76215 neospora ca
   37     84     11.0      412   ii  UJ.U413  O10415 helicoverpa
   38     84     11.0      754    4  Q9NUP6   Q9nup6 homo sapien
   39     83.5   11.0      282   10  Q40509   Q40509 nicotiana t
   40     83.5   11.0      365    5  Q9XUP5   Q9xup5 caenorhabdi
   41     83.5   11.0     1180    5  Q9VRM2   Q9vrm2 drosophila
   42     83.5   11.0     3232    5  Q9VFF8   Q9vff8 drosophila
   43     83     10.9      495    4  Q9NV83   Q9nv83 homo sapien
   44     83     10.9      520    4  Q9P2K3   Q9p2k3 homo sapien
   45     83     10.9      787    3  O94096   O94096 pneumocysti
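
The Query Match percentages listed in the summaries appear to be each hit's score expressed as a percentage of the perfect score (761 for this query), rounded to one decimal place. This is an inference from the listed values, not documented GenCore behaviour, and the function name below is hypothetical. A minimal sketch that recomputes a few of the listed rows:

```python
def query_match_percent(score: float, perfect_score: float) -> float:
    """Return the score as a percentage of the perfect score, to one decimal.

    Assumption: this mirrors how the report's "Query Match" column is derived.
    """
    return round(100.0 * score / perfect_score, 1)


PERFECT_SCORE = 761  # "Perfect score" reported for query US-09-540-245A-20

# (score, Query Match % as listed in the summaries) for a few rows
listed_rows = [(761, 100.0), (719, 94.5), (691, 90.8), (101, 13.3), (84.5, 11.1)]

for score, listed_pct in listed_rows:
    # Print the recomputed value next to the listed one for comparison.
    print(score, query_match_percent(score, PERFECT_SCORE), listed_pct)
```

The same relation can be used as a sanity check on OCR-damaged percentages elsewhere in the report, since the score and the perfect score are usually legible even where the percent sign is not.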



RESULT 1
O89026

ID O89026 PRELIMINARY; PRT; 1612 AA.
AC O89026;
DT 01-NOV-1998 (TrEMBLrel. 08, Created)
DT 01-NOV-1998 (TrEMBLrel. 08, Last sequence update)
DT 01-OCT-2000 (TrEMBLrel. 15, Last annotation update)
DE DUTT1 PROTEIN.
GN ROBO1 OR DUTT1.
OS Mus musculus (Mouse).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
OX NCBI_TaxID=10090;
RN [1]
RP SEQUENCE FROM N.A.
RC TISSUE=BRAIN;
RA Wu M.C., Lowe N., Fordham R., Rabbitts P.;
RT "The mouse homologue of human DUTT1/H-robo1 gene: protein sequence and
RT chromosomal location.";
RL Submitted (JUL-1998) to the EMBL/GenBank/DDBJ databases.
DR EMBL; Y17793; CAA76850.1; -.
DR HSSP; P56276; 1TLK.
DR MGD; MGI:1274781; Robo1.
DR INTERPRO; IPR001777; -.
DR INTERPRO; IPR003006; -.
DR PFAM; PF00041; fn3; 3.
DR PFAM; PF00047; ig; 5.
SQ SEQUENCE 1612 AA; 176406 MW; 5F2988C544796B4B CRC64;



Query Match 100.0%; Score 761; DB 11; Length 1612;
Best Local Similarity 100.0%; Pred. No. 2.5e-65;
Matches 148; Conservative 0; Mismatches 0; Indels 0; Gaps 0;



Qy 1 AQAVAAAAEYAGLKVARRQMQDAAGRRHFHASQCPRPTSPVSTDSNMSAAVIQKARPAKK 60
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1365 AQAVAAAAEYAGLKVARRQMQDAAGRRHFHASQCPRPTSPVSTDSNMSAAVIQKARPAKK 1424

Qy 61 QKHQPGHLRREAYADDLPPPPVPPPAIKSPTVQSKAQLEVRPVMVPKLASIEARTDRSSD 120
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1425 QKHQPGHLRREAYADDLPPPPVPPPAIKSPTVQSKAQLEVRPVMVPKLASIEARTDRSSD 1484

Qy 121 RKGGSYKGREALDGRQVTDLRTNPSDPR 148
||||||||||||||||||||||||||||
Db 1485 RKGGSYKGREALDGRQVTDLRTNPSDPR 1512



RESULT 2
O55005

ID O55005 PRELIMINARY; PRT; 1651 AA.
AC O55005;
DT 01-JUN-1998 (TrEMBLrel. 06, Created)
DT 01-JUN-1998 (TrEMBLrel. 06, Last sequence update)
DT 01-OCT-2000 (TrEMBLrel. 15, Last annotation update)
DE TRANSMEMBRANE RECEPTOR ROBO1.
OS Rattus norvegicus (Rat).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Rattus.
OX NCBI_TaxID=10116;
RN [1]
RP SEQUENCE FROM N.A.
RC TISSUE=SPINAL CORD;
RX MEDLINE=98117249; PubMed=9458045;
RA Kidd T., Brose K., Mitchell K.J., Fetter R.D., Tessier-Lavigne M.,
RA Goodman C.S., Tear G.;
RT "Roundabout controls axon crossing of the CNS midline and defines a
RT novel subfamily of evolutionarily conserved guidance receptors.";
RL Cell 92:205-215(1998).
DR EMBL; AF041082; AAC39960.1; -.
DR HSSP; P56276; 1TLK.
DR INTERPRO; IPR001777; -.
DR INTERPRO; IPR003006; -.
DR PFAM; PF00041; fn3; 3.
DR PFAM; PF00047; ig; 5.
KW Transmembrane.
SQ SEQUENCE 1651 AA; 180746 MW; 1A2452DD46E186B7 CRC64;



Query Match 94.5%; Score 719; DB 11; Length 1651;

Best Local Similarity 93.2%; Pred. No. 3e-61;
Matches 138; Conservative 3; Mismatches 7; Indels 0; Gaps 0;

Qy 1 AQAVAAAAEYAGLKVARRQMQDAAGRRHFHASQCPRPTSPVSTDSNMSAVVIQKARPAKK 60

Illlllllllllllllllllllllllllllllllllllllll HUM! II 

Db 1404 AQAVAAAAEYAGLKVARRQMQDAAGRRHFHASQCPRPTSPVSTDSNMSAAVIQKARPTRK 1463 

Qy 61 QRHQPGHLRREAYADDLPPPPVPPPAIRSPTVQSKAQLEVRPVMVPRLASIEARTDRSSD 120 

illinium iiiiiiiiiiiiiiimiiiim mi ilium inn 

Db 1464 QKHQPGHLRREAYTDDLPPPPVPPPAIRSPSVQSKAQLEARPIMGPRIASIEARADRSSD 1523 
Qy 121 RRGGSYKGREALDGRQVTDLRTNPSDPR 148

• miMimiiiiiiiiiimi in 
Db 1524 RKGGSYKGREALDGRQVTDLRTSPGDPR 1551



RESULT 3 
Q9Y6N7 

ID Q9Y6N7 PRELIMINARY; PRT; 1651 AA. 

AC Q9Y6N7; 

DT 01-NOV-1999 (TrEMBLrel. 12, Created)

DT 01-NOV-1999 (TrEMBLrel. 12, Last sequence update) 

DT 01-OCT-2000 (TrEMBLrel. 15, Last annotation update) 

DE ROUNDABOUT 1. 

GN ROBO1.

OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID-9606;

RN [1] 

RP SEQUENCE FROM N.A. 

RX MEDLINE-98117249; PubMed-9458045;

RA Kidd T., Brose K., Mitchell K.J., Fetter R.D., Tessier-Lavigne M.,

RA Goodman C.S., Tear G.; 

RT "Roundabout controls axon crossing of the CNS midline and defines a 

RT novel subfamily of evolutionarily conserved guidance receptors."; 



RL Cell 92:205-215(1998).
DR EMBL; AF040990; AAC39575.1; -.
DR HSSP; P56276; 1TLK.
DR INTERPRO; IPR001777; -.
DR INTERPRO; IPR003006; -.
DR PFAM; PF00041; fn3; 3.
DR PFAM; PF00047; ig; 5.
SQ SEQUENCE 1651 AA; 180928 MW; 9D98CD7CAB73074D CRC64;



Query Match 90.8%; Score 691; DB 4; Length 1651;

Best Local Similarity 88.5%; Pred. No. 1.5e-58; 

Matches 131; Conservative 5; Mismatches 12; Indels 0; Gaps 

Qy 1 AQAVAAAAEYAGLKVARRQMQDAAGRRHFHASQCPRPTSPVSTDSNMSAVVIQKARPARK 60

imillllMlllllllllllllllllllllllllllllllllllll 1:11 Mill 
Db 1404 AQAVAAAAEYAGLRVARRQMQDAAGRRHFHASQCPRPTSPVSTDSNMSAAVMQKTRPAKR 1463 

Qy 61 QRHQPGHLRREAYADDLPPPPVPPPAIKSPTVQSRAQLEVRPVMVPRLASIEARTDRSSD 120 

IIIMIMI! I lllllllllllllllll III 1 1 1 1 1 1 1 : 1 M I I : : 1 1 f I f M I 
Db 1464 LRHQPGHLRRETYTDDLPPPPVPPPAIKSPTAQSRTQLEVRPVVVPKLPSMDARTDRSSD 1523

Qy 121 RRGGSYKGREALDGRQVTDLRTNPSDPR 148 

ill 1 1 1 1 1 m 1 1 1 1 1 1 1:111 III 
Db 1524 RRGSSYRGREVLDGRQVVDMRTNPGDPR 1551



RESULT 4
Q9Z2I4

ID Q9Z2I4 PRELIMINARY; PRT; 1344 AA.
AC Q9Z2I4;
DT 01-MAY-1999 (TrEMBLrel. 10, Created)
DT 01-MAY-1999 (TrEMBLrel. 10, Last sequence update)
DT 01-OCT-2000 (TrEMBLrel. 15, Last annotation update)
DE RIG-1 PROTEIN.
GN RBIG1.
OS Mus musculus (Mouse).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
OX NCBI_TaxID-10090;
RN [1]
RP SEQUENCE FROM N.A.
RA Yuan S.-S.F., Cox L.A., Dasika G.R., Lee E.Y.-H.P.;
RL Submitted (APR-1998) to the EMBL/GenBank/DDBJ databases.
DR EMBL; AF060570; AAD11628.1; -.
DR HSSP; P56276; 1TLK.
DR MGD; MGI:1343102; Rbig1.
DR INTERPRO; IPR001777; -.
DR INTERPRO; IPR003006; -.
DR PFAM; PF00041; fn3; 3.
DR PFAM; PF00047; ig; 5.
SQ SEQUENCE 1344 AA; 143439 MW; 8B0060341C49CFEA CRC64;



Query Match 13.3%; Score 101; DB 11; Length 1344;

Best Local Similarity 27.7%; Pred. No. 0.14; 

Matches 38; Conservative 18; Mismatches 59; Indels 22; Gaps 

Qy 29 FHASQCPRPTSPVSTDSNMSAWIQ KARPARRQRHQPGHLRREAYADDLPPP 80 

mi I |:: I I : :|| I | | II 'III 

Db 1191 YHASPSPVPSTASSAPGRTRQVTGEMTPPLHGHRARIRRKPRALP-YRREHSPGDLPPP 1248 

Qy 81 PVPPPAIRSPTVQSKAQLEVRPVMVPRL ASIEARTDRSSDRRGGSYKGREA 131 

1:111 :: ' I I : I: I :| II I |:|: 

Db 1249 PLPPPELRDKLALGSA--GSRQRVFPRARAQWGEESGAGSASRGPTSSQR-GPHPDGKES 1305 

Qy 132 LDGRQVTDLRTNPSDPR 148 

: : -:|: |: 
Db 1306 QGRGRGLEACRSPNSPQ 1322 

RESULT 5
O45408

ID O45408 PRELIMINARY; PRT; 739 AA.

AC O45408;

DT 01-JUN-1998 (TrEMBLrel. 06, Created) 

DT 01-JUN-1998 (TrEMBLrel. 06, Last sequence update) 

DT 01-JUN-2000 (TrEMBLrel. 14, Last annotation update) 

DE F26H11.4 PROTEIN. 

GN F26H11.4. 

OS Caenorhabditis elegans. 

OC Eukaryota; Metazoa; Nematoda; Chromadorea; Rhabditida; Rhabditoidea;

OC Rhabditidae; Peloderinae; Caenorhabditis. 

OX NCBI_TaxID-6239; 

RN [1]
RP SEQUENCE FROM N.A.
RA Barlow K.;
RL Submitted (NOV-1996) to the EMBL/GenBank/DDBJ databases.
RN [2]
RP SEQUENCE FROM N.A.
RX MEDLINE-94150718; PubMed-7906398;
RA Wilson R., Ainscough R., Anderson K., Baynes C., Berks M.,
RA Bonfield J., Burton J., Connell M., Copsey T., Cooper J., Coulson A.,
RA Craxton M., Dear S., Du Z., Durbin R., Favello A., Fulton L.,
RA Gardner A., Green P., Hawkins T., Hillier L., Jier M., Johnston L.,
RA Jones M., Kershaw J., Kirsten J., Laister N., Latreille P.,
RA Lightning J., Lloyd C., Mcmurray A., Mortimore B., O'Callaghan M.,
RA Parsons J., Percy C., Rifken L., Roopra A., Saunders D., Shownkeen R.,
RA Smaldon N., Smith A., Sonnhammer E., Staden R., Sulston J.,
RA Thierry-Mieg J., Thomas K., Vaudin M., Vaughan K., Waterston R.,
RA Watson A., Weinstock L., Wilkinson-Sproat J., Wohldman P.;

RT "2.2 Mb of contiguous nucleotide sequence from chromosome III of C. 

RT elegans."; 

RL Nature 368:32-38(1994). 

DR EMBL; Z81515; CAB04196.1; -. 

SQ SEQUENCE 739 AA; 82741 MW; 6DD62F7F212E1839 CRC64;



Query Match 13.0%; Score 99; DB 5; Length 739; 

Best Local Similarity 27.9%; Pred. No. 0.12;

Matches 48; Conservative 16; Mismatches 68; Indels 40; Gaps 7; 

Qy 1 AQAVAAAAEYAGLPARRQMQDAAGRRHFHASQ— CPRPTSPVSTDSNMSAWIQRARP 57 

M III ::l I : I II I l|: I I : I ::| 
Db 84 ATTVDTVQEGIGATVDVTLVEDVAEKAHEPASTPNVADRLQKPVARDDNSAPV-'EEANR 141 

Qy 58 AKRQRHQP GHLRREA YAfoLPPPPVPPPAIKSPTVQS 94 

M: I I II I I II II II I : 

Db 142 RRREGKDPKNSEEARRNSKRGKNGTARRSALRGVTIPPVAASLPCPPPPP PLSER 196

Qy 95 RAQLEVRPVMVPRLASIEARTDRSSDRRGGSYRGREALDGRQVTDLRTNPSD 146

I : :| : I I : : 1 1 1 1 I : I : I I I I II 
Db 197 KKNVGPKPCVGP AQKNLSSQRKHSSKRNRKGL- LRNVAKLWGRRSD 241 



RESULT 6 
P89474 

ID P89474 PRELIMINARY; PRT; 413 AA. 

AC P89474; 

DT 01-MAY-1997 (TrEMBLrel. 03, Created) 

DT 01-MAY-1997 (TrEMBLrel. 03, Last sequence update) 

DT 01-NOV-1998 (TrEMBLrel. 08, Last annotation update)
DE HERPES SIMPLEX VIRUS TYPE 2 (STRAIN HG52), COMPLETE GENOME.
GN US1.
OS Herpes simplex virus (type 2).
OC Viruses; dsDNA viruses, no RNA stage; Herpesviridae;
OC Alphaherpesvirinae; Simplexvirus.
OX NCBI_TaxID-10310;

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN-HG52; 

RX MEDLINE-87111457; PubMed-3027242; 

RA McGeoch D.J., Moss H.W., McNab D., Frame M.C.;
RT "DNA sequence and genetic content of the HindIII l region in the short
RT unique component of the herpes simplex virus type 2 genome:
RT identification of the gene encoding glycoprotein G, and evolutionary
RT comparisons.";
RL J. Gen. Virol. 68:19-38(1987).
RN [2]
RP SEQUENCE FROM N.A.
RC STRAIN-HG52;

RX MEDLINE-90278430; PubMed-2161906; 

RA Everett R., Fenwick M.;

RT "Comparative DNA sequence analysis of the host shutoff genes of 

RT different strains of herpes simplex virus: type 2 strain HG52 encodes 

RT a truncated UL41 product."; 

RL J. Gen. Virol. 71:1387-1390(1990). 

RN [3] 

RP SEQUENCE FROM N.A.
RC STRAIN-HG52;
RX MEDLINE-92113549; PubMed-1662697;
RA McGeoch D.J., Cunningham C., McIntyre G., Dolan A.;
RT "Comparative sequence analysis of the long repeat regions and
RT adjoining parts of the long unique regions in the genomes of herpes
RT simplex viruses types 1 and 2.";

RL J. Gen. Virol. 72:3057-3075(1991). 

RN [4] 

RP SEQUENCE FROM N.A. 

RC STRAIN-HG52;
RX MEDLINE-92356101; PubMed-1322965;
RA Barnett B.C., Dolan A., Telford E.A.R., Davison A.J., McGeoch D.J.;
RT "A novel herpes simplex virus gene (UL49A) encodes a putative membrane
RT protein with counterparts in other herpesviruses.";
RL J. Gen. Virol. 73:2167-2171(1992).

RN [5] 

RP SEQUENCE FROM N.A. 

RC STRAIN-HG52; 

RA Dolan A.; 

RL Submitted (FEB-1997) to the EMBL/GenBank/DDBJ databases.

DR EMBL; Z86099; CAB06708.1; -. 

SQ SEQUENCE 413 AA; 44739 MW; AF828B79CEDDF4A5 CRC64; 



Query Match 12.8%; Score 97.5; DB 12; Length 413;

Best Local Similarity 27.6%; Pred. No. 0.091;

Matches 27; Conservative 12; Mismatches 32; Indels 27; Gaps 3; 

Qy 76 DLPP PPVPPPAIKS^--PTVQSRAQLEVRPVMVPKLASIEARTDRSS 119 

1:11 II III : I : I :: I :| : :| |: I : 

Db 3 DIPPDPPALNITPANHAPPSPPPGSRRRRRPVLPSSSESEGRPDTESESSSTESSEDEAG 62 

Qy 120 DRRGGSYRGREALDGRQVTDLR TNPSD 146 

I :ll : I II II I III 

Db 63 DLRGGRRRSPRELGGRYFLDLSAESTTGTESEGTGPSD 100 



RESULT 7 
Q9VWC2 

ID Q9VWC2 PRELIMINARY; PRT; 1262 AA. 

AC Q9VWC2; 

DT 01-MAY-2000 (TrEMBLrel. 13, Created) 

DT 01-MAY-2000 (TrEMBLrel. 13, Last sequence update) 

DT 01-JUN-2000 (TrEMBLrel. 14, Last annotation update)

DE CG15619 PROTEIN. 

GN CG15619. 

OS Drosophila melanogaster (Fruit fly).
OC Eukaryota; Metazoa; Arthropoda; Tracheata; Hexapoda; Insecta;
OC Pterygota; Neoptera; Endopterygota; Diptera; Brachycera; Muscomorpha;
OC Ephydroidea; Drosophilidae; Drosophila.
OX NCBI_TaxID-7227;

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN-BERKELEY; 

RX MEDLINE-20196006; PubMed-10731132;
RA Adams M.D., Celniker S.E., Holt R.A., Evans C.A., Gocayne J.D.,
RA Amanatides P.G., Scherer S.E., Li P.W., Hoskins R.A., Galle R.F.,
RA George R.A., Lewis S.E., Richards S., Ashburner M., Henderson S.N.,
RA Sutton G.G., Wortman J.R., Yandell M.D., Zhang Q., Chen L.X.,
RA Brandon R.C., Rogers Y.-H.C., Blazej R.G., Champe M., Pfeiffer B.D.,
RA Wan K.H., Doyle C., Baxter E.G., Helt G., Nelson C.R., Miklos G.L.G.,
RA Abril J.F., Agbayani A., An H.-J., Andrews-Pfannkoch C., Baldwin D.,
RA Ballew R.M., Basu A., Baxendale J., Bayraktaroglu L., Beasley E.M.,
RA Beeson K.Y., Benos P.V., Berman B.P., Bhandari D., Bolshakov S.,
RA Borkova D., Botchan M.R., Bouck J., Brokstein P., Brottier P.,
RA Burtis K.C., Busam D.A., Butler H., Cadieu E., Center A., Chandra I.,
RA Cherry J.M., Cawley S., Dahlke C., Davenport L.B., Davies P.,
RA de Pablos B., Delcher A., Deng Z., Mays A.D., Dew I., Dietz S.M.,
RA Dodson K., Doup L.E., Downes M., Dugan-Rocha S., Dunkov B.C., Dunn P.,
RA Durbin K.J., Evangelista C.C., Ferraz C., Ferriera S., Fleischmann W.,
RA Fosler C., Gabrielian A.E., Garg N.S., Gelbart W.M., Glasser K.,
RA Glodek A., Gong F., Gorrell J.H., Gu Z., Guan P., Harris M.,
RA Harris N.L., Harvey D., Heiman T.J., Hernandez J.R., Houck J.,
RA Hostin D., Houston K.A., Howland T.J., Wei M.-H., Ibegwam C.,
RA Jalali M., Kalush F., Karpen G.H., Ke Z., Kennison J.A., Ketchum K.A.,
RA Kimmel B.E., Kodira C.D., Kraft C., Kravitz S., Kulp D., Lai Z.,
RA Lasko P., Lei Y., Levitsky A.A., Li J., Li Z., Liang Y., Lin X.,
RA Liu X., Mattel B., Mcintosh T.C., McLeod M.P., McPherson D.,
RA Merkulov G., Milshina N.V., Mobarry C., Morris J., Moshrefi A.,
RA Mount S.M., Moy M., Murphy B., Murphy L., Muzny D.M., Nelson D.L.,
RA Nelson D.R., Nelson K.A., Nixon K., Nusskern D.R., Pacleb J.M.,
RA Palazzolo M., Pittman G.S., Pan S., Pollard J., Puri V., Reese M.G.,
RA Reinert K., Remington K., Saunders R.D.C., Scheeler F., Shen H.,
RA Shue B.C., Siden-Kiamos I., Simpson M., Skupski M.P., Smith T.,
RA Spier E., Spradling A.C., Stapleton M., Strong R., Sun E.,
RA Svirskas R., Tector C., Turner R., Venter E., Wang A.H., Wang X.,
RA Wang Z.-Y., Wassarman D.A., Weinstock G.M., Weissenbach J.,
RA Williams S.M., Woodage T., Worley K.C., Wu D., Yang S., Yao Q.A.,
RA Ye J., Yeh R.-F., Zaveri J.S., Zhan M., Zhang G., Zhao Q., Zheng L.,
RA Zheng X.H., Zhong F.N., Zhong W., Zhou X., Zhu S., Zhu X., Smith H.O.,
RA Gibbs R.A., Myers E.W., Rubin G.M., Venter J.C.;
RT "The genome sequence of Drosophila melanogaster.";
RL Science 287:2185-2195(2000).
DR EMBL; AE003513; AAF49024.1; -.

DR FLYBASE; FBgn0031075; CG15619. 

DR INTERPRO; IPR001025; -. 



DR PFAM; PF01426; BAH; 1. 
SQ SEQUENCE 1262 AA; 136383 MW; 601886A9F26068FE CRC64; 



Query Match 12.8%; Score 97.5; DB 5; Length 1262; 

Best Local Similarity 28.4%; Pred. No. 0.28; 

Matches 42; Conservative 14; Mismatches 63; Indels 29; Gaps 

Qy 1 AQAVAAAAEYAGLKVARRQMQDAAGRRHFHAS QCPRP TSPVSTDSNMSAV 50

I I llll : h :| I : I: I I II : |: 

Db 106 AAAAAAAAAHQGVPYSRSM— HAAQMHYSTGYPPNYYAPYPGEVMCYSPPTYPPYFSSK 162 

Qy 51 VIQKARPAKKQKH QPGHLRREAYADDLPPPP VPPPAIKSPTVQSKAQ 97

>\ I : II I II ill I III ' III : I :l II 
Db 163 VYQPSHPAHPAAHPQGPPPPGAYRRYPYYQHAGPPPPHELYEQQPPPPPVSSGSVSGPAQ 222

Qy 98 LEVRPVMVPKLASIEARTDRSSDRKGGS 125 

: I I II: I :: II 
Db 223 -QQPPTSSP- -ASVGAAGAGTAGAPSGS 247 



RESULT 8
Q23484

ID Q23484 PRELIMINARY; PRT; 825 AA. 

AC Q23484; 

DT 01-NOV-1996 (TrEMBLrel. 01, Created) 

DT 01-NOV-1996 (TrEMBLrel. 01, Last sequence update)
DT 01-JUN-2000 (TrEMBLrel. 14, Last annotation update)
DE COSMID ZK418.
GN ZK418.6.
OS Caenorhabditis elegans.

OC Eukaryota; Metazoa; Nematoda; Chromadorea; Rhabditida; Rhabditoidea; 

OC Rhabditidae; Peloderinae; Caenorhabditis. 

OX NCBI_TaxID-6239; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN-BRISTOL N2; 



RX MEDLINE-94150718; PubMed-7906398;
RA Wilson R., Ainscough R., Anderson K., Baynes C., Berks M.,
RA Bonfield J., Burton J., Connell M., Copsey T., Cooper J., Coulson A.,
RA Craxton M., Dear S., Du Z., Durbin R., Favello A., Fulton L.,
RA Gardner A., Green P., Hawkins T., Hillier L., Jier M., Johnston L.,
RA Jones M., Kershaw J., Kirsten J., Laister N., Latreille P.,
RA Lightning J., Lloyd C., Mcmurray A., Mortimore B., O'Callaghan M.,
RA Parsons J., Percy C., Rifken L., Roopra A., Saunders D., Shownkeen R.,
RA Smaldon N., Smith A., Sonnhammer E., Staden R., Sulston J.,
RA Thierry-Mieg J., Thomas K., Vaudin M., Vaughan K., Waterston R.,
RA Watson A., Weinstock L., Wilkinson-Sproat J., Wohldman P.;
RT "2.2 Mb of contiguous nucleotide sequence from chromosome III of C.
RT elegans.";
RL Nature 368:32-38(1994).
RN [2]
RP SEQUENCE FROM N.A.
RC STRAIN-BRISTOL N2;
RA Fulton L.;
RL Submitted (APR-1994) to the EMBL/GenBank/DDBJ databases.
RN [3]
RP SEQUENCE FROM N.A.
RC STRAIN-BRISTOL N2;
RA Waterston R.;
RL Submitted (APR-1994) to the EMBL/GenBank/DDBJ databases.
DR EMBL; U00047; AAA50690.1; -.
DR INTERPRO; IPR000276; -.
DR PFAM; PF00001; 7tm_1; 1.
SQ SEQUENCE 825 AA; 93216 MW; E80AFBA281DB9AA8 CRC64;



Query Match 12.0%; Score 91; DB 5; Length 825; 

Best Local Similarity 24.8%; Pred. No. 0.78;
Matches [...]

Qy 2 [...]--QCPRPTSP--VSTDSNMSA 49
Db 534 [...]
Qy 50 [...]
Db 589 [...]--RRRRHR-GATRTRSISSPPLPPPAPPNSFMSPDNQTLWCKLYEQTCP--P 635
Qy 107 [...]
Db 636 [...]--NLPGQIGMSELEPEDLEVTPDPP 669




RESULT 9
Q9SS68

ID Q9SS68 PRELIMINARY; PRT; 1017 AA.
AC Q9SS68;
DT 01-MAY-2000 (TrEMBLrel. 13, Created)
DT 01-MAY-2000 (TrEMBLrel. 13, Last sequence update)
DT 01-JUN-2000 (TrEMBLrel. 14, Last annotation update)
DE PUTATIVE PHOSPHORIBOSYLANTHRANILATE TRANSFERASE.
GN T12J13.4.
OS Arabidopsis thaliana (Mouse-ear cress).
OC Eukaryota; Viridiplantae; Embryophyta; Tracheophyta; Spermatophyta;
OC Magnoliophyta; eudicotyledons; core eudicots; Rosidae; eurosids II;
OC Brassicales; Brassicaceae; Arabidopsis.
OX NCBI_TaxID-3702;
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN-CV. COLUMBIA;
RA Lin X., Kaul S., Town C.D., Benito M., Creasy T.H., Haas B.,
RA Ronning C.M., Koo H., Fujii C.Y., Utterback T.R., Barnstead M.E.,
RA Bowman C.L., White O., Nierman W.C., Fraser C.M.;
RT "Arabidopsis thaliana chromosome III BAC T12J13 genomic sequence.";
RL Submitted (OCT-1999) to the EMBL/GenBank/DDBJ databases.
DR EMBL; AC009327; AAF03465.1; -.
DR INTERPRO; IPR000008; -.
DR PFAM; PF00168; C2; 3.
DR PROSITE; PS50004; C2_DOMAIN_2; 3.
KW Transferase.
SQ SEQUENCE 1017 AA; 114265 MW; ECDB92D86E279C94 CRC64;



Query Match 12.0%; Score 91; DB 10; Length 1017; 

Best Local Similarity 20.3%; Pred. No. 0.97; 

Matches 32; Conservative 26; Mismatches 46; Indels 54; Gaps 7;

Qy 19 QMQDAAGRRHFHASQCPRPTSPVSTDSNMSAWIQRARPA 58 

I:: I : :: : M :| :|: I : :| 
Db 121 QIRGEIGLKAYYVDENP-PAAPAATEPKPEAAAATEERPPEIARAEDGRRETEAAKTEEK 179 

Qy 59 — KRQKHQPGHLRREAYAD — DLPP PPVPPPAIKSPTVQSKA- ■ 96 

II:: :! : I! I I II II II :|:| : II 

Db 180 KEGDKKEEEKP— KEEAKPDEKRPDAPPDTKARRPDTAVAPPPPPAEVKNPPIPQKAET 236 

Qy 97 --QLEVRPVMVPRL--ASIEARTDRSSDRKGGSY 126
:| "I I : : :l : :|| I 
Db 237 VKQNELGIKPENVNRQDLIGSDLELPSLTRDQNRGGGY 274



RESULT 10 
Q9VRY3 

ID Q9VRY3 PRELIMINARY; PRT; 3080 AA. 

AC Q9VRY3; 

DT 01-MAY-2000 (TrEMBLrel. 13, Created) 

DT 01-MAY-2000 (TrEMBLrel. 13, Last sequence update) 

DT 01-JUN-2000 (TrEMBLrel. 14, Last annotation update)

DE CG10115 PROTEIN. 

GN CG10115. 

OS Drosophila melanogaster (Fruit fly) . 

OC Eukaryota; Metazoa; Arthropoda; Tracheata; Hexapoda; Insecta; 

OC Pterygota; Neoptera; Endopterygota; Diptera; Brachycera; Muscomorpha; 

OC Ephydroidea; Drosophilidae; Drosophila. 

OX NCBI_TaxID-7227;

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=BERKELEY; 

RX MEDLINE=20196006; PubMed-10731132; 

RA Adams M.D., Celniker S.E., Holt R.A., Evans C.A., Gocayne J.D.,
RA Amanatides P.G., Scherer S.E., Li P.W., Hoskins R.A., Galle R.F.,
RA George R.A., Lewis S.E., Richards S., Ashburner M., Henderson S.N.,
RA Sutton G.G., Wortman J.R., Yandell M.D., Zhang Q., Chen L.X.,
RA Brandon R.C., Rogers Y.-H.C., Blazej R.G., Champe M., Pfeiffer B.D.,
RA Wan K.H., Doyle C., Baxter E.G., Helt G., Nelson C.R., Miklos G.L.G.,
RA Abril J.F., Agbayani A., An H.-J., Andrews-Pfannkoch C., Baldwin D.,
RA Ballew R.M., Basu A., Baxendale J., Bayraktaroglu L., Beasley E.M.,
RA Beeson K.Y., Benos P.V., Berman B.P., Bhandari D., Bolshakov S.,
RA Borkova D., Botchan M.R., Bouck J., Brokstein P., Brottier P.,
RA Burtis K.C., Busam D.A., Butler H., Cadieu E., Center A., Chandra I.,
RA Cherry J.M., Cawley S., Dahlke C., Davenport L.B., Davies P.,
RA de Pablos B., Delcher A., Deng Z., Mays A.D., Dew I., Dietz S.M.,
RA Dodson K., Doup L.E., Downes M., Dugan-Rocha S., Dunkov B.C., Dunn P.,
RA Durbin K.J., Evangelista C.C., Ferraz C., Ferriera S., Fleischmann W.,
RA Fosler C., Gabrielian A.E., Garg N.S., Gelbart W.M., Glasser K.,
RA Glodek A., Gong F., Gorrell J.H., Gu Z., Guan P., Harris M.,
RA Harris N.L., Harvey D., Heiman T.J., Hernandez J.R., Houck J.,
RA Hostin D., Houston K.A., Howland T.J., Wei M.-H., Ibegwam C.,
RA Jalali M., Kalush F., Karpen G.H., Ke Z., Kennison J.A., Ketchum K.A.,
RA Kimmel B.E., Kodira C.D., Kraft C., Kravitz S., Kulp D., Lai Z.,
RA Lasko P., Lei Y., Levitsky A.A., Li J., Li Z., Liang Y., Lin X.,
RA Liu X., Mattel B., Mcintosh T.C., McLeod M.P., McPherson D.,
RA Merkulov G., Milshina N.V., Mobarry C., Morris J., Moshrefi A.,
RA Mount S.M., Moy M., Murphy B., Murphy L., Muzny D.M., Nelson D.L.,
RA Nelson D.R., Nelson K.A., Nixon K., Nusskern D.R., Pacleb J.M.,
RA Palazzolo M., Pittman G.S., Pan S., Pollard J., Puri V., Reese M.G.,
RA Reinert K., Remington K., Saunders R.D.C., Scheeler F., Shen H.,
RA Shue B.C., Siden-Kiamos I., Simpson M., Skupski M.P., Smith T.,
RA Spier E., Spradling A.C., Stapleton M., Strong R., Sun E.,
RA Svirskas R., Tector C., Turner R., Venter E., Wang A.H., Wang X.,
RA Wang Z.-Y., Wassarman D.A., Weinstock G.M., Weissenbach J.,
RA Williams S.M., Woodage T., Worley K.C., Wu D., Yang S., Yao Q.A.,
RA Ye J., Yeh R.-F., Zaveri J.S., Zhan M., Zhang G., Zhao Q., Zheng L.,
RA Zheng X.H., Zhong F.N., Zhong W., Zhou X., Zhu S., Zhu X., Smith H.O.,
RA Gibbs R.A., Myers E.W., Rubin G.M., Venter J.C.;
RT "The genome sequence of Drosophila melanogaster.";
RL Science 287:2185-2195(2000).
DR EMBL; AE003561; AAF50647.1; -.
DR FLYBASE; FBgn0035712; CG10115.
DR INTERPRO; IPR000873; -.
DR INTERPRO; IPR001487; -.
DR PFAM; PF00439; bromodomain; 1.
DR PRINTS; PR00503; BROMODOMAIN.
DR PROSITE; PS00455; AMP_BINDING; 1.
DR PROSITE; PS50014; BROMODOMAIN_2; 2.
SQ SEQUENCE 3080 AA; 338176 MW; 4602054387DE2C12 CRC64;



Query Match 12.0%; Score 91; DB 5; Length 3080; 

Best Local Similarity 21.7%; Pred. No. 3;
Matches [...]; Conservative 26; Mismatches 57; Indels 54; Gaps

Qy 9 [...]
Db 878 KYFSKKAKEETERDAPGR-[...]-PAVSSAEEDLSEIEAEAPQKAQKRKRKEKDK 926
Qy 69 [...]-EAYADDLPPPPVPPPAIRSPTVQSKAQLEVRPVMVPKLASI 111
Db 927 [...]MEAEREPTPPPP-PPTSKRSRTSRTGRERE KDKER 978
Qy 112 [...]-TDLRTNPSDPR 148
Db 979 [...]



RESULT 11 
Q9IA21 

ID Q9IA21 PRELIMINARY; PRT; 410 AA. 
AC Q9IA21; 

DT 01-OCT-2000 (TrEMBLrel. 15, Created) 

DT 01-OCT-2000 (TrEMBLrel. 15, Last sequence update)

DT 01-OCT-2000 (TrEMBLrel. 15, Last annotation update) 

DE HOXA3. 

GN HOXA3. 



OS Heterodontus francisci (Horn shark).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Chondrichthyes;
OC Elasmobranchii; Galeomorphii; Heterodontoidea; Heterodontidae;
OX NCBI_TaxID-7792;
RN [1]
RP SEQUENCE FROM N.A.



RX MEDLINE-20144096; PubMed-10677514; 

RA Kim C.B., Amemiya C., Bailey W., Kawasaki K., Mezey J., Miller W.,
RA Minoshima S., Shimizu N., Wagner G., Ruddle F.;
RT "Hox cluster genomics in the horn shark, Heterodontus francisci.";
RL Proc. Natl. Acad. Sci. U.S.A. 97:1655-1660(2000).
DR EMBL; AF224262; AAF44641.1; -.

SQ SEQUENCE 410 AA; 44548 MW; 285ABC06B41C9FD9 CRC64; 



Query Match 11.9%; Score 90.5; DB 13; Length 410; 

Best Local Similarity 21.8%; Pred. No. 0.43; 

Matches 36; Conservative 24; Mismatches 60; Indels 45; Gaps 

Qy 21 QDAAGRRHFHASQCPRPTSPVSTDSNMSAWIQKA RP ARRQRH 63 

, I I I : : I 1:1 I :| : I :| :| I : I 

Db 19 QGANGFNYNASQQQYPPSSHVESDYHRPACSLQSPGTVPHHKPNDINESCMRTSASQPSH 78 

Qy 64 QPGHLRREAYADDLPPPPVPPPAI KSPTVQSKAQL--EVRPVM- 104 

I :: llll III:: 1 : 1 1 : hi : :: I I 

Db 79 HPVIAEQQQQKQPPPPPPPPPPSVSPPQNTSSNSTQSSTSKNPTLTSQATISRQIFPWMK 138 

Qy 105 VHLASIEARTDRSSDRRGGSYKGREALDGRQVTDL 140 

■ :hh I I : I I h :| 
Db 139 ESRQNARQRTSSSSSVESSAGEKSPPGPASRRARTAYTSAQLVEL 183 

RESULT 12
Q9LGT8

ID Q9LGT8 PRELIMINARY; PRT; 1212 AA. 

AC Q9LGT8; 

DT 01-OCT-2000 (TrEMBLrel. 15, Created)

DT 01-OCT-2000 (TrEMBLrel. 15, Last sequence update) 

DT 01-OCT-2000 (TrEMBLrel. 15, Last annotation update) 

DE P0489A01.2 PROTEIN. 

GN P0489A01.2. 

OS Oryza sativa (Rice).
OC Eukaryota; Viridiplantae; Embryophyta; Tracheophyta; Spermatophyta;
OC Magnoliophyta; Liliopsida; Poales; Poaceae; Oryza.
OX NCBI_TaxID=4530;

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN-CV. NIPPONBARE; 

RA Sasaki T., Matsumoto T., Yamamoto K.; 

RT "Oryza sativa nipponbare(GA3) genomic DNA, chromosome 1, PAC 

RT clone:P0489A01.";
RL Submitted (JUN-2000) to the EMBL/GenBank/DDBJ databases.
DR EMBL; AP002484; BAA99512.1; -.
SQ SEQUENCE 1212 AA; 137542 MW; E0E82364083E407C CRC64;



Query Match 11.8%; Score 90; DB 10; Length 1212;
Best Local Similarity 28.9%; Pred. No. 1.4;
Matches 43; Conservative 13; Mismatches 53; Indels 40; Gaps 7;

Qy 13 [...]
Db 10 [...]-RATPPPSTPRNAAAACQRHAAAAVATPPPPSPPCAQC 62
Qy 68 LRREAYADDLPPPP--VPPP AIKSPTVQSKAQLEVRPVMVP--KLASIEART 115
Db 63 [...]
Qy 116 [...]
Db 119 [...]--RERTRQRTLP 135



RESULT 13



Q9UGJ0 

ID Q9UGJ0 PRELIMINARY; PRT; 569 AA. 

AC Q9UGJ0; 

DT 01-MAY-2000 (TrEMBLrel. 13, Created) 

DT 01-MAY-2000 (TrEMBLrel. 13, Last sequence update) 

DT 01-JUN-2000 (TrEMBLrel. 14, Last annotation update) 

DE AMP-ACTIVATED PROTEIN KINASE GAMMA2 SUBUNIT.
GN AMPK GAMMA2.
OS Homo sapiens (Human).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
OX NCBI_TaxID-9606;
RN [1]

RP SEQUENCE FROM N.A. 

RA Cheung P.C.F., Salt I.P., Davies S.P., Hardie G.D., Carling D.;
RT "Characterization of AMP-activated protein kinase gamma subunit
RT isoforms and their role in AMP binding.";

RL Submitted (OCT-1999) to the EMBL/GenBank/DDBJ databases. 

DR EMBL; AJ249976; CAB65116.1; -. 

DR INTERPRO; IPR000644; -. 

DR PFAM; PF00571; CBS; 4. 

KW Kinase. 

SQ SEQUENCE 569 AA; 63066 MW; F51C30668C294089 CRC64; 



Query Match 11.7%; Score 89; DB 4; Length 569;

Best Local Similarity 25.2%; Pred. No. 0.83;

Matches 30; Conservative 13; Mismatches 56; Indels 20; Gaps 3; 
Qy 25 GRRHFHASQCPRPTS PVSTD - SNMSAWIQKARPAKKQKHQPGHLRREAYADDLPP — 79 



I I II H II :h I : I I I : Ihl I II II 
Db 146 GIRFFSRS- - rRKTSGLSSSPSTPTQVTKQHTFPLESYKHEPERLENRIYASSSPPDTGQ 202 

Qy 80 "PPVPPPAIKSPTVQSRAQLEVRPVMVPKLASIEARTDRSSDRRGGSY 126 

Ml: I :|: : : I I :| : I : I I 
Db 203 RFCPSSFQSPTRPPLASPTHYAPSKAAALAAALGPAEAGMLEKLEFEDEAVEDSESGVY 261 



RESULT 14 
Q9VLA7 

ID Q9VLA7 PRELIMINARY; PRT; 975 AA. 

AC Q9VLA7; 

DT 01-MAY-2000 (TrEMBLrel. 13, Created) 

DT 01-MAY-2000 (TrEMBLrel. 13, Last sequence update) 

DT 01-MAY-2000 (TrEMBLrel. 13, Last annotation update) 

DE CG4405 PROTEIN.
GN CG4405.
OS Drosophila melanogaster (Fruit fly).
OC Eukaryota; Metazoa; Arthropoda; Tracheata; Hexapoda; Insecta;
OC Pterygota; Neoptera; Endopterygota; Diptera; Brachycera; Muscomorpha;
OC Ephydroidea; Drosophilidae; Drosophila.
OX NCBI_TaxID-7227;
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN-BERKELEY;
RX MEDLINE-20196006; PubMed-10731132;
RA Adams M.D., Celniker S.E., Holt R.A., Evans C.A., Gocayne J.D.,
RA Amanatides P.G., Scherer S.E., Li P.W., Hoskins R.A., Galle R.F.,
RA George R.A., Lewis S.E., Richards S., Ashburner M., Henderson S.N.,
RA Sutton G.G., Wortman J.R., Yandell M.D., Zhang Q., Chen L.X.,
RA Brandon R.C., Rogers Y.-H.C., Blazej R.G., Champe M., Pfeiffer B.D.,
RA Wan K.H., Doyle C., Baxter E.G., Helt G., Nelson C.R., Miklos G.L.G.,
RA Abril J.F., Agbayani A., An H.-J., Andrews-Pfannkoch C., Baldwin D.,
RA Ballew R.M., Basu A., Baxendale J., Bayraktaroglu L., Beasley E.M.,
RA Beeson K.Y., Benos P.V., Berman B.P., Bhandari D., Bolshakov S.,
RA Borkova D., Botchan M.R., Bouck J., Brokstein P., Brottier P.,
RA Burtis K.C., Busam D.A., Butler H., Cadieu E., Center A., Chandra I.,
RA Cherry J.M., Cawley S., Dahlke C., Davenport L.B., Davies P.,
RA de Pablos B., Delcher A., Deng Z., Mays A.D., Dew I., Dietz S.M.,
RA Dodson K., Doup L.E., Downes M., Dugan-Rocha S., Dunkov B.C., Dunn P.,
RA Durbin K.J., Evangelista C.C., Ferraz C., Ferriera S., Fleischmann W.,
RA Fosler C., Gabrielian A.E., Garg N.S., Gelbart W.M., Glasser K.,
RA Glodek A., Gong F., Gorrell J.H., Gu Z., Guan P., Harris M.,
RA Harris N.L., Harvey D., Heiman T.J., Hernandez J.R., Houck J.,
RA Hostin D., Houston K.A., Howland T.J., Wei M.-H., Ibegwam C.,
RA Jalali M., Kalush F., Karpen G.H., Ke Z., Kennison J.A., Ketchum K.A.,
RA Kimmel B.E., Kodira C.D., Kraft C., Kravitz S., Kulp D., Lai Z.,
RA Lasko P., Lei Y., Levitsky A.A., Li J., Li Z., Liang Y., Lin X.,
RA Liu X., Mattel B., Mcintosh T.C., McLeod M.P., McPherson D.,
RA Merkulov G., Milshina N.V., Mobarry C., Morris J., Moshrefi A.,
RA Mount S.M., Moy M., Murphy B., Murphy L., Muzny D.M., Nelson D.L.,
RA Nelson D.R., Nelson K.A., Nixon K., Nusskern D.R., Pacleb J.M.,
RA Palazzolo M., Pittman G.S., Pan S., Pollard J., Puri V., Reese M.G.,
RA Reinert K., Remington K., Saunders R.D.C., Scheeler F., Shen H.,
RA Shue B.C., Siden-Kiamos I., Simpson M., Skupski M.P., Smith T.,
RA Spier E., Spradling A.C., Stapleton M., Strong R., Sun E.,
RA Svirskas R., Tector C., Turner R., Venter E., Wang A.H., Wang X.,
RA Wang Z.-Y., Wassarman D.A., Weinstock G.M., Weissenbach J.,
RA Williams S.M., Woodage T., Worley K.C., Wu D., Yang S., Yao Q.A.,
RA Ye J., Yeh R.-F., Zaveri J.S., Zhan M., Zhang G., Zhao Q., Zheng L.,
RA Zheng X.H., Zhong F.N., Zhong W., Zhou X., Zhu S., Zhu X., Smith H.O.,
RA Gibbs R.A., Myers E.W., Rubin G.M., Venter J.C.;
RT "The genome sequence of Drosophila melanogaster.";
RL Science 287:2185-2195(2000).
DR EMBL; AE003625; AAF52787.1; -.

DR FLYBASE; FBgn0032129; CG4405. 

SQ SEQUENCE 975 AA; 106813 MW; 52B89D12F7B60AA4 CRC64; 



Query Match 11.7%; Score 89; DB 5; Length 975;

Best Local Similarity 24.5%; Pred. No. 1.4;

Matches 34; Conservative 22; Mismatches 67; Indels 16; Gaps 4; 

Qy 1 AQAVAAAAEYAGLKVARRQMQDAAGRRHFHASQCPRPTS PVSTDSNMSAVVIQK 54

II I : I : ::l I : I hll I I III :::| :| I 
Db 529 AYATQAQRQQQYLNMQQQQQQQRRMSQQKHESECPVPPSMYSQQLPVSPQTDLSGMVSQ- 587 

Qy 55 ARPAKKQ KHQPGHLRREAYADDLPPPPVPPPAIKSPTVQSKAQLEVRPVMVP 106 

|:|: I III::: I : II : I I :: I : 

Db 588 AQPSHAQVISAINQMYHQQQHQQQQQLNQQQSNPQM-NPAYQFQQQQPPGQAKINPNLNA 646

Qy 107 KLASIEARTDRSSDRKGGS 125 

I :| : II 
Db 647 NLKQNQAPNQNQTQMPFGS 665 



RESULT 15 
Q9NKP7

ID Q9NKP7 PRELIMINARY; PRT; 2936 AA. 

AC Q9NKP7; 

DT 01-OCT-2000 (TrEMBLrel. 15, Created)
DT 01-OCT-2000 (TrEMBLrel. 15, Last sequence update)
DT 01-OCT-2000 (TrEMBLrel. 15, Last annotation update)
DE L712.2.
GN L712.2.
OS Leishmania major.
OC Eukaryota; Euglenozoa; Kinetoplastida; Trypanosomatidae; Leishmania.
OX NCBI_TaxID-5664;

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN-FRIEDLIN; 

RA Myler P.J.; 

RL Submitted (MAR-2000) to the EMBL/GenBank/DDBJ databases. 

DR EMBL; AC005930; AAF39727.1; -.

SQ SEQUENCE 2936 AA; 305212 MW; BE689E280242FE6B CRC64; 



Query Match 11.7%; Score 89; DB 5; Length 2936; 

Best Local Similarity 23.4%; Pred. No. 4.5;

Matches 36; Conservative 25; Mismatches 67; Indels 26; Gaps 5; 

Qy 1 AQAVAAAAEYAGLKVARRQMQDAAGRRHFHASQCPRPTSPVSTDSNM S 48

I III ' : :: :| III :| I II: :ll : I
Db 477 AATATGAAERQRRAASPKRSRDIAGARAGASSASP-PTTALSTSPPLASLPPVLSPAAES 535

Qy 49 AVVIQRARPAKKQKHQPGHLR--REAYADDLP PPPVPPPAIKSPTVQSKAQLE 99

I I:: : ::: I Mil I I :| I I : : I I :
Db 536 AAVLRDLKDVQRRLHAHGHLRGTAAAAAGAMPLEVKAHSPAAAPSNSTRGPESCSPCDIS 595

Qy 100 VRPVMVPRLASIEARTD--RSSDRKGGSYKGR 129

I II::: : |:| I : :
Db 596 VAASSVSSTATSSSSSSSAAMKRSERHGSARRNR 629



Search completed: January 22, 2001, 12:54:19 
Job time: 2060 sec 